nus-cs2030 / 2021-s1

When does reduce() produce different outputs in parallel streaming? #549

Open limeugene opened 3 years ago

limeugene commented 3 years ago

Hi, I'm having trouble understanding reduce() on parallel streams. Given that subtraction is non-associative, since 3 - (2 - 1) != (3 - 2) - 1, why is reduce() producing the same output every time? When will reduce() produce nondeterministic results? Thanks in advance :)

itsyme commented 3 years ago

I think it's because the same threads end up doing the same tasks every time.

limeugene commented 3 years ago

@itsyme Hi, would you mind elaborating? Does this mean that I'm not initiating parallel streaming properly in my code? thanks!

bentanjunrong commented 3 years ago

It has to do with the .parallel() call. Since your reduce() takes a non-associative function and you cannot be sure of the order in which the parallel tasks complete, the order of inputs to the reduction is nondeterministic.
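For reference, here is a minimal sketch of the situation being discussed. It assumes a small list `xs` and the lambda `(a, b) -> a - b`; the class and method names are made up for illustration and are not taken from the original screenshots. The two-argument `reduce` expects an associative function and a value that is a true identity for it, and subtraction with 0 satisfies neither, so the parallel result is not required to match the sequential one. Addition meets both requirements, so it should agree in both modes.

```java
import java.util.List;

class SubtractReduceDemo {
    // Sequential fold, left to right: ((((0 - 1) - 2) - 3) - 4) = -10
    static int sequentialSubtract(List<Integer> xs) {
        return xs.stream().reduce(0, (a, b) -> a - b);
    }

    // Same call on a parallel stream. Chunks are folded separately,
    // each starting from 0, and the partial results are then combined
    // with the same (a - b) function, so the grouping differs.
    static int parallelSubtract(List<Integer> xs) {
        return xs.parallelStream().reduce(0, (a, b) -> a - b);
    }

    // Addition is associative and 0 is its identity.
    static int sequentialSum(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }

    static int parallelSum(List<Integer> xs) {
        return xs.parallelStream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(1, 2, 3, 4);
        System.out.println("sequential subtract: " + sequentialSubtract(xs));
        System.out.println("parallel subtract:   " + parallelSubtract(xs));
        System.out.println("sequential sum:      " + sequentialSum(xs));
        System.out.println("parallel sum:        " + parallelSum(xs));
    }
}
```

If the two subtract lines print different numbers, the stream really is being reduced in a different grouping, even if the parallel number happens to be the same on every run.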

pikasean commented 3 years ago

Have you tried running the function more than twice? If you've only run it twice, it could be pure coincidence.
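One way to rule out coincidence is to repeat the reduction many times and collect the distinct results. This is only a sketch: the range 1..100, the run count, and the helper name `distinctResults` are assumptions for illustration. A single value in the parallel set would not prove the result is guaranteed, only that the runs on this machine happened to group the elements the same way.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.IntStream;

class RepeatReduceDemo {
    // Runs the same subtraction reduce `runs` times and returns
    // every distinct result that was observed.
    static Set<Integer> distinctResults(int runs, boolean parallel) {
        Set<Integer> seen = new TreeSet<>();
        for (int i = 0; i < runs; i++) {
            IntStream s = IntStream.rangeClosed(1, 100);
            if (parallel) {
                s = s.parallel();
            }
            seen.add(s.reduce(0, (a, b) -> a - b));
        }
        return seen;
    }

    public static void main(String[] args) {
        // Sequentially this is 0 - 1 - 2 - ... - 100 = -5050 on every run.
        System.out.println("sequential: " + distinctResults(1000, false));
        // The parallel set may contain one value or several.
        System.out.println("parallel:   " + distinctResults(1000, true));
    }
}
```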

itsyme commented 3 years ago

It's just my guess, but I have seen this in my jshell as well. I think it just happens that the tasks run in the same order each time, giving the same result.

GJ0407790 commented 3 years ago

I think you can print out the thread name as well, to check whether only one thread or several are doing the work.
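A rough sketch of that check, assuming a range of 1..16 and a hypothetical helper `threadsUsed`: the accumulator records `Thread.currentThread().getName()` into a concurrent set. Recording from inside the accumulator is a side effect that stream lambdas should normally avoid, so treat this as a diagnostic only.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class ThreadNameDemo {
    // Returns the names of all threads that executed the accumulator.
    static Set<String> threadsUsed(boolean parallel) {
        // Thread-safe set, since several threads may add to it at once.
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream s = IntStream.rangeClosed(1, 16);
        if (parallel) {
            s = s.parallel();
        }
        s.reduce(0, (a, b) -> {
            names.add(Thread.currentThread().getName());
            return a - b;
        });
        return names;
    }

    public static void main(String[] args) {
        // A sequential stream runs entirely on the calling thread.
        System.out.println("sequential: " + threadsUsed(false));
        // A parallel stream may also use ForkJoinPool.commonPool workers,
        // depending on how many cores are available.
        System.out.println("parallel:   " + threadsUsed(true));
    }
}
```

Seeing more than one name in the parallel set confirms that the stream is running in parallel, which is a separate question from whether the result changes between runs.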