nus-cs2030 / 2021-s1


Stream's 3 argument reduce method #547

Open kheekheekhee opened 3 years ago

kheekheekhee commented 3 years ago

Hi, can someone explain how Stream's three-argument reduce method works with its identity, accumulator and combiner? How do I ensure that it gives the correct answer, in the correct order, when run in parallel, and how can I check when it doesn't?

rcdl2222 commented 3 years ago

I found this website that may be helpful:

https://ocpj8.javastudyguide.com/ch18.html

Octanis0 commented 3 years ago

If you have a stream containing objects of type T, then the reduce method takes in:

  1. an identity of some type U,
  2. an accumulator (two inputs of type U and T, output of type U),
  3. and a combiner (two inputs both of type U, output of type U).
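To make the three parameters concrete, here is a minimal sketch that sums the lengths of some strings, so T is String and U is Integer. The class name ReduceSignatureDemo and the helper totalLength are made up for illustration; only Stream.reduce itself comes from the JDK.

```java
import java.util.List;
import java.util.stream.Stream;

class ReduceSignatureDemo {
    // T = String (element type of the stream), U = Integer (type of the result).
    static int totalLength(List<String> words, boolean parallel) {
        Stream<String> stream = parallel ? words.parallelStream() : words.stream();
        return stream.reduce(
                0,                                   // identity: a U
                (sum, word) -> sum + word.length(),  // accumulator: (U, T) -> U
                (left, right) -> left + right);      // combiner: (U, U) -> U
    }

    public static void main(String[] args) {
        List<String> words = List.of("stream", "reduce", "java");
        System.out.println(totalLength(words, false)); // 16
        System.out.println(totalLength(words, true));  // 16
    }
}
```

The combiner is only needed because U differs from T: two partial sums (both Integer) cannot be merged by an accumulator that expects a String as its second argument.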

The sequence of events inside reduce roughly goes as follows:

  1. At the beginning of the reduce operation, you have the identity (type U) and the stream elements (all of type T).
  2. The reduce method starts off by using the accumulator on the identity and some of your stream elements to create intermediate U objects. In a parallel stream the elements are split into chunks, and the identity is reused as the starting value for each chunk.
  3. As more type U intermediate objects are generated from the accumulator, the combiner gets to work combining them together.
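The steps above can be imitated by hand. The following is only a sketch of the idea, not the JDK's actual implementation: the hypothetical helper splitReduce keeps halving a list, runs the accumulator from the identity on each small piece, and merges the partial results with the combiner, always left before right.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

class ManualReduceSketch {
    // Hypothetical helper that mimics the shape of a parallel reduce (single-threaded here).
    static <T, U> U splitReduce(List<T> items, U identity,
                                BiFunction<U, T, U> accumulator,
                                BinaryOperator<U> combiner) {
        if (items.size() <= 1) {
            // Small piece: start from the identity and accumulate its elements.
            U result = identity;
            for (T item : items) {
                result = accumulator.apply(result, item);
            }
            return result;
        }
        // Larger piece: split in half, reduce each half, then combine left with right.
        int mid = items.size() / 2;
        U left = splitReduce(items.subList(0, mid), identity, accumulator, combiner);
        U right = splitReduce(items.subList(mid, items.size()), identity, accumulator, combiner);
        return combiner.apply(left, right);
    }

    public static void main(String[] args) {
        List<Integer> nums = List.of(1, 2, 3, 4);
        String byHand = splitReduce(nums, "", (acc, n) -> acc + n, (a, b) -> a + b);
        String byStream = nums.parallelStream().reduce("", (acc, n) -> acc + n, (a, b) -> a + b);
        System.out.println(byHand);   // 1234
        System.out.println(byStream); // 1234
    }
}
```

Note that the identity "" is used once per piece, which is why it must not contribute anything to the result.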

The end goal of reduce is to combine/accumulate everything until you have one final type U object. In a sequential stream, the accumulator simply folds the elements into the identity one by one, and the combiner is typically never called. In a parallel stream, the accumulator folds each chunk starting from the identity, and the combiner merges the chunk results. When each call happens is unpredictable, but the grouping is the only thing that varies: for an ordered stream, elements and partial results are still passed in encounter order (left operand before right), so the functions do not need to be commutative.

Another thing to note is that your combiner may combine an intermediate U object with the identity itself (for example, when every element of a chunk has been filtered out). It does not distinguish between the two.

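This is why your accumulator and combiner have to be associative, so that your end result is still the same regardless of how the elements are grouped. The combiner also has to be compatible with the accumulator: combiner.apply(u, accumulator.apply(identity, t)) must equal accumulator.apply(u, t) for all u and t. Finally, you have to choose the identity carefully so that an intermediate object that gets combined with the identity doesn't change, i.e. combiner.apply(identity, u) equals u for all u.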
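A small experiment shows what goes wrong when the identity is not a true identity. This sketch sums integers but lets you pass in the starting value; the class BadIdentityDemo and the helper sumWithIdentity are hypothetical names for illustration.

```java
import java.util.List;
import java.util.stream.Stream;

class BadIdentityDemo {
    static int sumWithIdentity(List<Integer> nums, int identity, boolean parallel) {
        Stream<Integer> stream = parallel ? nums.parallelStream() : nums.stream();
        return stream.reduce(identity, (acc, n) -> acc + n, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> nums = List.of(1, 2, 3, 4);

        // 0 is a true identity for addition: both runs agree.
        System.out.println(sumWithIdentity(nums, 0, false)); // 10
        System.out.println(sumWithIdentity(nums, 0, true));  // 10

        // 10 is NOT an identity for addition.
        System.out.println(sumWithIdentity(nums, 10, false)); // 20 (added once)
        // In parallel, 10 is added once per chunk, so the result depends on
        // how the stream happened to be split. No fixed output is given here.
        System.out.println(sumWithIdentity(nums, 10, true));
    }
}
```

The sequential run with identity 10 looks plausible, which is exactly why a bad identity can go unnoticed until the stream is made parallel.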

How to ensure these conditions are met is quite context-specific... I'm not sure if there's really a generic way to check for this.
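One partial approach is to spot-check the conditions on a handful of sample values. The sketch below is a hypothetical helper (looksValid is not part of the JDK) that tests the identity rule, the accumulator/combiner compatibility rule from the Stream.reduce documentation, and associativity of the combiner. Passing on samples does not prove correctness; failing on any sample does prove the reduction is unsafe to run in parallel.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

class ReduceLawCheck {
    // Hypothetical spot-check: returns false as soon as one sample breaks a rule.
    static <T, U> boolean looksValid(U identity,
                                     BiFunction<U, T, U> accumulator,
                                     BinaryOperator<U> combiner,
                                     List<U> sampleResults,
                                     List<T> sampleElements) {
        for (U u : sampleResults) {
            // Identity rule: combining with the identity changes nothing.
            if (!combiner.apply(identity, u).equals(u)) return false;

            // Compatibility rule: combiner(u, acc(identity, t)) equals acc(u, t).
            for (T t : sampleElements) {
                U viaCombiner = combiner.apply(u, accumulator.apply(identity, t));
                if (!viaCombiner.equals(accumulator.apply(u, t))) return false;
            }

            // Associativity of the combiner: (u + v) + w equals u + (v + w).
            for (U v : sampleResults) {
                for (U w : sampleResults) {
                    U leftFirst = combiner.apply(combiner.apply(u, v), w);
                    U rightFirst = combiner.apply(u, combiner.apply(v, w));
                    if (!leftFirst.equals(rightFirst)) return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> results = List.of(1, 2, 3);
        List<Integer> elements = List.of(1, 2);

        // Addition with identity 0 passes the spot-check.
        System.out.println(looksValid(0, (Integer a, Integer n) -> a + n, Integer::sum, results, elements)); // true
        // Subtraction fails: 0 - u is not u, and subtraction is not associative.
        System.out.println(looksValid(0, (Integer a, Integer n) -> a - n, (a, b) -> a - b, results, elements)); // false
    }
}
```

A complementary check is to run the same reduce on stream() and parallelStream() over the same data and compare the results, keeping in mind that a match on one run is evidence rather than proof.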