jOOQ / jOOL

jOOλ - The Missing Parts in Java 8 jOOλ improves the JDK libraries in areas where the Expert Group's focus was elsewhere. It adds tuple support, function support, and a lot of additional functionality around sequential Streams. The JDK 8's main efforts (default methods, lambdas, and the Stream API) were focused around maintaining backwards compatibility and implementing a functional API for parallelism.
http://www.jooq.org/products
Apache License 2.0
2.09k stars 168 forks source link

Add Seq.reducingMap() and Seq.reducePartially() methods #338

Open tlinkowski opened 6 years ago

tlinkowski commented 6 years ago

On several occasions, I've been looking for a method that would be an inverse of Stream.flatMap. I noticed that other people do too - here's an example question on StackOverflow.

However, there are quite many ways to define it (especially the criteria for such an inverse operation). Finally I've decided to stick to the well-understood reduction semantics, and I've come up with the following two method signatures (names to be discussed) that I'd like to propose for Seq:

1) reducePartially works like Stream.reduce(BinaryOperator), only it yields and restarts the reduction if conditionalAccumulator returns an empty Optional:

  public static Seq<T> reducePartially(BiFunction<? super T, ? super T, Optional<T>> conditionalAccumulator) {
    return reducingMap(Function.identity(), conditionalAccumulator);
  }

reducePartially is similar to StreamEx.collapse(BiPredicate,BinaryOperator), with the main difference being that the first argument to conditionalAccumulator is the result of previous reduction (exactly as in reduce).

2) reducingMap works like Stream.reduce(U, BiFunction, BinaryOperator), only it yields and restarts the reduction if conditionalAccumulator returns an empty Optional, and it has a different mechanism for providing initial Us (instead of U identity we have an initialMapper that converts the first T to the initial U):

  public static <U> Seq<U> reducingMap(Function<? super T, ? extends U> initialMapper,
          BiFunction<? super U, ? super T, Optional<U>> conditionalAccumulator);

reducingMap bears some resemblance to StreamEx.collapse(BiPredicate,Collector), but - again - it's much more a "reduce" than a "mark" + "merge".


Here are a few examples of how it is supposed to work:

  1. Seq.reducePartially

Example 1: Merge words ending with hyphens with the following word:

  1. Seq.reducingMap

Example 1: Join integers using a semicolon until 0 occurs:


I have this implemented (using a custom, lazy Spliterator) so I could provide a PR if the feature gets accepted.

lukaseder commented 6 years ago

Thank you very much for your suggestion. The functionality seems to be covered by Seq.grouped(). The advantage of the Seq.grouped() API is that it produces lazy evaluations of group values. In your case, you seem to want eager grouping, which requires consuming the entire stream before emitting output, so I'm not sure what your suggestion gains compared to calling collect()

tlinkowski commented 6 years ago

Thank you for your answer. I'll try to highlight the differences to make it more clear:

  1. Seq.grouped

    • works on the current T only (uses Function<T,K>)
    • evaluates lazily until next K is obtained
  2. My proposal

    • works on pairs of values, where each pair is:
      • a previously merged T and the current T (Seq.reducePartially: uses BiFunction<T, T, Optional<T>>)
      • an accumulator U and the current T (Seq.reducingMap: uses BiFunction<U, T, Optional<U>>)
    • evaluates lazily until next empty Optional is obtained (does not necessarily consume entire Stream)

Maybe the examples that I provided are too weak, because the conditions in those examples always test only one value (either a or b). Please, consider the example below and see that it cannot be solved using Seq.grouped:

Example 2 (Seq.reducePartially): Merge words on matching hyphens:


My real use case is much more complex than simple string concatenation, but it effectively uses something like canMergeWithPrevious and canMergeWithNext flags, where merging occurs only if both flags are true.

lukaseder commented 6 years ago

works on the current T only (uses Function<T,K>)

indeed, this could be improved with additional overloads

My real use case is much more complex than simple string concatenation, but it effectively uses something like canMergeWithPrevious and canMergeWithNext flags, where merging occurs only if both flags are true.

OK I think I understand your use case now. I'm not convinced about the naming of course, but I see the value of such a method. Will think about it this week.

tlinkowski commented 6 years ago

Did you find the time to think about it? :)

I'll understand if you want to wait until more people request this feature, though.

In the meantime, here's some explanation and further proposals concerning naming:

1) Seq<T> ???(BiFunction<T, T, Optional<T>> conditionalAccumulator)

I named it reducePartially because it's actually reduce applied to parts of the Stream. But since #296 proposes the words "chunks" for such parts, I think a much better name could be chunkedReduce.

Note that this proposed method assumes that:

  1. chunks are always at least one-element long,
  2. each chunk is processed as if using Stream.reduce(T identity, BinaryOperator<T> accumulator) where:
    1. T identity is the first element of a given chunk,
    2. BinaryOperator<T> accumulator is derived from BiFunction<T, T, Optional<T>> conditionalAccumulator (Optional.empty() means: start next chunk)

2) Seq<U> ???(Function<T, U> initialMapper, BiFunction<U, T, Optional<U>> conditionalAccumulator)

I named it reducingMap because it seems the opposite of flatMap:

But this analogy does not seem intuitive so I propose to name this method chunkedReduce too.

Note that this proposed method assumes that:

  1. chunks are always at least one-element long,
  2. each chunk is processed as if using Stream.reduce(U identity, BiFunction<U,T,U> accumulator, *) where:
    1. U identity is obtained using Function<T, U> initialMapper from the first element of a given chunk,
    2. BiFunction<U,T,U> accumulator is derived from BiFunction<U, T, Optional<U>> conditionalAccumulator (Optional.empty() means: start next chunk)

PS. Please, remove the "Feedback pending" label unless I am to give some more feedback :)

lukaseder commented 6 years ago

Did you find the time to think about it? :)

No, sorry

I'll understand if you want to wait until more people request this feature, though.

Yes, definitely. I already regret 1-2 prematurely added features to jOOλ. I prefer not to add more things without enough demand from the community

In the meantime, here's some explanation and further proposals concerning naming...

One reason why I'm often not so quick (or keen) to respond to these things is because of the complexity of the feature. It takes quite a bit of time to think about why this could be useful (not just how it works). This is, to me, a strong indicator that it is not generally useful.

PS. Please, remove the "Feedback pending" label unless I am to give some more feedback :)

Sure.