jOOQ / jOOL

jOOλ - The Missing Parts in Java 8 jOOλ improves the JDK libraries in areas where the Expert Group's focus was elsewhere. It adds tuple support, function support, and a lot of additional functionality around sequential Streams. The JDK 8's main efforts (default methods, lambdas, and the Stream API) were focused around maintaining backwards compatibility and implementing a functional API for parallelism.
http://www.jooq.org/products
Apache License 2.0
2.08k stars 167 forks source link

ArrayIndexOutOfBoundsException when combining cycle() with sorted() #202

Closed lukaseder closed 8 years ago

lukaseder commented 8 years ago

The following usage:

Seq.of(1,2,3).sorted(Comparator.naturalOrder()).cycle(3).sorted().forEach(System.out::println);

Results in:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3
at java.util.stream.SortedOps$SizedRefSortingSink.accept(SortedOps.java:362)
at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1351)
at org.jooq.lambda.Seq.lambda$cycle$423(Seq.java:2689)
at org.jooq.lambda.Seq$$Lambda$1/1595428806.tryAdvance(Unknown Source)
at org.jooq.lambda.SeqUtils$1.tryAdvance(SeqUtils.java:60)
at java.util.Spliterator.forEachRemaining(Spliterator.java:326)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:512)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:502)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.jooq.lambda.SeqImpl.forEach(SeqImpl.java:147)

As reported by @amaembo: https://github.com/jOOQ/jOOL/issues/189#issuecomment-190124632

The reason for this is that our Spliterator implementation delegates estimateSize() and characteristics() to the delegate spliterator, which isn't correct in general. For instance, in the event of cycle(3), we must not have Spliterator.SIZED along with a size estimate of 3, because that would allocate an internal array of length 3, which is too small when we really need 9 values.

lukaseder commented 8 years ago

A quick fix will simply not estimate sizes and remove all characteristics (similar to when transforming an Iterator into a Spliterator). We can still optimise later - correctness is more important right now.

lukaseder commented 8 years ago

(thanks again @amaembo!)

amaembo commented 8 years ago

At least preserve ORDERED. Probably it's not so important in your case, but StreamSupport.stream(Seq.of(1,2,3).cycle(3).spliterator(), true).limit(1).forEach(System.out::println) may unexpectedly work incorrectly after this. Use simply delegate.characteristics() & ORDERED.

lukaseder commented 8 years ago

Huh, good point. I hadn't thought about preserving the flag on presence. I suspect that similar preservations could be implemented for NONNULL and IMMUTABLE (in most cases)

lukaseder commented 8 years ago

@amaembo: Hmm, judging from the JDK source code, this ORDERED flag is currently used only for parallel stream processing (unless I'm overlooking something), so it doesn't change much for jOOλ - at least not in JDK 8

amaembo commented 8 years ago

IMMUTABLE and CONCURRENT could also be preserved, though they do not affect the behavior of standard Stream API. Keeping NONNULL could be incorrect if you change the elements somehow (for example, perform mapping). It also currently not used by standard Stream API, but may be used by other third-party libs for some optimizations (including StreamEx ;-)). Something like StreamEx.of(Seq.of("a", null, "b").cycle(3)).parallel().distinct(2).toList() may fail if you incorrectly report NONNULL as parallel distinct(n) cuts edges for non-null source. So better not to report it if you're not 100% sure.

amaembo commented 8 years ago

Right, ORDERED is only for parallel processing (at least now). However it's possible to create parallel stream from your Seq using StreamSupport.stream(Seq.something().spliterator(), true).

lukaseder commented 8 years ago

StreamEx.of(Seq.of("a", null, "b").cycle(3)).parallel().distinct(2).toList()

That looks nifty! :)

So better not to report it if you're not 100% sure.

Exactly my thinking. Indeed, I haven't fully understood all of Spliterator's SPI features. Specifically this one here (the original one) with the Comparator seems a bit obscure to me.

StreamSupport.stream(Seq.something().spliterator(), true)

I cannot keep people from shooting themselves in their feet :)