apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

org.apache.lucene.search.grouping.TestGroupFacetCollector.testRandom fails reproducibly #12167

Closed cpoerschke closed 1 year ago

cpoerschke commented 1 year ago

Description

Please see https://lists.apache.org/thread/t5hgzv0q177nf8qqyz6h9qhhvwp366gp or https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/8706 for full details.

Error Message:
java.lang.IndexOutOfBoundsException: Range [0, 0 + 813) out of bounds for length 80

Stack Trace:
java.lang.IndexOutOfBoundsException: Range [0, 0 + 813) out of bounds for length 80
    at __randomizedtesting.SeedInfo.seed([173F7C2A08C39662:65735925B9A32011]:0)
    at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
    at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckFromIndexSize(Preconditions.java:82)
    at java.base/jdk.internal.util.Preconditions.checkFromIndexSize(Preconditions.java:361)
    at java.base/java.util.Objects.checkFromIndexSize(Objects.java:411)
    at java.base/java.nio.HeapByteBuffer.get(HeapByteBuffer.java:180)
    at org.apache.lucene.core@9.6.0-SNAPSHOT/org.apache.lucene.store.ByteBuffersDataInput.readBytes(ByteBuffersDataInput.java:155)
    at org.apache.lucene.core@9.6.0-SNAPSHOT/org.apache.lucene.store.ByteBuffersIndexInput.readBytes(ByteBuffersIndexInput.java:85)
    at org.apache.lucene.test_framework@9.6.0-SNAPSHOT/org.apache.lucene.tests.store.MockIndexInputWrapper.readBytes(MockIndexInputWrapper.java:148)
    at org.apache.lucene.core@9.6.0-SNAPSHOT/org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$TermsDict.decompressBlock(Lucene90DocValuesProducer.java:1235)
    at org.apache.lucene.core@9.6.0-SNAPSHOT/org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$TermsDict.next(Lucene90DocValuesProducer.java:1093)
    at org.apache.lucene.search.grouping.TermGroupFacetCollector$MV$SegmentResult.nextTerm(TermGroupFacetCollector.java:437)
    at org.apache.lucene.search.grouping.GroupFacetCollector.mergeSegmentResults(GroupFacetCollector.java:97)
    at org.apache.lucene.search.grouping.TestGroupFacetCollector.testRandom(TestGroupFacetCollector.java:429)
...

Gradle command to reproduce

./gradlew test --tests TestGroupFacetCollector.testRandom -Dtests.seed=173F7C2A08C39662 -Dtests.multiplier=3 -Dtests.locale=ru-KZ -Dtests.timezone=America/Mazatlan -Dtests.asserts=true -Dtests.file.encoding=UTF-8

MarcusSorealheis commented 1 year ago

Well, this has been one hell of a treasure hunt.

Based on the stack trace, the addressable problem starts here: https://github.com/apache/lucene/blob/caeabf39309a91997d361b4104bda105d16ae720/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java#L1235

After adding a bunch of log statements and blowing up the GC...

What we learned from the logs, and what is also evident in the error message, is that we are trying to read more term bytes than array.length allows. At first glance this does not seem possible. Could there be some data corruption issue in input.readBytes(term.bytes, prefixLength, suffixLength); from https://github.com/apache/lucene/blob/caeabf39309a91997d361b4104bda105d16ae720/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java#L1105?

I'm not sure what could be going on, but it seems like a potentially serious issue. Have we previously observed data corruption issues on input streams? I'm really stumped as to where the 813 (or any number > 80) is coming from. If it's important or helpful, I could look into it more.
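
For anyone reading the error message for the first time, here is a minimal sketch of the range check that appears in the stack trace (Objects.checkFromIndexSize). It is not Lucene code: the class RangeCheckDemo and the helper describeFailure are hypothetical names made up for illustration, and reading 80 as the destination array length and 813 as the requested read length is an interpretation of the trace, not something confirmed in this thread.

```java
import java.util.Objects;

// Hypothetical demo class, not part of Lucene.
public class RangeCheckDemo {

    /**
     * Runs the same JDK range check that shows up in the stack trace.
     * Returns the exception message if the check fails, or null if it passes.
     */
    static String describeFailure(int offset, int length, int destLength) {
        try {
            // Verifies that [offset, offset + length) fits inside [0, destLength).
            Objects.checkFromIndexSize(offset, length, destLength);
            return null;
        } catch (IndexOutOfBoundsException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Assumed reading of the failure: 813 bytes requested at offset 0
        // for a destination of length 80.
        System.out.println(describeFailure(0, 813, 80));
        // A request that fits exactly passes the check and yields null.
        System.out.println(describeFailure(0, 80, 80));
    }
}
```

Under that reading, the check itself is behaving correctly, and the open question stays the same as above: where the oversized length comes from.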

gsmiller commented 1 year ago

Looks like this is tracking the same issue I opened back in #11631. I'm going to close out my older issue in favor of this one, since it has more detail and is being worked on. Hooray!

zhaih commented 1 year ago

Fixed by #12555, thanks @epotyom!