Closed cpoerschke closed 1 year ago
Well, this has been one hell of a treasure hunt.
Based on the stack trace, the addressable problem starts here: https://github.com/apache/lucene/blob/caeabf39309a91997d361b4104bda105d16ae720/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java#L1235
After adding a bunch of log statements and blowing up the GC, here is where I ended up.
What I learned from the logs, and what is also evident in the error message, is that we are trying to read more term bytes than array.length allows. At first glance this does not seem possible. Could there be some data corruption issue in input.readBytes(term.bytes, prefixLength, suffixLength);
from
https://github.com/apache/lucene/blob/caeabf39309a91997d361b4104bda105d16ae720/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java#L1105
I'm not sure what could be going on, but it seems like a potentially serious issue. Have we previously observed data corruption issues on input streams? I'm really stumped as to where the 813 (or any number > 80) is coming from. If it's important or helpful, I could look into it more.
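To make the failure mode concrete, here is a minimal sketch of prefix-compressed term decoding with an explicit bounds check. It is not the Lucene implementation: `PrefixDecodeSketch` and `decodeInto` are hypothetical names, and the 80-byte buffer and 813-byte suffix are assumptions taken from the numbers mentioned above. It only illustrates why a suffix length larger than the remaining buffer space cannot be copied into the term's byte array.

```java
import java.nio.charset.StandardCharsets;

public class PrefixDecodeSketch {

    // Hypothetical helper, not Lucene code. It writes `suffix` into `termBytes`
    // starting at offset `prefixLength`, which is the same shape as
    // input.readBytes(term.bytes, prefixLength, suffixLength).
    public static int decodeInto(byte[] termBytes, int prefixLength, byte[] suffix) {
        int suffixLength = suffix.length;
        int termLength = prefixLength + suffixLength;
        // Fail with a descriptive message instead of an opaque index error.
        if (prefixLength < 0 || termLength > termBytes.length) {
            throw new IllegalStateException(
                "term length " + termLength + " exceeds buffer length " + termBytes.length
                    + " (prefixLength=" + prefixLength + ", suffixLength=" + suffixLength + ")");
        }
        System.arraycopy(suffix, 0, termBytes, prefixLength, suffixLength);
        return termLength;
    }

    public static void main(String[] args) {
        // Assumption: the buffer is sized for terms of at most 80 bytes.
        byte[] termBytes = new byte[80];

        // First term: no shared prefix, the whole term is the suffix.
        int len = decodeInto(termBytes, 0, "lucene".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(termBytes, 0, len, StandardCharsets.UTF_8));

        // Second term: keep the first 3 bytes and append a new suffix.
        len = decodeInto(termBytes, 3, "ky".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(termBytes, 0, len, StandardCharsets.UTF_8));

        // A suffix length such as 813 cannot fit in the 80-byte buffer.
        try {
            decodeInto(termBytes, 0, new byte[813]);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A check like this would not explain where the oversized length comes from, but logging prefixLength and suffixLength at the point of failure would show whether the bad value is read from the stream or computed from stale state.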
Looks like this is tracking the same issue I opened back in #11631. I'm going to close out my older issue in favor of this one, since it has more detail and is being worked on. Hooray!
Fixed by #12555, thanks @epotyom!
Description
Please see https://lists.apache.org/thread/t5hgzv0q177nf8qqyz6h9qhhvwp366gp or https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/8706 for full details.
Gradle command to reproduce
./gradlew test --tests TestGroupFacetCollector.testRandom -Dtests.seed=173F7C2A08C39662 -Dtests.multiplier=3 -Dtests.locale=ru-KZ -Dtests.timezone=America/Mazatlan -Dtests.asserts=true -Dtests.file.encoding=UTF-8