apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

IBM J9 JVM bug causes test failure in Kuromoji's TestExtended [LUCENE-4735] #5800

Closed asfimport closed 11 years ago

asfimport commented 11 years ago

Note that this is not a Lucene bug; it's a JVM bug, but I wanted to track it in Lucene as well in case others hit it.

I noticed this test frequently fails when running under IBM's J9 JVM (1.6.0). I finally tracked down the root cause and made a small test case. For example, on trunk at rev 1439839, if you run:

  ant test -Dtestcase=TestExtendedMode -Dtestmethod=testRandomHugeStrings -Dtests.seed=26D2B352E9603950

it fails with this:

[junit4:junit4]    > Throwable #1: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=4272,endOffset=4271
[junit4:junit4]    >    at __randomizedtesting.SeedInfo.seed([26D2B352E9603950:BEF1D491B7168518]:0)
[junit4:junit4]    >    at org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl.setOffset(OffsetAttributeImpl.java:45)
[junit4:junit4]    >    at org.apache.lucene.analysis.ja.JapaneseTokenizer.incrementToken(JapaneseTokenizer.java:463)
[junit4:junit4]    >    at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:635)
[junit4:junit4]    >    at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:546)
[junit4:junit4]    >    at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:447)
[junit4:junit4]    >    at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:375)
[junit4:junit4]    >    at org.apache.lucene.analysis.ja.TestExtendedMode.testRandomHugeStrings(TestExtendedMode.java:76)

I've seen other analyzer tests fail with similar exceptions.

I dug in and found a bug in J9's TreeMap.subMap. It's easily reproduced with a small test case, which I'll attach. I'll also open an issue with J9.

I also found a workaround that seems to sidestep the bug for Lucene.


Migrated from LUCENE-4735 by Michael McCandless (@mikemccand), resolved Jan 31 2013
Attachments: LUCENE-4735.patch, TestTreeMap2.java

asfimport commented 11 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Patch with a workaround for Lucene. If you use TreeMap.lowerEntry instead of calling lastKey on the result of TreeMap.subMap, it seems to sidestep the issue.
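
For illustration only, here is a minimal sketch of the two lookup styles. It is not the code from LUCENE-4735.patch; the class name `OffsetLookup`, both method names, and the sample data are hypothetical. It assumes all keys are non-negative and `limit >= 0`, in which case both calls should return the greatest key strictly below `limit` on a correct JVM.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class OffsetLookup {
    // subMap style: take a view of the keys in [0, limit) and ask for its last key.
    // Returns null when no key falls in that range.
    public static Integer lastKeyViaSubMap(TreeMap<Integer, Integer> map, int limit) {
        SortedMap<Integer, Integer> view = map.subMap(0, limit);
        return view.isEmpty() ? null : view.lastKey();
    }

    // lowerEntry style: ask the map directly for the greatest key strictly below limit.
    // Equivalent to the method above only under the assumption that keys are non-negative.
    public static Integer lastKeyViaLowerEntry(TreeMap<Integer, Integer> map, int limit) {
        Map.Entry<Integer, Integer> entry = map.lowerEntry(limit);
        return entry == null ? null : entry.getKey();
    }

    public static void main(String[] args) {
        TreeMap<Integer, Integer> map = new TreeMap<Integer, Integer>();
        map.put(3, 1);
        map.put(10, 2);
        map.put(42, 3);

        System.out.println(lastKeyViaSubMap(map, 11));     // 10
        System.out.println(lastKeyViaLowerEntry(map, 11)); // 10
        System.out.println(lastKeyViaSubMap(map, 3));      // null
        System.out.println(lastKeyViaLowerEntry(map, 3));  // null
    }
}
```

The lowerEntry form also avoids creating the intermediate subMap view, so it never touches the code path that misbehaves.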

asfimport commented 11 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Simple standalone test. If you run it with J9 1.6, or at least with this version:

Java(TM) SE Runtime Environment (build pxa6460sr9fp2ifix-20111111_05(SR9 FP2+IV03622+IV02378+IZ99243+IZ97310+IV00707))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr9-20111111_94827 (JIT enabled, AOT enabled)
J9VM - 20111111_094827
JIT  - r9_20101028_17488ifx45
GC   - 20101027_AA)
JCL  - 20110727_04

Then the test will print:

FAILED: subMap.lastKey=4545 but should be 4576

But with Oracle Java 1.6 it prints "OK".
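
A check of this kind could look roughly like the sketch below. This is not the attached TestTreeMap2.java; the class name `SubMapCheck`, the `check` method, the random seed, and the data sizes are all assumptions made for illustration, and there is no guarantee this particular input triggers the J9 bug. It compares `subMap(0, limit).lastKey()` against a brute-force scan of the keys and reports a message in the same format as above.

```java
import java.util.Objects;
import java.util.Random;
import java.util.SortedMap;
import java.util.TreeMap;

public class SubMapCheck {
    // Compares subMap(0, limit).lastKey() with a brute-force scan.
    // Returns "OK" when they agree, otherwise a FAILED message. Requires limit >= 0.
    public static String check(TreeMap<Integer, Integer> map, int limit) {
        // keySet() iterates in ascending order, so the last match is the largest key in [0, limit).
        Integer expected = null;
        for (Integer key : map.keySet()) {
            if (key >= 0 && key < limit) {
                expected = key;
            }
        }
        SortedMap<Integer, Integer> view = map.subMap(0, limit);
        Integer actual = view.isEmpty() ? null : view.lastKey();
        if (Objects.equals(expected, actual)) {
            return "OK";
        }
        return "FAILED: subMap.lastKey=" + actual + " but should be " + expected;
    }

    public static void main(String[] args) {
        Random random = new Random(17); // arbitrary fixed seed for repeatability
        TreeMap<Integer, Integer> map = new TreeMap<Integer, Integer>();
        for (int i = 0; i < 1000; i++) {
            map.put(random.nextInt(10000), i);
        }
        String result = "OK";
        for (int limit = 0; limit <= 10000; limit++) {
            String outcome = check(map, limit);
            if (!outcome.equals("OK")) {
                result = outcome; // report the first mismatch only
                break;
            }
        }
        System.out.println(result); // "OK" on a JVM whose TreeMap behaves correctly
    }
}
```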

asfimport commented 11 years ago

Robert Muir (@rmuir) (migrated from JIRA)

LOL

+1 to commit the workaround, it's just MockCharFilter (which is not fast!)

asfimport commented 11 years ago

Commit Tag Bot (migrated from JIRA)

[trunk commit] Michael McCandless http://svn.apache.org/viewvc?view=revision&revision=1440137

LUCENE-4735: workaround IBM J9 JVM bug

asfimport commented 11 years ago

Commit Tag Bot (migrated from JIRA)

[branch_4x commit] Michael McCandless http://svn.apache.org/viewvc?view=revision&revision=1440143

LUCENE-4735: workaround IBM J9 JVM bug

asfimport commented 11 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

OK, this was fixed in IBM's JVM, sometime between this version:

java version "1.6.0"
Java(TM) SE Runtime Environment (build pxa6460sr9fp2ifix-20111111_05(SR9 FP2+IV03622+IV02378+IZ99243+IZ97310+IV00707))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr9-20111111_94827 (JIT enabled, AOT enabled)
J9VM - 20111111_094827
JIT  - r9_20101028_17488ifx45
GC   - 20101027_AA)
JCL  - 20110727_04

and this one:

java version "1.6.0"
Java(TM) SE Runtime Environment (build pxa6460sr12-20121025_01(SR12))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr12-20121024_126067 (JIT enabled, AOT enabled)
J9VM - 20121024_126067
JIT  - r9_20120914_26057
GC   - 20120928_AA)
JCL  - 20121014_01