aymkam / lucene-gosen

Automatically exported from code.google.com/p/lucene-gosen
GNU Lesser General Public License v2.1

Tests fail using Java 7 #28

GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Using Java 1.7.0, 3 tests fail.

--
    [junit] NOTE: Mac OS X 10.7.3 amd64/Oracle Corporation 1.7.0_04-ea (64-bit)/cpus=4,threads=1,free=70361816,total=80609280
    [junit] ------------- ---------------- ---------------
    [junit] Testcase: testDecomposition3(net.java.sen.BasicDecompositionTest):  FAILED
    [junit] expected:<7> but was:<5>
    [junit] junit.framework.AssertionFailedError: expected:<7> but was:<5>
    [junit]     at net.java.sen.SenTestUtil.compareTokens(SenTestUtil.java:174)
    [junit]     at net.java.sen.BasicDecompositionTest.testDecomposition3(BasicDecompositionTest.java:181)
    [junit]     at org.apache.lucene.util.LuceneTestCase$2$1.evaluate(LuceneTestCase.java:432)
    [junit]     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:147)
    [junit]     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50)
    [junit] 
    [junit] 
    [junit] TEST net.java.sen.BasicDecompositionTest FAILED

    [junit] [BasicDecompositionTest, CommentFilterTest, CompositeTokenFilterTest, CompoundWordFilterTest, NumberFilterTest, OverrideFilterTest, ReadingProcessorTest, SentenceTest, SpaceTest, TrieSearcherTest, TestCharArrayIterator, TestGosenAnalyzer, TestGosenBasicFormFilter, TestGosenKatakanaStemFilter, TestGosenTokenizer]
    [junit] NOTE: Mac OS X 10.7.3 amd64/Oracle Corporation 1.7.0_04-ea (64-bit)/cpus=4,threads=1,free=169866504,total=227409920
    [junit] ------------- ---------------- ---------------
    [junit] Testcase: testTwoSentences(org.apache.lucene.analysis.gosen.TestGosenTokenizer):    FAILED
    [junit] term 3 expected:<マシュー[]> but was:<マシュー[・ホプキンス]>
    [junit] junit.framework.AssertionFailedError: term 3 expected:<マシュー[]> but was:<マシュー[・ホプキンス]>
    [junit]     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:123)
    [junit]     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:186)
    [junit]     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:202)
    [junit]     at org.apache.lucene.analysis.gosen.TestGosenTokenizer.testTwoSentences(TestGosenTokenizer.java:96)
    [junit]     at org.apache.lucene.util.LuceneTestCase$2$1.evaluate(LuceneTestCase.java:432)
    [junit]     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:147)
    [junit]     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50)
    [junit] 
    [junit] 
    [junit] Testcase: testDecomposition3(org.apache.lucene.analysis.gosen.TestGosenTokenizer):  FAILED
    [junit] term 3 expected:<マシュー[]> but was:<マシュー[・ホプキンス]>
    [junit] junit.framework.AssertionFailedError: term 3 expected:<マシュー[]> but was:<マシュー[・ホプキンス]>
    [junit]     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:123)
    [junit]     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:186)
    [junit]     at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:202)
    [junit]     at org.apache.lucene.analysis.gosen.TestGosenTokenizer.testDecomposition3(TestGosenTokenizer.java:71)
    [junit]     at org.apache.lucene.util.LuceneTestCase$2$1.evaluate(LuceneTestCase.java:432)
    [junit]     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:147)
    [junit]     at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50)
    [junit] 
    [junit] 
    [junit] TEST org.apache.lucene.analysis.gosen.TestGosenTokenizer FAILED

Original issue reported on code.google.com by johtani on 1 Apr 2012 at 6:23

GoogleCodeExporter commented 8 years ago
A change in the Unicode version causes this issue.

The General Category of U+30FB (・) is Connector_Punctuation in Java 6 (Unicode 4.0),
but Other_Punctuation in Java 7 (Unicode 6.0).
See: http://www.unicode.org/reports/tr44/tr44-4.html#Change_History
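
To check which category the running JVM assigns, here is a minimal sketch using only `Character.getType`. The class name `MiddleDotCategory` and the helper `categoryName` are made up for illustration and are not part of lucene-gosen or the attached patch. On Java 7 or later it should report Other_Punctuation for U+30FB; on Java 6 it should report Connector_Punctuation.

```java
// Illustrative only: not taken from lucene-gosen or issue28_punctuation.patch.
final class MiddleDotCategory {

    // Map the two Character.getType constants relevant to this issue
    // to their Unicode General Category names.
    static String categoryName(int type) {
        switch (type) {
            case Character.CONNECTOR_PUNCTUATION:
                return "Connector_Punctuation";
            case Character.OTHER_PUNCTUATION:
                return "Other_Punctuation";
            default:
                return "other (" + type + ")";
        }
    }

    public static void main(String[] args) {
        int middleDot = 0x30FB; // KATAKANA MIDDLE DOT
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("U+30FB: " + categoryName(Character.getType(middleDot)));

        // Print Character.getType for the whole Katakana block (0x30A0-0x30FF),
        // the same range the attached lists cover.
        for (int cp = 0x30A0; cp <= 0x30FF; cp++) {
            System.out.println(String.format("U+%04X %d", cp, Character.getType(cp)));
        }
    }
}
```

Running this under both JDKs and diffing the output is one way to spot any other code points in the block whose category differs.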

Attached files:
 issue28_punctuation.patch - patch file
 java6_katakana_list.txt - Java 6 Character.getType list (0x30A0-0x30FF)
 java7_katakana_list.txt - Java 7 Character.getType list (0x30A0-0x30FF)

Original comment by johtani on 4 Apr 2012 at 2:19

GoogleCodeExporter commented 8 years ago
Committed:
trunk r187, r188
4x r189

Original comment by johtani on 4 Apr 2012 at 2:59