jimczi / nori

A set of HOW-TOs to help customize Nori, Lucene's Korean analyzer

custom-dic ant regenerate java exception #1

Open dongshik opened 5 years ago

dongshik commented 5 years ago

When I run "ant regenerate" for Nori, an exception occurs. I only extracted the dictionary with "tar xvf mecab-ko-dic-2.0.3-20170922.tar.gz". I did not add a custom dictionary; I just ran "ant regenerate" as a build test.

The output looks like this:

build-dict:
    [java] dictionary builder
    [java]
    [java] input directory: /Users/dongsik/git/develop_with_elk/lucene-solr/lucene/build/analysis/nori/mecab-ko-dic-2.0.3-20170922
    [java] output directory: /Users/dongsik/git/develop_with_elk/lucene-solr/lucene/analysis/nori/src/resources
    [java] input encoding: utf-8
    [java] normalize entries: false
    [java]
    [java] building tokeninfo dict...
    [java] parse...
    [java] sort...
    [java] encode...
    [java] Exception in thread "main" java.lang.AssertionError
    [java] at org.apache.lucene.analysis.ko.util.BinaryDictionaryWriter.put(BinaryDictionaryWriter.java:120)
    [java] at org.apache.lucene.analysis.ko.util.TokenInfoDictionaryBuilder.buildDictionary(TokenInfoDictionaryBuilder.java:123)
    [java] at org.apache.lucene.analysis.ko.util.TokenInfoDictionaryBuilder.build(TokenInfoDictionaryBuilder.java:64)
    [java] at org.apache.lucene.analysis.ko.util.DictionaryBuilder.build(DictionaryBuilder.java:30)
    [java] at org.apache.lucene.analysis.ko.util.DictionaryBuilder.main(DictionaryBuilder.java:64)

BUILD FAILED
/Users/dongsik/git/develop_with_elk/lucene-solr/lucene/analysis/nori/build.xml:83: Java returned: 1

Total time: 8 seconds

ylee-cs commented 4 years ago

I am experiencing the same issue. It occurs when building the dictionary for unk.def. It is not clear whether the issue is caused by Nori or by the mecab Korean dictionary (mecab-ko-dic). However, I could rebuild the Kuromoji analyzer successfully. Because Nori is derived from Kuromoji and the two analyzers are very similar, I suspect the problem is related to the mecab Korean dictionary.
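
One way to test the suspicion that the dictionary data is at fault is to check whether every row in each dictionary file has the same number of fields. The Python sketch below is only a diagnostic aid, not part of Nori or mecab-ko-dic. The helper names `column_counts` and `irregular_files` are hypothetical, and it assumes that unk.def is comma-separated like the *.csv files. A file with mixed field counts would be a candidate for closer inspection; a clean report would not rule out other causes of the AssertionError.

```python
import csv
from collections import Counter
from pathlib import Path


def column_counts(dict_dir, patterns=("*.csv", "unk.def"), encoding="utf-8"):
    """Map each matching file name in dict_dir to a Counter of fields per row.

    The default patterns are an assumption: the *.csv entries plus unk.def,
    both treated as comma-separated text.
    """
    report = {}
    for pattern in patterns:
        for path in sorted(Path(dict_dir).glob(pattern)):
            counts = Counter()
            with path.open(encoding=encoding, newline="") as handle:
                # csv.reader handles quoted fields that contain commas.
                for row in csv.reader(handle):
                    counts[len(row)] += 1
            report[path.name] = counts
    return report


def irregular_files(report):
    """Return the names of files whose rows do not all share one field count."""
    return [name for name, counts in report.items() if len(counts) > 1]
```

To use it, call `column_counts` with the extracted dictionary directory (the "input directory" shown in the build log) and pass the result to `irregular_files`.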