neologd / mecab-ipadic-neologd

Neologism dictionary based on the language resources on the Web for mecab-ipadic

Failed to build lucene-kuromoji because mecab-user-dict-seed.20180920.csv contains lines in an invalid format. #48

Closed by dAu6jARL 5 years ago

dAu6jARL commented 5 years ago

mecab-user-dict-seed.20180920.csv contains lines in an invalid CSV format, as follows.

line 2236780:

放虫,1283,1283,6095,名詞,サ変接続,**,*,*,放虫,ホウチュウ,ホーチュー,[unknown:_:17793 17615 6095]

line 2473240:

死着,1283,1283,6095,名詞,サ変接続,**,*,*, 死着,シチャク,シチャク,[unknown:_:13429 13295 6095]

neologd commented 5 years ago

Sorry for this error. It will be corrected when we release new data (mecab-user-dict-seed.20180927.csv) in a few hours.

The error was caused by a new process for generating a dictionary resource. We forgot to add a step that tests the output data from this new process. We have now added that test.
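For readers who want to check a seed file themselves before building, here is a minimal sketch of what such an output test might look like. It is not the project's actual test. The function name `find_invalid_rows`, the expected count of 13 fields (surface form, left ID, right ID, cost, plus nine feature columns in the mecab-ipadic layout), and the specific checks are assumptions made for illustration, and the sample rows in the usage note are hypothetical.

```python
import csv
import re

# Assumption: mecab-ipadic style rows have 13 columns
# (surface, left id, right id, cost, and 9 feature columns).
EXPECTED_FIELDS = 13

_INT = re.compile(r"-?[0-9]+")


def find_invalid_rows(lines, expected_fields=EXPECTED_FIELDS):
    """Return (line_number, reason) pairs for rows that fail basic checks.

    `lines` is any iterable of strings, for example an open text file.
    """
    problems = []
    reader = csv.reader(lines)
    for row in reader:
        line_no = reader.line_num
        if len(row) != expected_fields:
            problems.append(
                (line_no, f"expected {expected_fields} fields, got {len(row)}")
            )
            continue
        # Columns 2 to 4 (left id, right id, cost) should be plain integers.
        if not all(_INT.fullmatch(cell) for cell in row[1:4]):
            problems.append((line_no, "left id, right id and cost must be integers"))
            continue
        # Flag empty fields and fields with leading or trailing whitespace.
        if any(cell == "" or cell != cell.strip() for cell in row):
            problems.append((line_no, "empty field or stray whitespace"))
    return problems
```

A file could be checked with `with open(path, encoding="utf-8", newline="") as f: print(find_invalid_rows(f))`. Which checks are appropriate depends on what the consuming tool (here lucene-kuromoji) actually rejects, so the rules above are a starting point and not a specification.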

dAu6jARL commented 5 years ago

I confirmed that lucene-kuromoji builds successfully with mecab-user-dict-seed.20180927.csv. Thank you for your response and solution.