Open ScoreUnder opened 6 years ago
Thanks for the detailed feedback! Do you know whether the same problem appears in Kuromoji?
On the other hand, I can imagine cases where someone would prefer 一丁目 to be transliterated as 1chome rather than Icchome.
Another problem is that the program outputs kanji (such as 五Hiki in your example). I am not sure why, but that is a big problem indeed.
Cheers!
Related: https://github.com/atilika/kuromoji/issues/125
Apparently switching to UniDic (which might be as simple as modifying pom.xml) would solve that particular case, but it might perform worse in other areas.
As the title says.
I've been looking for a library that breaks kanji down into their readings (preferably hiragana), and my first test for each candidate is to see how it fares with the 〜匹 counters.
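For context, here is a minimal sketch of why this counter makes a demanding test case. It assumes the standard dictionary readings of 1匹 through 10匹; the table and function names (`NUMERAL_READINGS`, `HIKI_READINGS`, `naive_reading`) are illustrative only and do not belong to any library mentioned in this thread.

```python
# Standard hiragana readings of the numerals 1-10 (illustrative table).
NUMERAL_READINGS = {
    1: "いち", 2: "に", 3: "さん", 4: "よん", 5: "ご",
    6: "ろく", 7: "なな", 8: "はち", 9: "きゅう", 10: "じゅう",
}

# Standard readings of N匹. Note the sound changes: ひき becomes
# ぴき (with a doubled consonant) after 1, 6, 8, 10 and びき after 3.
# 十匹 is also read じっぴき; じゅっぴき is the common modern form.
HIKI_READINGS = {
    1: "いっぴき", 2: "にひき", 3: "さんびき", 4: "よんひき", 5: "ごひき",
    6: "ろっぴき", 7: "ななひき", 8: "はっぴき", 9: "きゅうひき", 10: "じゅっぴき",
}


def naive_reading(n: int) -> str:
    """Concatenate the numeral reading and ひき with no sound change."""
    return NUMERAL_READINGS[n] + "ひき"


# Numbers for which plain concatenation gives the wrong reading.
irregular = [n for n in range(1, 11) if naive_reading(n) != HIKI_READINGS[n]]

if __name__ == "__main__":
    for n in range(1, 11):
        print(n, naive_reading(n), HIKI_READINGS[n])
    print("irregular:", irregular)
```

Half of the first ten readings differ from plain concatenation, so a converter that reads the numeral and the counter as separate tokens will get those cases wrong.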
For reference, this is the expected output for the first 10:
However, this is the output the program creates: