Open vietqhoang opened 9 years ago
Ah, the ambiguities of language! :)
All of the breakdowns here, both the actual and the expected ones, are valid parses of these sentences.
Personally I prefer the prefix お to be parsed as a separate word, but you could write some post-processing logic to combine prefix-お with the following word.
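A minimal sketch of that kind of post-processing, in Python for illustration. It works on plain `(surface, part-of-speech)` pairs rather than on any particular mecab binding, so the `Token` shape and the `merge_o_prefix` name are hypothetical, and it assumes an IPADIC-style dictionary where the prefix is tagged `接頭詞` (other dictionaries use a different tag).

```python
from typing import List, Tuple

# Hypothetical token shape: (surface form, top-level part of speech).
Token = Tuple[str, str]

# Assumption: IPADIC-style tag for prefixes; adjust for other dictionaries.
PREFIX_POS = "接頭詞"


def merge_o_prefix(tokens: List[Token]) -> List[Token]:
    """Join a prefix お with the token that follows it."""
    merged: List[Token] = []
    i = 0
    while i < len(tokens):
        surface, pos = tokens[i]
        is_o_prefix = surface == "お" and pos == PREFIX_POS
        if is_o_prefix and i + 1 < len(tokens):
            next_surface, next_pos = tokens[i + 1]
            # The merged token keeps the part of speech of the following word.
            merged.append((surface + next_surface, next_pos))
            i += 2
        else:
            # Anything else (including a trailing お) is passed through as-is.
            merged.append((surface, pos))
            i += 1
    return merged
```

You would feed it whatever your binding returns, mapped into those pairs, for example `merge_o_prefix([("お", "接頭詞"), ("茶", "名詞")])`.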
For やぐら and 煮っころがし you could add them as words to a custom dictionary, as I explained in #22. But even then there is no guarantee that mecab will parse them correctly; it depends on the cost values.
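To make the cost point concrete, here is a rough sketch that generates user-dictionary CSV rows for those two words. It assumes the IPADIC 13-column layout (surface, left context ID, right context ID, cost, then the part-of-speech fields, base form, reading, pronunciation). The helper names and the cost of `5000` are placeholders I made up, not recommended values. Leaving the context IDs empty relies on the dictionary compiler filling them in, which you should treat as an assumption and check against your mecab version.

```python
import csv
import io
from typing import Iterable, List, Tuple


def user_dict_row(surface: str, reading: str, cost: int = 5000) -> List[str]:
    """Build one IPADIC-style row for a general noun (hypothetical helper)."""
    # Context IDs are left empty on the assumption that the dictionary
    # compiler assigns them. A lower cost makes mecab more likely to
    # pick this entry over competing segmentations.
    return [surface, "", "", str(cost),
            "名詞", "一般", "*", "*", "*", "*",
            surface, reading, reading]


def build_csv(entries: Iterable[Tuple[str, str]], cost: int = 5000) -> str:
    """Render (surface, katakana reading) pairs as user-dictionary CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for surface, reading in entries:
        writer.writerow(user_dict_row(surface, reading, cost))
    return buf.getvalue()


if __name__ == "__main__":
    print(build_csv([("やぐら", "ヤグラ"), ("煮っころがし", "ニッコロガシ")]), end="")
```

If a word still gets split after the dictionary is compiled, lowering its cost is the usual knob to try, but how low it needs to go depends on the costs of the competing entries.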
Case 1
Actual:
Expected:
Case 2
Actual:
Expected:
Case 3
Actual:
Expected: