polm / cutlet

Japanese to romaji converter in Python
https://polm.github.io/cutlet/
MIT License

Cutlet converts こんにちは to Konnichiha instead of Konnichiwa #33

Closed: whiteeat closed this issue 1 year ago

whiteeat commented 1 year ago

Cutlet converts こんにちは to Konnichiha instead of Konnichiwa. Is this intentional behaviour or a bug? こんにちは should be read as Konnichiwa.

[Images: こんにちはGoogle, こんにちはCutlet]

polm commented 1 year ago

Thanks for reporting that. The issue seems to be that in UniDic 「こんにちは」 is a single token, so it's not possible to recognize the は as the particle in the usual fashion.
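
To make the tokenization point concrete, here is a toy model of why a single-token こんにちは ends in "ha". It is not cutlet's code and does not call UniDic. The kana table, the part-of-speech labels, and the function names are all made up for illustration. The only idea taken from the comment above is that は becomes "wa" when it is its own particle token.

```python
# Toy model only: NOT cutlet's implementation and NOT real UniDic output.
# The kana table and the part-of-speech labels are hypothetical.
KANA = {
    "こ": "ko", "れ": "re", "は": "ha",
    "ん": "n", "に": "ni", "ち": "chi",
}


def romanize_token(surface, pos):
    """Romanize one token; は reads "wa" only as a standalone particle."""
    if surface == "は" and pos == "particle":
        return "wa"
    # Otherwise spell out each kana, so a は inside a longer token stays "ha".
    return "".join(KANA[ch] for ch in surface)


def romanize(tokens):
    """Join romanized (surface, pos) tokens with spaces."""
    return " ".join(romanize_token(surface, pos) for surface, pos in tokens)


# は is its own particle token here, so the particle rule applies.
print(romanize([("これ", "pronoun"), ("は", "particle")]))

# こんにちは arrives as one token, so the particle rule never sees the は.
print(romanize([("こんにちは", "interjection")]))
```

In this sketch the first call yields "kore wa" and the second yields "konnichiha", which mirrors the behaviour reported in this issue.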

If this is an issue for your application, I would recommend registering an exception for こんにちは, as outlined in the README.
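
A minimal sketch of that workaround, assuming cutlet and a UniDic dictionary (for example `unidic-lite`) are installed. `Cutlet.add_exception` is the method the README describes for this purpose. The wrapper function name is hypothetical, and the exact capitalization of the result is not shown because it depends on cutlet's options.

```python
def romaji_with_greeting_fix(text):
    """Romanize `text` with cutlet, overriding the reading of こんにちは.

    Hypothetical helper for illustration; only Cutlet, add_exception and
    romaji come from the cutlet library itself.
    """
    import cutlet  # third-party; also needs a UniDic dictionary installed

    katsu = cutlet.Cutlet()
    # Map the token's surface form to a fixed romanization. This works here
    # because UniDic returns こんにちは as a single token.
    katsu.add_exception("こんにちは", "konnichiwa")
    return katsu.romaji(text)
```

Calling `romaji_with_greeting_fix("こんにちは")` should then produce a "konnichiwa" reading rather than "konnichiha". In a real application you would probably create the `Cutlet` object once and reuse it, rather than building it on every call.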

polm commented 1 year ago

This should be resolved in the demo now.