Hi Vee,
I like that you offer a simple Python solution (other word segmentation libraries I tried gave me lots of compilation issues).
I tried wordcut on my own documents and the library works okay. However, I ran into many problems with ๆ: the segmenter would usually cut the text so that ๆ ended up mixed into the resulting words.
When I add "ๆ" as the first entry in the dictionary file, that appears to solve this kind of problem. Is that the right thing to do?
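
To illustrate what I think is going on, here is a toy sketch. It is not wordcut's actual code or API; `segment` is a hypothetical greedy longest-match function I wrote only for this example, and I am assuming that characters missing from the dictionary get lumped together with their unknown neighbours.

```python
def segment(text, dictionary):
    """Toy greedy longest-match segmenter (hypothetical, not wordcut's algorithm).

    Runs of characters that match no dictionary entry are merged into one token.
    """
    tokens = []
    unknown = ""
    i = 0
    while i < len(text):
        # Look for the longest dictionary entry starting at position i.
        match = ""
        for j in range(len(text), i, -1):
            if text[i:j] in dictionary:
                match = text[i:j]
                break
        if match:
            # Flush any pending unknown run before emitting the known word.
            if unknown:
                tokens.append(unknown)
                unknown = ""
            tokens.append(match)
            i += len(match)
        else:
            # No entry starts here: extend the unknown run by one character.
            unknown += text[i]
            i += 1
    if unknown:
        tokens.append(unknown)
    return tokens


YAMOK = "\u0e46"  # ๆ, THAI CHARACTER MAIYAMOK
known = "ดี"       # the only word in this toy dictionary
other = "นะ"       # deliberately left out of the dictionary

text = known + YAMOK + other

without = segment(text, {known})            # ๆ is glued to the unknown run
with_yamok = segment(text, {known, YAMOK})  # ๆ becomes its own token

print(without)
print(with_yamok)
```

In this toy version, ๆ is merged with the following unknown characters unless it is listed in the dictionary, which looks similar to what I see in my documents. I do not know whether wordcut behaves this way internally.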