Open keon opened 8 years ago
It depends on which algorithms you are extending. If you look around the library, you'll see a lot of files with a language suffix, such as _fr.js for French.
There's no formal documentation and not really a standard way of doing things yet, though it has been discussed (see #228 and previously #159).
@kkoch986 OK, I have been studying the code, but the Japanese tokenizer seems to be quite separate from the others. That makes sense, because Asian languages differ from Roman- and Greek-based language systems.
I think I am going to imitate what's written in the Japanese tokenizer, since Korean and Japanese are really similar. But I am not sure how this will be organized in the future.
I think one of the urgent things on the to-do list should be organizing the language system.
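For illustration only, here is a rough sketch of what a standalone Korean tokenizer could look like as a starting point. The class name `KoreanTokenizerSketch` and its `tokenize` method are hypothetical and are not taken from the library or from its Japanese tokenizer. It only splits text into runs of Hangul and runs of Latin letters or digits. Real Korean tokenization would also need to separate particles and endings from stems, which this does not attempt.

```javascript
// Hypothetical sketch: NOT the library's API and not modeled on its actual code.
// Splits text into runs of Hangul characters and runs of ASCII letters/digits,
// dropping whitespace and punctuation.
class KoreanTokenizerSketch {
  tokenize(text) {
    // \uAC00-\uD7A3: precomposed Hangul syllables
    // \u1100-\u11FF: Hangul Jamo
    // \u3130-\u318F: Hangul Compatibility Jamo
    const pattern = /[\uAC00-\uD7A3\u1100-\u11FF\u3130-\u318F]+|[A-Za-z0-9]+/g;
    // String.prototype.match returns null when nothing matches,
    // so fall back to an empty array.
    return text.match(pattern) || [];
  }
}

const tokenizer = new KoreanTokenizerSketch();
console.log(tokenizer.tokenize('안녕하세요, 세계! Hello 123'));
```

Each space-delimited chunk comes back as one token, so a noun with an attached particle stays together. Splitting those apart would need a dictionary or a morphological analyzer on top of this.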
I am planning to add Korean language support to this library. Is there any example or guide, besides the code in the package, that might help me add this functionality?