Pomax / node-jp-conversion

Japanese input conversion, turning any string into {kanji, katakana, hiragana, romaji} where possible.

Porting to Rust #1

Open kitallis opened 5 years ago

kitallis commented 5 years ago

Hi @Pomax,

I reached out to you on IRCHighway#nihongo but I think that place is pretty dead. I've recently started learning both Rust and Japanese (nrGrammar is fantastic!) and decided to attempt this: https://github.com/kitallis/konj.

It's still WIP, but a basic romaji → kana transformation works.
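For readers who want a concrete picture, a basic romaji → kana transformation can be sketched as a greedy longest-match table lookup. The Python below is only an illustration under stated assumptions: it is not the code in konj or node-jp-conversion, the table is a small subset of the syllabary, and `romaji_to_hiragana` is a hypothetical name. It does not handle ambiguous input such as an `n` followed by a vowel.

```python
# Illustrative subset of a romaji -> hiragana table (assumption: not complete).
ROMAJI_TO_HIRAGANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "ga": "が", "gi": "ぎ", "gu": "ぐ", "ge": "げ", "go": "ご",
    "sa": "さ", "shi": "し", "su": "す", "se": "せ", "so": "そ",
    "ta": "た", "chi": "ち", "tsu": "つ", "te": "て", "to": "と",
    "na": "な", "ni": "に", "nu": "ぬ", "ne": "ね", "no": "の",
    "ha": "は", "hi": "ひ", "fu": "ふ", "he": "へ", "ho": "ほ",
    "ma": "ま", "mi": "み", "mu": "む", "me": "め", "mo": "も",
    "ya": "や", "yu": "ゆ", "yo": "よ",
    "ra": "ら", "ri": "り", "ru": "る", "re": "れ", "ro": "ろ",
    "wa": "わ", "wo": "を", "n": "ん",
}
MAX_KEY_LENGTH = max(len(key) for key in ROMAJI_TO_HIRAGANA)

# Consonants that produce a small っ when doubled ("n" is excluded on purpose).
SOKUON_CONSONANTS = set("bcdfghjkmprstwyz")


def romaji_to_hiragana(text):
    """Convert romaji to hiragana, trying the longest table key first."""
    text = text.lower()
    out = []
    i = 0
    while i < len(text):
        # A doubled consonant such as the "tt" in "kitte" becomes っ.
        if (i + 1 < len(text) and text[i] == text[i + 1]
                and text[i] in SOKUON_CONSONANTS):
            out.append("っ")
            i += 1
            continue
        for size in range(MAX_KEY_LENGTH, 0, -1):
            chunk = text[i:i + size]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += size
                break
        else:
            # No table entry matched: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Trying the longest key first is what lets `shi` win over `s` + `hi`; anything the table does not cover is passed through untouched rather than guessed at.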

Let me know if you have thoughts on how you'd implement this if you were to give it another shot.

I'd also like to hear your thoughts on how feasible a rough (predictive) kana → kanji transformation is.

Pomax commented 5 years ago

There's no "prediction" when it comes to kana → kanji conversion: it's as hard as converting spoken Japanese into written Japanese. You need a full corpus tree (so you can find candidate kanji forms based on kana sequences), and to get it right for longer terms you typically also need a grammar parser to make sure that your conversion makes sense.
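As a rough illustration of the corpus-tree idea, the candidate lookup can be sketched as a trie keyed on kana, where each node stores the written forms whose reading ends there. This is a minimal Python sketch under assumptions: `ReadingTrie` and its methods are hypothetical names, and the four entries stand in for a full dictionary.

```python
class ReadingTrie:
    """Trie keyed on kana; each node holds written forms for that reading."""

    def __init__(self):
        self.children = {}
        self.forms = []

    def insert(self, reading, form):
        node = self
        for kana in reading:
            node = node.children.setdefault(kana, ReadingTrie())
        node.forms.append(form)

    def candidates(self, kana, start=0):
        """Return (end_index, form) for every entry whose reading is a
        prefix of kana[start:]."""
        node = self
        results = []
        for i in range(start, len(kana)):
            node = node.children.get(kana[i])
            if node is None:
                break
            for form in node.forms:
                results.append((i + 1, form))
        return results


# Toy entries (assumption: a real corpus would have many thousands).
trie = ReadingTrie()
trie.insert("はし", "橋")    # bridge
trie.insert("はし", "箸")    # chopsticks
trie.insert("はし", "端")    # edge
trie.insert("はしる", "走る")  # to run

matches = trie.candidates("はしる")
```

Even this toy input yields four candidates for one short kana sequence, which is why the tree alone is not enough and a grammar parser is needed to choose between them.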

If I were to redo this, I'd probably make kana-to-kanji a two-step process: a tree-based client-side lookup for simple conversions, and a server-side process that runs the input through a parser like MeCab and returns its result.
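The two-step split could look roughly like the Python sketch below. Everything here is an assumption for illustration: `convert` is a hypothetical function, the local table is a plain dict standing in for the client-side tree, and `remote_parse` is a placeholder callable for whatever the server does with its parser. No real MeCab API is shown.

```python
from typing import Callable, Dict, List


def convert(
    kana: str,
    local_table: Dict[str, List[str]],
    remote_parse: Callable[[str], List[str]],
) -> List[str]:
    """Try the cheap local lookup first; fall back to the server-side parser."""
    local_hits = local_table.get(kana)
    if local_hits:
        return local_hits
    # Assumption: the server runs a morphological parser and returns candidates.
    return remote_parse(kana)


# Stand-in for the server so the sketch runs without a network or a parser.
server_calls = []


def fake_remote_parse(kana: str) -> List[str]:
    server_calls.append(kana)
    return ["<server result for " + kana + ">"]


local_table = {"はし": ["橋", "箸", "端"]}

simple = convert("はし", local_table, fake_remote_parse)
complex_input = convert("はしをわたる", local_table, fake_remote_parse)
```

Passing the server call in as a parameter keeps the local path usable offline and makes it easy to swap in a real parser-backed endpoint later.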