masaruh / elasticsearch-japanese-suggester


What is the reference of KeyStrokeMapping? #39

Open sabe5 opened 8 years ago

sabe5 commented 8 years ago

Hi,

I'd like to add more mappings from Microsoft to KeyStrokeMapping.json, e.g. セ:["se","ce"]. Do I have to customize the JSON file myself? And what reference was used for the KeyStrokeMapping content in the current implementation? If you'd like to add more from that Microsoft page, the following link may be useful. Thank you.

masaruh commented 8 years ago

I didn't use any reference. Thanks for the link; I'll use it. For now, if you'd like to add mappings, you need to edit the JSON file yourself.
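For readers who want to script that edit, here is a minimal sketch of merging extra keystrokes into an existing mapping. It assumes the file is a JSON object keyed by kana with a list of romaji keystrokes per key, as the セ:["se","ce"] example suggests; the actual layout of KeyStrokeMapping.json has not been checked here, and `merge_keystrokes` and the sample entries are hypothetical.

```python
import json


def merge_keystrokes(existing, extra):
    """Return a new mapping with the keystrokes from `extra` appended.

    Assumption: both arguments map a kana string to a list of romaji
    keystrokes. The inputs are not modified.
    """
    merged = {kana: list(strokes) for kana, strokes in existing.items()}
    for kana, strokes in extra.items():
        current = merged.setdefault(kana, [])
        for stroke in strokes:
            # Skip keystrokes that are already registered for this kana.
            if stroke not in current:
                current.append(stroke)
    return merged


# Hypothetical sample data, not the real file contents.
existing = {"セ": ["se"]}
extra = {"セ": ["se", "ce"], "シ": ["si", "shi"]}

merged = merge_keystrokes(existing, extra)
print(json.dumps(merged, ensure_ascii=False, indent=2))
```

In practice, `existing` would come from `json.load` on the real file and the result would be written back with `json.dump(..., ensure_ascii=False)` so the kana stay readable.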

sabe5 commented 8 years ago

Thank you for the info. Understood.

BTW, what's your plan for words with multiple readings? For example:

夫婦,1285,1285,1709,名詞,一般,,,,,夫婦,フウフ,フーフ
夫婦,1285,1285,7970,名詞,一般,,,,,夫婦,メオト,メオト
寒気,1285,1285,5221,名詞,一般,,,,,寒気,サムケ,サムケ
寒気,1285,1285,6036,名詞,一般,,,,,寒気,カンキ,カンキ

masaruh commented 8 years ago

Kuromoji itself doesn't really handle words with multiple readings. Besides, KuromojiSuggestTokenizer currently expects Kuromoji to emit only one tokenization.

That said, you can add the readings yourself, like "input": ["夫婦", "フウフ", "メオト"].
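As a rough sketch of how such an input list could be produced, the snippet below groups the readings from the dictionary lines quoted earlier in this thread by surface form. It assumes the reading is the twelfth comma-separated field, which matches those lines; `collect_inputs` and the `suggest` field name are hypothetical and not part of the plugin.

```python
import json

# Assumption: zero-based position of the reading in the quoted entries.
READING_FIELD = 11


def collect_inputs(dictionary_lines):
    """Map each surface form to [surface, reading1, reading2, ...]."""
    inputs = {}
    for line in dictionary_lines:
        fields = line.strip().split(",")
        if len(fields) <= READING_FIELD:
            continue  # skip lines that do not look like dictionary entries
        surface, reading = fields[0], fields[READING_FIELD]
        # The surface form itself is always the first input.
        values = inputs.setdefault(surface, [surface])
        if reading not in values:
            values.append(reading)
    return inputs


lines = [
    "夫婦,1285,1285,1709,名詞,一般,,,,,夫婦,フウフ,フーフ",
    "夫婦,1285,1285,7970,名詞,一般,,,,,夫婦,メオト,メオト",
    "寒気,1285,1285,5221,名詞,一般,,,,,寒気,サムケ,サムケ",
    "寒気,1285,1285,6036,名詞,一般,,,,,寒気,カンキ,カンキ",
]

inputs = collect_inputs(lines)

# "suggest" is a hypothetical field name; use the one from your own mapping.
document = {"suggest": {"input": inputs["夫婦"]}}
print(json.dumps(document, ensure_ascii=False))
```

The resulting document body can then be indexed as usual, so each reading is registered explicitly instead of relying on the tokenizer to produce it.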

sabe5 commented 8 years ago

Okay, thanks for the clarification!