infinilabs / analysis-pinyin

🛵 This Pinyin Analysis plugin is used to do conversion between Chinese characters and Pinyin.
Apache License 2.0
2.94k stars 547 forks

No pinyin first letter is produced for Chinese text that starts with a single-vowel syllable, such as "阿莫西林" or "鹅" #245

Open t163ang opened 3 years ago

t163ang commented 3 years ago

My settings are as follows:

{
  "type": "pinyin",
  "keep_first_letter": false,
  "keep_separate_first_letter": true,
  "keep_full_pinyin": false,
  "keep_original": false,
  "limit_first_letter_length": 16,
  "lowercase": true,
  "remove_duplicated_term": false,
  "ignore_pinyin_offset": false
}

Analyzing 阿莫西林 returns:

{
  "tokens": [
    { "token": "m", "start_offset": 1, "end_offset": 2, "type": "word", "position": 1 },
    { "token": "x", "start_offset": 2, "end_offset": 3, "type": "word", "position": 2 },
    { "token": "l", "start_offset": 3, "end_offset": 4, "type": "word", "position": 3 }
  ]
}

The pinyin first letter "a" for the first character, 阿, is missing from the output.

hsqStephenZhang commented 1 year ago

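The code only generates the first-letter token when config.keepSeparateFirstLetter & pinyin.length() > 1 holds. The pinyin of 阿 is "a" and the pinyin of 鹅 is "e", each one character long, so they fail the length check. With keepFullPinyin=false that looks like a problem to me, since no other token covers those characters.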
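
To make the effect of that condition concrete, here is a minimal stand-alone sketch in Java. It is not the plugin's code: the class `FirstLetterSketch` and both method names are hypothetical, and the syllable list is passed in by hand instead of coming from the plugin's pinyin lookup. `quotedCondition` models the check quoted above; `adjustedCondition` shows one possible relaxation, which is only an assumption about how this could be handled, not a confirmed fix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of the check discussed in this issue.
// None of these names exist in the analysis-pinyin plugin.
class FirstLetterSketch {

    // Models the quoted condition: a separate first-letter token is emitted
    // only when the syllable's pinyin is longer than one character.
    static List<String> quotedCondition(List<String> syllables,
                                        boolean keepSeparateFirstLetter) {
        List<String> tokens = new ArrayList<>();
        for (String pinyin : syllables) {
            if (keepSeparateFirstLetter & pinyin.length() > 1) {
                tokens.add(String.valueOf(pinyin.charAt(0)));
            }
        }
        return tokens;
    }

    // Assumed relaxation (not the maintainers' fix): when full pinyin is not
    // kept, also emit the first letter of one-character syllables, because
    // no other token would represent them.
    static List<String> adjustedCondition(List<String> syllables,
                                          boolean keepSeparateFirstLetter,
                                          boolean keepFullPinyin) {
        List<String> tokens = new ArrayList<>();
        for (String pinyin : syllables) {
            boolean longEnough = pinyin.length() > 1
                    || (!keepFullPinyin && !pinyin.isEmpty());
            if (keepSeparateFirstLetter && longEnough) {
                tokens.add(String.valueOf(pinyin.charAt(0)));
            }
        }
        return tokens;
    }
}
```

Calling `quotedCondition` with the syllables of 阿莫西林 (`a`, `mo`, `xi`, `lin`) and `keepSeparateFirstLetter = true` skips `a`, which matches the token list reported above. `adjustedCondition` with `keepFullPinyin = false` keeps it, and with `keepFullPinyin = true` it behaves like the quoted check.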