Open · t163ang opened this issue 3 years ago
Settings:
{
  "type": "pinyin",
  "keep_first_letter": false,
  "keep_separate_first_letter": true,
  "keep_full_pinyin": false,
  "keep_original": false,
  "limit_first_letter_length": 16,
  "lowercase": true,
  "remove_duplicated_term": false,
  "ignore_pinyin_offset": false
}
Analyzing 阿莫西林 returns:
{
  "tokens": [
    { "token": "m", "start_offset": 1, "end_offset": 2, "type": "word", "position": 1 },
    { "token": "x", "start_offset": 2, "end_offset": 3, "type": "word", "position": 2 },
    { "token": "l", "start_offset": 3, "end_offset": 4, "type": "word", "position": 3 }
  ]
}
The first-letter token "a" for the first character, 阿, is missing from the output.
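In the code, a separate first-letter token is only generated when config.keepSeparateFirstLetter & pinyin.length() > 1 is true. The pinyin of 阿 is the single letter "a", so it fails the length check and no token is produced for it. With keepFullPinyin=false that means the character yields no token at all, which looks like a bug.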
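To make the effect of that condition concrete, here is a minimal sketch in Java. It is not the plugin's actual code: the class `SeparateFirstLetterDemo` and both methods are hypothetical, and the syllables for 阿莫西林 are hard-coded as "a", "mo", "xi", "lin". The second method shows one possible relaxation, on the assumption (a guess, not confirmed by the source) that the length check exists to avoid emitting a one-letter syllable twice when full pinyin is also kept.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the per-syllable check described in the issue;
// not the plugin's real implementation.
class SeparateFirstLetterDemo {

    // Mirrors the condition quoted in the issue: a first letter is emitted
    // only when the syllable is longer than one character.
    static List<String> withReportedCondition(List<String> syllables,
                                              boolean keepSeparateFirstLetter) {
        List<String> out = new ArrayList<>();
        for (String pinyin : syllables) {
            if (keepSeparateFirstLetter & pinyin.length() > 1) {
                out.add(String.valueOf(pinyin.charAt(0)));
            }
        }
        return out;
    }

    // Assumed relaxation: apply the length check only when full pinyin is
    // kept, so single-letter syllables still produce a token otherwise.
    static List<String> withRelaxedCondition(List<String> syllables,
                                             boolean keepSeparateFirstLetter,
                                             boolean keepFullPinyin) {
        List<String> out = new ArrayList<>();
        for (String pinyin : syllables) {
            if (keepSeparateFirstLetter
                    && !pinyin.isEmpty()
                    && (!keepFullPinyin || pinyin.length() > 1)) {
                out.add(String.valueOf(pinyin.charAt(0)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hard-coded syllables for 阿莫西林 (an assumption for this demo).
        List<String> syllables = List.of("a", "mo", "xi", "lin");
        System.out.println(withReportedCondition(syllables, true));
        System.out.println(withRelaxedCondition(syllables, true, false));
    }
}
```

With the reported condition the single-letter syllable "a" is skipped, matching the m, x, l output in the issue. The relaxed variant keeps it when keepFullPinyin is false. Whether that is the right fix depends on why the length check was added in the first place.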