Closed: dwbaron closed this issue 3 years ago
Thanks for your suggestions. We will fix the problem where proper nouns made up of digits and English letters are not detected.
As a temporary fix, I tried combining a trie (which performs exact matching) with HMM segmentation.
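A minimal sketch of that trie-plus-fallback idea, not the repo's actual code: terms in a user dictionary are matched exactly (longest match first), and only the text between matches is handed to a fallback segmenter. The names `Trie`, `segment`, and `fallback` are hypothetical, and the default fallback (`list`, which splits into single characters) is only a stand-in for the HMM segmenter.

```python
from typing import Callable, Dict, List


class Trie:
    """Minimal character trie for exact matching of dictionary terms."""

    def __init__(self) -> None:
        self.children: Dict[str, "Trie"] = {}
        self.is_word = False

    def insert(self, word: str) -> None:
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def longest_match(self, text: str, start: int) -> int:
        """Return the end index of the longest term starting at `start`, or -1."""
        node, end = self, -1
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.is_word:
                end = i + 1
        return end


def segment(
    text: str,
    trie: Trie,
    fallback: Callable[[str], List[str]] = list,  # stand-in for an HMM segmenter
) -> List[str]:
    """Keep dictionary terms whole; send everything else to `fallback`."""
    tokens: List[str] = []
    pending = 0  # start of the span not yet handed to the fallback
    i = 0
    while i < len(text):
        end = trie.longest_match(text, i)
        if end == -1:
            i += 1
            continue
        if pending < i:
            tokens.extend(fallback(text[pending:i]))
        tokens.append(text[i:end])
        i = pending = end
    if pending < len(text):
        tokens.extend(fallback(text[pending:]))
    return tokens


trie = Trie()
for term in ("02进出04", "5G"):
    trie.insert(term)

print(segment("买入02进出04和5G概念", trie))
```

With the stand-in fallback, the two dictionary terms come out as single tokens and the rest is split per character; swapping in a real segmenter only requires passing a different `fallback` callable.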
It seems that you first split the sentence on Chinese characters. Would splitting on English characters work better?
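To make the question concrete, here is a rough sketch of splitting on ASCII letters and digits instead. It assumes nothing about the repo's internals; `split_blocks` and `ALNUM_RUN` are hypothetical names used only for illustration.

```python
import re
from typing import List, Tuple

# Runs of ASCII letters and digits, e.g. "5G" or "02".
ALNUM_RUN = re.compile(r"[A-Za-z0-9]+")


def split_blocks(sentence: str) -> List[Tuple[str, bool]]:
    """Split into (text, is_alnum) blocks, keeping each alphanumeric run whole."""
    blocks: List[Tuple[str, bool]] = []
    pos = 0
    for m in ALNUM_RUN.finditer(sentence):
        if m.start() > pos:
            blocks.append((sentence[pos:m.start()], False))
        blocks.append((m.group(), True))
        pos = m.end()
    if pos < len(sentence):
        blocks.append((sentence[pos:], False))
    return blocks


print(split_blocks("支持5G网络"))
```

This keeps a term like "5G" in one block, but a mixed term such as "02进出04" would still be cut into "02", "进出", and "04", so an exact-match dictionary lookup is still needed for those cases.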
I am trying to solve this English-character problem using the approach I described above.
Thanks for your suggestions! I will fix the problem when I am free. If you are willing to contribute to this repo, you can create a PR. I look forward to your contributions!
I found that bond abbreviations such as "02进出04" and special terms such as "5G" get broken apart during word segmentation.