fxsjy / jieba

Jieba Chinese text segmentation
MIT License
33.39k stars 6.73k forks

bug: Text emoticons added to the custom dictionary have no effect #929

Open idiomer opened 3 years ago

idiomer commented 3 years ago

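As shown below, custom emoticons that contain brackets can be added to the dictionary, but segmentation does not keep them as single tokens: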

import jieba
biaoqing_list = ['[捂脸]', '[doge]',  '___捂脸___',  '___doge___']
for x in biaoqing_list:
    jieba.add_word(x, freq=10000, tag='nz')
print(jieba.user_word_tag_tab)
print(jieba.lcut('[捂脸][doge]哈哈哈___捂脸___和___doge___'))

# {'[捂脸]': 'nz', '[doge]': 'nz', '___捂脸___': 'nz', '___doge___': 'nz'}
# ['[', '捂脸', ']', '[', 'doge', ']', '哈哈哈', '___捂脸___', '和', '___doge___']
ZionTsao commented 3 months ago

May I ask whether a solution has been found?
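
One possible workaround, offered as a sketch rather than a confirmed fix. The output above suggests that jieba splits the input into blocks with a regular expression before it consults the dictionary, and that `[` and `]` are not among the characters that regex keeps together, while `_` is. That would explain why `___捂脸___` survives and `[捂脸]` does not, but this is an inference from the output, not something verified against the jieba source. Under that assumption, the emoticons can be split out before the text reaches jieba, so the tokenizer never sees them. The helper name `cut_with_protected` and its parameters are hypothetical and not part of jieba.

```python
import re
from typing import Callable, Iterable, List


def cut_with_protected(
    text: str,
    protected: Iterable[str],
    cut: Callable[[str], List[str]],
) -> List[str]:
    """Segment `text` with `cut`, keeping every string in `protected` whole.

    Hypothetical helper, not part of jieba. `cut` is any tokenizer
    function, for example jieba.lcut.
    """
    # Longest first, so a longer emoticon wins over a shorter one
    # that happens to be its prefix.
    tokens = sorted(set(protected), key=len, reverse=True)
    if not tokens:
        return list(cut(text))

    # The capturing group makes re.split return the matched emoticons too:
    # even indices hold ordinary text, odd indices hold protected tokens.
    pattern = re.compile("(" + "|".join(re.escape(t) for t in tokens) + ")")

    result: List[str] = []
    for i, piece in enumerate(pattern.split(text)):
        if not piece:
            continue  # empty string between two adjacent emoticons
        if i % 2 == 1:
            result.append(piece)       # protected token, emitted unchanged
        else:
            result.extend(cut(piece))  # ordinary text, passed to the tokenizer
    return result


if __name__ == "__main__":
    # Stand-in tokenizer (one character per token) so the sketch runs
    # without jieba installed; in practice pass jieba.lcut instead.
    print(cut_with_protected("[捂脸][doge]哈哈哈", ["[捂脸]", "[doge]"], list))
```

With jieba installed, the call would be `cut_with_protected(text, biaoqing_list, jieba.lcut)`. The emoticons are then returned as single tokens regardless of how jieba treats brackets, at the cost of bypassing the dictionary (and its `tag`) for those tokens.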