Closed by flywithyu 7 years ago
@flywithyu, hello! I don't think that NLPIR supports that. PyNLPIR just returns whatever NLPIR does behind the scenes.
You can see what NLPIR supports in its documentation: https://github.com/NLPIR-team/NLPIR/blob/master/NLPIR%20SDK/NLPIR-ICTCLAS/doc/NLPIR-ICTCLAS%E5%88%86%E8%AF%8D%E7%B3%BB%E7%BB%9F%E5%BC%80%E5%8F%91%E6%89%8B%E5%86%8C2016%E7%89%88.pdf
Thanks. I worked around this by looking up each keyword in the output of segment:
for word, pos in segments:
    if word == key_word:
        # Parts of speech to skip (matched as substrings of this string)
        r2 = ('名词:人名 名词:人名:汉语姓氏 名词:人名:汉语名字 名词:人名:日语人名 名词:人名:音译人名 '
              '名词:地名 名词:地名:音译地名 '
              '名词:机构团体名 名词:其它专名 '
              '动词 动词:名动词 动词:副动词 动词:不及物动词 动词:趋向动词 动词:行事动词 动词:动词性惯用语 '
              '形容词 形容词:副形词 形容词:形容词性惯用语 '
              '数词 数词:数量词 '
              '量词 量词:动量词 量词:时量词 '
              '副词 介词 连词 助词 叹词 语气词 拟声词 '
              '区别词 区别词:区别词性惯用语')
        # Write the keyword to the sheet if it is a plain noun
        # or its part of speech is not in the list above
        if pos == '名词' or pos not in r2:
            sheet1.write(i, 5, word)
            bWrite = False  # set the flag to False
        # Stop at the first matching segment either way
        break
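The same lookup can also be written as a small helper that maps every keyword to the part of speech of its first exact match in the segments. This is only a sketch: `keyword_pos` is a hypothetical name, and the sample `segments` list is made-up data shaped like the (word, part of speech) pairs that `pynlpir.segment` returns. Keywords that never appear as a single segment come back as `None`, which is the case where `get_key_words` and `segment` disagree.

```python
def keyword_pos(key_words, segments):
    """Map each keyword to the POS of its first exact match in segments.

    Keywords with no matching segment map to None.
    """
    first_pos = {}
    for word, pos in segments:
        # setdefault keeps the POS of the first occurrence only
        first_pos.setdefault(word, pos)
    return {kw: first_pos.get(kw) for kw in key_words}


# Made-up sample data in the shape of pynlpir.segment() output
segments = [("鸡精", "noun"), ("是", "verb"), ("调味品", "noun"), ("鸡精", "noun")]
result = keyword_pos(["鸡精", "味精"], segments)
print(result)
```

Building the dictionary once avoids rescanning the segments for every keyword.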
Another question: I now want to use a user-defined dictionary. But even after I add a word like "鸡精" to the dictionary, get_key_words and segment still return the word "鸡". How can I solve this problem? I have tried nlpir.ImportUserDict('D:/user.txt',True) and nlpir.AddUserWord('鸡精 n').
You might want to check your dict file format and file encoding. See this issue for more information: https://github.com/tsroten/pynlpir/issues/41
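One way to rule out encoding problems is to generate the dictionary file from Python with an explicit encoding. The sketch below rests on assumptions: `write_user_dict` is a hypothetical helper, the one-entry-per-line "word tag" layout is copied from the `AddUserWord('鸡精 n')` call above and not from the NLPIR manual, and UTF-8 is chosen to match the default encoding of `pynlpir.open`.

```python
import os
import tempfile


def write_user_dict(path, entries, encoding="utf-8"):
    """Write (word, tag) pairs one per line as 'word tag'.

    The 'utf-8' codec does not emit a byte order mark, and newline='\n'
    keeps line endings the same on every platform.
    """
    with open(path, "w", encoding=encoding, newline="\n") as f:
        for word, tag in entries:
            f.write(f"{word} {tag}\n")


# Write to a temporary directory so the example does not touch real files
dict_path = os.path.join(tempfile.mkdtemp(), "user.txt")
write_user_dict(dict_path, [("鸡精", "n")])

# Read the raw bytes back to confirm what actually landed on disk
with open(dict_path, "rb") as f:
    raw = f.read()
print(raw)
```

The resulting path is what would then be handed to `nlpir.ImportUserDict`. Under Python 3 the low-level functions in `pynlpir.nlpir` may need `bytes` arguments, not `str`, so that is worth checking too.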
When I use pynlpir.get_key_words, can I get the parts of speech of the keywords? I tried using pynlpir.segment, but found that the words returned by pynlpir.get_key_words and pynlpir.segment may differ. Thank you and best wishes.