tsroten / pynlpir

A Python wrapper around the NLPIR/ICTCLAS Chinese segmentation software.
MIT License
566 stars, 135 forks

Unrecognized POS key #20

Closed. lijianbo0130 closed this issue 9 years ago.

lijianbo0130 commented 9 years ago

    u'\u7ea2\u70e7\u8089'
    Traceback (most recent call last):
      File "C:\Users\asus\workspace\py2\src\NLP\np\�ִ�.py", line 23, in <module>
        b=pynlpir.segment(s)
      File "C:\Python27\lib\site-packages\pynlpir\__init__.py", line 183, in segment
        pos_name = _get_pos_name(token[1], pos_names, pos_english)
      File "C:\Python27\lib\site-packages\pynlpir\__init__.py", line 129, in _get_pos_name
        pos_name = pos_map.get_pos_name(code, name, english)
      File "C:\Python27\lib\site-packages\pynlpir\pos_map.py", line 183, in get_pos_name
        return _get_pos_name(code, name, english)
      File "C:\Python27\lib\site-packages\pynlpir\pos_map.py", line 149, in _get_pos_name
        pos_code)
    ValueError: part of speech not recognized: 'gms'

I don't know why this happens. If I split the text into three separate pieces, there is no error, but when I combine them into one sentence, the error occurs. I tried segmenting the same text at http://202.38.128.96:96/nlpir/ and nothing went wrong there. My guess is that something is wrong with the decoding.

tsroten commented 9 years ago

Thanks for reporting this bug. It seems that NLPIR is returning an undocumented part of speech code (I can't find any indication of what 'gms' might mean). In cases like this, the part of speech should be set to None instead of raising an error. I'll commit a fix for this shortly.
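For readers following along, the behaviour described above (an unknown code maps to None rather than raising ValueError) could look roughly like the sketch below. This is not the actual pynlpir fix: `POS_NAMES`, `lookup_pos_name`, and `name_tokens` are hypothetical names made up for illustration, and the table is a tiny stand-in for the real part of speech map.

```python
# Hypothetical, heavily trimmed POS table (assumption: the real pynlpir
# map is much larger and structured differently).
POS_NAMES = {
    'n': 'noun',
    'v': 'verb',
    'a': 'adjective',
}


def lookup_pos_name(code):
    """Return a readable name for ``code``, or None if the code is unknown."""
    if code is None:
        return None
    try:
        return POS_NAMES[code]
    except KeyError:
        # Undocumented code such as 'gms': fall back to None instead of
        # raising ValueError, so segmentation can still return its tokens.
        return None


def name_tokens(tokens):
    """Map (word, code) pairs to (word, name) pairs, tolerating unknown codes."""
    return [(word, lookup_pos_name(code)) for word, code in tokens]
```

With a lookup like this, a token tagged with an unrecognized code keeps its text and simply carries None as its part of speech, so callers that need the tag have to check for None themselves.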

lijianbo0130 commented 9 years ago

thank you!