tsroten / pynlpir

A Python wrapper around the NLPIR/ICTCLAS Chinese segmentation software.
MIT License

pynlpir.segment(text, pos_names='all') fails for certain texts when pynlpir.segment(text) doesn't fail #52

Closed kensk8er closed 8 years ago

kensk8er commented 8 years ago

`pynlpir.segment(text, pos_names='all')` fails for certain texts that `pynlpir.segment(text)` handles without error, which I don't think is the expected behaviour.

For example,

When text = u'其中,新增了甲卡西酮、曲马多、安钠咖等12种新类型毒品的定罪量刑数量标准,并下调了在我国危害较为严重的毒品氯胺酮的定罪量刑数量标 准。'

`pynlpir.segment(text, pos_names='all')` raises the following error:

    No handlers could be found for logger "pynlpir.pos_map"

    TypeError Traceback (most recent call last)
    in ()
    ----> 1 pynlpir.segment(text, pos_names='all')

    /Users/kensk8er/anaconda/envs/env2/lib/python2.7/site-packages/pynlpir/__init__.pyc in segment(s, pos_tagging, pos_names, pos_english)
        203 token = (token[0], None)
        204 if pos_names is not None and token[1] is not None:
    --> 205 pos_name = _get_pos_name(token[1], pos_names, pos_english)
        206 token = (token[0], pos_name)
        207 tokens[i] = token

    /Users/kensk8er/anaconda/envs/env2/lib/python2.7/site-packages/pynlpir/__init__.pyc in _get_pos_name(code, name, english, delimiter)
        148
        149 """
    --> 150 pos_name = pos_map.get_pos_name(code, name, english)
        151 return delimiter.join(pos_name) if name == 'all' else pos_name
        152

    /Users/kensk8er/anaconda/envs/env2/lib/python2.7/site-packages/pynlpir/pos_map.pyc in get_pos_name(code, name, english)
        182
        183 """
    --> 184 return _get_pos_name(code, name, english)

    /Users/kensk8er/anaconda/envs/env2/lib/python2.7/site-packages/pynlpir/pos_map.pyc in _get_pos_name(pos_code, names, english, pos_map)
        158 "look for child name for '%s'" % (pos_entry[1], pos_code))
        159 sub_pos = _get_pos_name(pos_code, names, english, sub_map)
    --> 160 pos = pos + sub_pos if names == 'all' else (sub_pos, )
        161 name = pos if names == 'all' else pos[-1]
        162 logger.debug("Part of speech name found: '%s'" % repr(name)

    TypeError: can only concatenate tuple (not "NoneType") to tuple

I don't fully understand the code, but it looks like a check for whether `sub_pos` is `None` is missing before the concatenation on line 160, along with appropriate handling for that case.

The issue occurs on both Ubuntu 14.04 and OSX 10.11.04. I installed `pynlpir` with `pip install pynlpir`.
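To illustrate the kind of `None` check I mean, here is a minimal, self-contained sketch. It does not use pynlpir's real code or data: `TOY_POS_MAP`, `get_names`, and the `guard` flag are all hypothetical names, and the nested map shape is only my guess based on the traceback. It reproduces the same `TypeError` when a child lookup returns `None`, and shows one possible fallback (keeping the parent name).

```python
# Hypothetical, simplified POS map: code -> (name, optional sub-map).
# This is NOT pynlpir's real data; it only mimics the nested lookup
# that the traceback suggests.
TOY_POS_MAP = {
    'n': ('noun', {
        'nr': ('personal name', None),
    }),
}


def get_names(code, pos_map=TOY_POS_MAP, depth=1, guard=True):
    """Return a tuple of names (parent first), or None if nothing matches.

    With guard=False, a missing child name triggers the same TypeError
    as in the traceback, because a tuple is concatenated with None.
    """
    entry = pos_map.get(code[:depth])
    if entry is None:
        return None
    name, sub_map = entry
    pos = (name,)
    if sub_map is None or len(code) == depth:
        return pos
    sub_pos = get_names(code, sub_map, depth + 1, guard)
    if guard and sub_pos is None:
        # Assumed fallback: keep the parent name when no child is found.
        return pos
    return pos + sub_pos


print(get_names('nr'))  # known parent and known child
print(get_names('nx'))  # known parent, unknown child: falls back to parent
try:
    get_names('nx', guard=False)
except TypeError as exc:
    print('unguarded lookup failed:', exc)
```

Whether falling back to the parent name is the right behaviour for pynlpir (as opposed to returning `None` for the whole token) is a decision for the maintainer; the sketch only shows that checking `sub_pos` before concatenating avoids the crash.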