KayShen closed this issue 5 years ago
@KayShen Thanks for pointing this out.
For anyone that is interested, I'd be happy to accept a pull request. Here is the file that needs to be updated: https://github.com/tsroten/pynlpir/blob/develop/pynlpir/pos_map.py
Some of the new tags were added to PyNLPIR recently (on PyPI). Also, there is a new pos_names option: 'raw'. This simply returns whatever NLPIR provides as the part of speech tag. It is a workaround for any other tags that might be missing.
Also, like before, you can still pass your own part of speech mapping in with any missing tags.
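A minimal sketch of how the raw workaround could be combined with your own names for missing tags. It assumes the raw output is a list of (token, tag) pairs. The sample pairs, the name_tags helper, and the names in EXTRA_TAGS are all hypothetical placeholders, not real NLPIR output or official tag meanings.

```python
# Hypothetical (token, raw_tag) pairs, shaped like what
# pynlpir.segment(text, pos_names='raw') is assumed to return.
# These are illustrative, not actual NLPIR output.
raw_tokens = [("北京日报", "gtw"), ("的", "ude1")]

# User-supplied names for tags missing from the built-in map.
# The names are placeholders, not NLPIR's official meanings.
EXTRA_TAGS = {
    "gtw": "custom name for gtw",
    "gwheb": "custom name for gwheb",
}


def name_tags(tokens, extra_tags):
    """Replace each raw tag with a custom name; keep unknown tags unchanged."""
    return [(word, extra_tags.get(tag, tag)) for word, tag in tokens]


named = name_tags(raw_tokens, EXTRA_TAGS)
for word, tag in named:
    print(word, tag)
```

Tags that are not in the custom dictionary fall through unchanged, so nothing is lost when NLPIR introduces another tag.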
As updated in NLPIR: https://github.com/NLPIR-team/NLPIR/blob/5e4f0a6a35906472a8ddd4f8457c13eb03174204/NLPIR%20SDK/DocExtractor/Data/UserDefinedDict.lst
POS tags such as 'gtw', 'gwheb', 'grjyy', etc. are not recognized in this version.
For example: pynlpir.segment(u"接受党和国家领导人接见接受央视北京卫视北京日报新京报世纪英语报等媒体的采访")