dasmith / stanford-corenlp-python

Python wrapper for Stanford CoreNLP tools v3.4.1
GNU General Public License v2.0

Error when processing Chinese text #27

Closed: hitalex closed this issue 9 years ago

hitalex commented 9 years ago

After starting the server (with the trained Chinese models and properties file), I tested it with a Chinese sentence by replacing the example English sentence in client.py, i.e.:

#result = nlp.parse(u"Hello world!  It is so beautiful.")
result = nlp.parse(u"今天天气真不错啊!")

Traceback (most recent call last):
  File "client.py", line 17, in <module>
    result = nlp.parse(u"今天天气真不错啊!")
  File "client.py", line 13, in parse
    return json.loads(self.server.parse(text))
  File "/home/kqc/github/stanford-corenlp-python/jsonrpc.py", line 934, in call
    return self.req(self.name, args, kwargs)
  File "/home/kqc/github/stanford-corenlp-python/jsonrpc.py", line 907, in req
    resp = self.data_serializer.loads_response( resp_str )
  File "/home/kqc/github/stanford-corenlp-python/jsonrpc.py", line 626, in loads_response
    raise RPCInternalError(error_data)
jsonrpc.RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

Could you show me how to fix this?

hitalex commented 9 years ago

I fixed this problem myself. The problem is not in transporting the text but in the lines that use the pexpect module. Upgrade pexpect to v3 and use pexpect.spawnu so that UTF-8 text is handled properly. For details, see: http://pexpect.readthedocs.org/en/latest/api/pexpect.html#handling-unicode
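A minimal sketch of the idea, assuming pexpect 3.0 or later is installed on a POSIX system. Here `cat` stands in for the CoreNLP Java process, and `echo_through_child` is a hypothetical helper written for illustration, not a function from the wrapper. In the wrapper itself the change would presumably amount to replacing the `pexpect.spawn` call that launches CoreNLP with `pexpect.spawnu`.

```python
# -*- coding: utf-8 -*-


def echo_through_child(text, command="cat", timeout=5):
    """Send unicode `text` to a child process and return the echoed text.

    Hypothetical helper for illustration only. `cat` is a stand-in for
    the real CoreNLP command line.
    """
    # Imported here so this snippet still loads where pexpect is absent.
    import pexpect  # third-party; spawnu is available from pexpect 3.0

    # spawnu accepts unicode input and decodes the child's output as
    # UTF-8, whereas pexpect.spawn works with raw bytes by default.
    child = pexpect.spawnu(command, timeout=timeout)
    try:
        child.sendline(text)
        # Wait until the sentence comes back from the child's terminal.
        child.expect_exact(text)
        # After a successful expect_exact, `after` holds the matched text.
        return child.after
    finally:
        child.close(force=True)
```

Calling `echo_through_child(u"今天天气真不错啊!")` should hand back the same sentence as a unicode string, which is the round trip that failed with the byte-oriented spawn.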

wanglan0605 commented 9 years ago

When I try to parse Chinese text with more than 1500 or 2000 Chinese words, "result" comes back empty, just [ ]. Could you please show me how to fix it?