tsroten / pynlpir

A Python wrapper around the NLPIR/ICTCLAS Chinese segmentation software.
MIT License

PyNLPIR/ICTCLAS Segmentation Tool #65

Closed by rafael75012 7 years ago

rafael75012 commented 7 years ago

I know PyNLPIR is a wrapper around ICTCLAS, but I can't find any confirmation of which corpora ICTCLAS was trained on. (Zhang, 2002) explains that it was the PKU corpus at the time, but I don't know whether that is still the case for ICTCLAS2015. Thanks a lot for any information you can provide.

tsroten commented 7 years ago

I would try reaching out to Dr. Kevin Zhang directly. I've communicated with him in the past and found him to be responsive. You can find his contact information on the NLPIR GitHub repo.

rafael75012 commented 7 years ago

Thanks,

He replied that the corpus is one month of the People's Daily.