mynlp / jigg

Pipeline framework for easy natural language processing
Apache License 2.0
74 stars, 20 forks

Japanese model #98

Closed sangtb92 closed 5 years ago

sangtb92 commented 5 years ago

Hi, I am interested in using Jigg with Japanese. Are you working on Japanese support? And how can I start building a Japanese model for NER? Thanks!

hiroshinoji commented 5 years ago

Thank you for your interest. If your goal is Japanese NER, the only current option is probably KNP, for which Jigg provides a wrapper with a more readable output format and parallel processing. Unfortunately, we do not currently support JUMAN++, which gives better results than JUMAN as a preprocessing step for KNP. We will add JUMAN++ to our TODO list and support it soon.

The basic option for a pipeline with JUMAN and KNP is -annotators "ssplit,juman,knp". Before running it, JUMAN and KNP must be installed on the machine and available on the PATH.
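As a rough sketch of that setup, the script below checks that both tools can be found on the PATH and then prints a candidate command line. Only the -annotators value comes from this thread. The executable names juman and knp, the classpath "jigg/*", and the main class jigg.pipeline.Pipeline are assumptions that may need adjusting for your install.

```python
import shutil

# Executables the "ssplit,juman,knp" pipeline is assumed to call.
# The names are an assumption based on the standard JUMAN and KNP installs.
REQUIRED_TOOLS = ("juman", "knp")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


def build_command(classpath="jigg/*", annotators="ssplit,juman,knp"):
    """Build the Jigg invocation as an argument list; nothing is executed."""
    # The classpath and main class are assumptions, not confirmed in this thread.
    return [
        "java", "-cp", classpath, "jigg.pipeline.Pipeline",
        "-annotators", annotators,
    ]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Run:", " ".join(build_command()))
```

Running the check first makes a missing JUMAN or KNP install easier to spot than a failure inside the pipeline itself.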

sangtb92 commented 5 years ago

Thank you for your suggestion! 👍