snipsco / snips-nlu

Snips Python library to extract meaning from text
https://snips-nlu.readthedocs.io
Apache License 2.0

Could you please add Chinese language? #857

Open martinchai16 opened 4 years ago

martinchai16 commented 4 years ago

Question: Hi sir, is it possible for you to add Chinese to the program, please?

adrienball commented 4 years ago

Hi @martinchai16, unfortunately we cannot allocate time for this at the moment. Here are some things you can try in order to get something working in Chinese:

1) Find a way to tokenize your data so that the utterances in the dataset contain whitespace-separated tokens (IIRC written Chinese does not put spaces between words). You can probably find a library that does this for you.
2) Change the language of your dataset to English ("en"), which means the NLU will treat your data as if it were English.
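The two steps above can be sketched as follows. This is a minimal, untested-against-Snips illustration, not an official recipe. It assumes the dataset is the JSON dataset loaded into a dict, with utterance chunks under `intents -> <name> -> utterances -> data -> text` and slot chunks marked by an `entity` key. The helper names (`char_tokenize`, `segment`, `prepare_utterance`, `prepare_dataset`) are hypothetical, and the character-level fallback is only a placeholder for a real Chinese word segmenter passed in as `tokenize`.

```python
import copy


def char_tokenize(text):
    """Naive fallback: treat every non-space character as one token.

    A real word segmenter should give better results; any callable that
    maps a string to a list of tokens can be passed in instead.
    """
    return [ch for ch in text if not ch.isspace()]


def segment(text, tokenize=char_tokenize):
    """Return `text` rewritten as whitespace-separated tokens."""
    return " ".join(tokenize(text))


def prepare_utterance(chunks, tokenize=char_tokenize):
    """Segment each chunk and keep neighbouring chunks from fusing together."""
    prepared = []
    for chunk in chunks:
        new_chunk = dict(chunk)
        new_chunk["text"] = segment(chunk["text"], tokenize)
        if prepared:
            previous = prepared[-1]
            if "entity" not in previous:
                # Put the separator on the plain-text side, not in the slot value.
                previous["text"] += " "
            elif "entity" not in new_chunk:
                new_chunk["text"] = " " + new_chunk["text"]
            else:
                # Two adjacent slot chunks: separate them with a plain chunk.
                prepared.append({"text": " "})
        prepared.append(new_chunk)
    return prepared


def prepare_dataset(dataset, tokenize=char_tokenize):
    """Return a copy of `dataset` with segmented utterances and language "en"."""
    result = copy.deepcopy(dataset)
    for intent in result.get("intents", {}).values():
        for utterance in intent.get("utterances", []):
            utterance["data"] = prepare_utterance(utterance.get("data", []), tokenize)
    result["language"] = "en"  # step 2: let the NLU treat the data as English
    return result
```

The same `segment` call would also have to be applied to each input sentence before parsing, so that the engine sees text tokenized the same way as the training data. Entity values and synonyms in the dataset are not touched by this sketch and would likely need the same treatment.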

At this point, you may have something that works OK. If you want to go a bit further, you can try updating the English resources, such as the gazetteers and the list of stop words, to provide Chinese values.

I hope this helps.