botpress / nlu

This repo contains all ML/NLU-related code written by Botpress for the NodeJS environment. This includes the Botpress Standalone NLU Server.

Hi! How to generate the language server files *.bpe.model and *.bin, e.g. for Chinese? Thanks! #54

Closed wuguokai closed 3 years ago

wuguokai commented 3 years ago

Hi! How can I generate the language .bpe.model and .bin files, for example for Chinese? Thanks!

franklevasseur commented 3 years ago

Hi @wuguokai,

As you've probably noticed, we host a few language models and tokenizer models at this URL. The supported languages are the following:

[image: list of supported languages]

Unfortunately, we don't currently support Chinese language models, but there is a workaround.

As stated in these two forum threads:

chinese-support turkish-nlp-about

you can download the desired language models here and the BPE models here.

I strongly suggest using the biggest available models (300 dimensions and a vocabulary size of 200,000) for performance reasons.
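Once downloaded, the files probably need to be renamed so the language server can pick them up. As a minimal sketch, assuming the server looks for names of the form `bp.<lang>.<dim>.bin` and `bp.<lang>.bpe.model` (the `bp.` prefix and the dimension segment are assumptions not confirmed in this thread, and `languageServerFileNames` is a hypothetical helper, not part of Botpress), the target names could be built like this:

```typescript
// Hypothetical helper, not part of Botpress.
// ASSUMPTION: the language server expects "bp.<lang>.<dim>.bin" for the
// embeddings and "bp.<lang>.bpe.model" for the tokenizer. Check the names
// of the officially hosted files before relying on this pattern.
interface ModelFileNames {
  embeddings: string; // the *.bin language model file
  tokenizer: string;  // the *.bpe.model tokenizer file
}

function languageServerFileNames(lang: string, dim: number): ModelFileNames {
  // Accept only short lowercase language codes such as "zh" or "tr".
  if (!/^[a-z]{2,3}$/.test(lang)) {
    throw new Error(`unexpected language code: ${lang}`);
  }
  // The dimension must be a positive integer, e.g. 300.
  if (!Number.isInteger(dim) || dim <= 0) {
    throw new Error(`unexpected dimension: ${dim}`);
  }
  return {
    embeddings: `bp.${lang}.${dim}.bin`,
    tokenizer: `bp.${lang}.bpe.model`,
  };
}
```

For Chinese with the 300-dimension model, `languageServerFileNames("zh", 300)` yields the two names to rename the downloaded files to.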

Let me know if you have any other questions,

François

franklevasseur commented 3 years ago

Hi again,

Based on this forum thread, I assume you found a solution to your problem.

For this reason, I'll close this issue, but feel free to reopen if needed.

François

wuguokai commented 3 years ago

@franklevasseur Yes! Thanks a lot!