Kyubyong / wordvectors

Pre-trained word vectors of 30+ languages
MIT License
2.21k stars 393 forks

What tokenizer for Bahasa? #7

Open kenyeung128 opened 7 years ago

kenyeung128 commented 7 years ago

Hi, may I ask which tokenizer you used for Bahasa (Indonesian)? Thanks.

Kyubyong commented 7 years ago

I didn't use any extra tokenizer for Indonesian because Indonesian words are already separated by spaces.
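
For illustration only, splitting on whitespace can be sketched as below. This is a minimal sketch, not the repository's actual preprocessing code; the function name `whitespace_tokenize` and the sample sentence are made up for the example, and any cleaning steps (lowercasing, punctuation removal) are deliberately left out.

```python
def whitespace_tokenize(text):
    """Split text into tokens on runs of whitespace (hypothetical helper)."""
    # str.split() with no arguments splits on any whitespace
    # and drops leading, trailing, and repeated separators.
    return text.split()


# Sample Indonesian sentence chosen for the example.
sentence = "Saya suka makan nasi goreng"
print(whitespace_tokenize(sentence))
# → ['Saya', 'suka', 'makan', 'nasi', 'goreng']
```

Punctuation stays attached to neighboring words with this approach, so a real pipeline would likely clean the text first.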

sathik11 commented 6 years ago

The link to the Bahasa fastText vector embeddings is broken. Could you upload them again?

Kyubyong commented 6 years ago

Actually, all the fastText vector links were broken, and I don't know why. I've just updated them. Thanks.