3Top / word2vec-api

Simple web service providing a word embedding model
http://www.3top.com

Encoding issue #17

Open mquillot opened 7 years ago

mquillot commented 7 years ago

Hi!

Maybe you could add an option when launching the script: unicode_errors='....'

gensim lets you set this parameter when loading the model. So that users don't have to edit the script themselves, maybe you could let them pass this option.

Line of code :

model = models.Word2Vec.load_word2vec_format(inputfile, binary=$binary, unicode_errors=$error) 
(with real values in place of $binary and $error)

Without it, I get some errors with my model.

Thanks a lot for your work. Bye ;)
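A minimal sketch of how such an option could be wired up, assuming the script parses its arguments with argparse. The flag names (`--model`, `--binary`, `--unicode-errors`) and the helper functions are hypothetical and are not taken from the actual script. Only the `binary` and `unicode_errors` keyword arguments come from gensim's loader, and the three choices listed are standard Python decoding error handlers.

```python
import argparse


def build_parser():
    """Command-line parser with a hypothetical --unicode-errors option."""
    parser = argparse.ArgumentParser(description="Load a word2vec model")
    parser.add_argument("--model", required=True, help="path to the model file")
    parser.add_argument("--binary", action="store_true",
                        help="set if the model file is in binary format")
    parser.add_argument("--unicode-errors", default="strict",
                        choices=["strict", "ignore", "replace"],
                        help="how to handle undecodable bytes in the vocabulary")
    return parser


def loader_kwargs(args):
    """Translate parsed options into keyword arguments for the gensim loader."""
    return {"binary": args.binary, "unicode_errors": args.unicode_errors}


def load_model(args):
    # Imported here so the option handling can be tried without gensim installed.
    # Recent gensim versions expose the loader on KeyedVectors; the
    # models.Word2Vec.load_word2vec_format call above is the older spelling.
    from gensim.models import KeyedVectors
    return KeyedVectors.load_word2vec_format(args.model, **loader_kwargs(args))
```

With this in place, a model whose vocabulary contains invalid UTF-8 bytes could be loaded by passing `--unicode-errors ignore` (or `replace`) on the command line, while the default `strict` keeps the current behaviour.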

lechatpito commented 7 years ago

Good idea. Feel free to submit a PR if I take too long to do it 😉