Kyubyong / wordvectors

Pre-trained word vectors of 30+ languages
MIT License

korean language #26

Open trungluu91 opened 2 years ago

trungluu91 commented 2 years ago

I am trying to use the Korean pre-trained vectors with gensim 4.0.x. I tried both KeyedVectors.load('ko.bin') and KeyedVectors.load_word2vec_format('ko.bin'), but each call failed with the error "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte". What causes this error with the Korean pre-trained word2vec model?
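A byte 0x80 at position 0 is consistent with ko.bin being a Python pickle (pickle protocol 2 and later start with that byte) rather than a file in the word2vec text or binary format, whose first line is a plain-text header. The sketch below is one way to check this before choosing a loader. The helper name looks_like_pickle is hypothetical, and the check only inspects the first two bytes, so treat the result as a hint rather than proof.

```python
import pickle


def looks_like_pickle(path):
    """Return True if the file starts with a pickle protocol 2+ header.

    Pickle protocol 2 and later begin with the PROTO opcode (0x80)
    followed by one byte holding the protocol number.
    """
    with open(path, "rb") as f:
        head = f.read(2)
    return (
        len(head) == 2
        and head[0] == 0x80
        and 2 <= head[1] <= pickle.HIGHEST_PROTOCOL
    )


if __name__ == "__main__":
    import os

    # 'ko.bin' is the file name used in this thread; adjust the path as needed.
    if os.path.exists("ko.bin"):
        if looks_like_pickle("ko.bin"):
            print("ko.bin looks like a pickled object; try a .load() method.")
        else:
            print("ko.bin does not look like a pickle; it may be word2vec format.")
```

If the check returns True, load_word2vec_format is unlikely to be the right call, since it expects a header line with the vocabulary size and vector dimension.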

EugeneYoo commented 2 years ago

The Korean model worked fine for me at first, but I hit an error after moving my project to another environment. When I checked, the gensim version installed there was 4.0.x.

Try gensim 3.8.3. That may solve it.

JudePark96 commented 2 years ago

Try model = Word2Vec.load('ko.bin') on gensim 3.8.3. It works in my environment.
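The suggestions in this thread can be combined into a small loader that checks the installed gensim version before calling Word2Vec.load. This is a minimal sketch, not a verified fix: the helper names major_version and load_ko_model are hypothetical, and the warning threshold (major version 4 or later) is an assumption based only on the reports here that 4.0.x failed and 3.8.3 worked.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '3.8.3'."""
    return int(version_string.split(".")[0])


def load_ko_model(path="ko.bin"):
    """Load the model with Word2Vec.load, as suggested in this thread."""
    try:
        installed = version("gensim")
    except PackageNotFoundError:
        raise RuntimeError(
            "gensim is not installed; this thread suggests gensim 3.8.3"
        )
    if major_version(installed) >= 4:
        # Assumption from the reports above: 4.0.x failed, 3.8.3 worked.
        print(
            f"Warning: gensim {installed} is installed; "
            "this thread reports success with 3.8.3."
        )
    # Imported lazily so the version check runs even if the import is slow.
    from gensim.models import Word2Vec

    return Word2Vec.load(path)
```

To pin the version suggested above, one option is to install it with pip install gensim==3.8.3 in a separate virtual environment, then call load_ko_model('ko.bin').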

trungluu91 commented 2 years ago

Thanks, guys, for the guidance.