Kyubyong / wordvectors

Pre-trained word vectors of 30+ languages
MIT License

Loading embeddings #22

Open · Joseph94m opened this issue 5 years ago

Joseph94m commented 5 years ago

Hi,

I downloaded the French embeddings and extracted the zip file. How can I load these embeddings in Python and get back the vector for a given word, e.g. embedding("bonjour") -----> [0.2, -0.2, ...]?

Thanks

nvietsang commented 5 years ago

You can use gensim to load the .bin model:

from gensim.models import Word2Vec
model = Word2Vec.load("vi.bin")
model.wv['nhà']

This returns the embedding vector of the word "nhà" (Vietnamese, in this example). Remember to install the gensim library first.
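For the original question (French), a minimal sketch along the same lines, assuming the extracted archive contains a gensim Word2Vec model file named fr.bin (the actual file name may differ, and if the French vectors are fastText rather than word2vec, see the comments below):

```python
from gensim.models import Word2Vec

# Assumed file name, following the repo's naming pattern (vi.bin, zh.bin, ...)
model = Word2Vec.load("fr.bin")

def embedding(word):
    """Return the embedding vector for `word` (raises KeyError if it is out of vocabulary)."""
    return model.wv[word]

print(embedding("bonjour"))                      # e.g. array([ 0.2, -0.2, ...]) (values illustrative)
print(model.wv.most_similar("bonjour", topn=5))  # nearest neighbours, as a sanity check
```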

fadeawaygod commented 5 years ago

But it didn't work for the fastText models; below is my code:

from gensim.models import FastText

model_f = FastText.load("zh.bin")
v = model_f.wv['你好']

It throws an exception: _pickle.UnpicklingError: invalid load key, ','.

fadeawaygod commented 5 years ago

I fixed it by replacing load with load_fasttext_format.
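For anyone hitting the same error, a sketch of that fix, assuming gensim 3.x (where FastText.load_fasttext_format is still available; it was removed in gensim 4.x in favour of gensim.models.fasttext.load_facebook_model). The unpickling error presumably occurs because the fastText .bin files are in Facebook's native format rather than pickled gensim models:

```python
from gensim.models import FastText

# gensim 3.x: load a Facebook-format fastText .bin instead of a pickled gensim model
model_f = FastText.load_fasttext_format("zh.bin")
v = model_f.wv['你好']

# gensim 4.x equivalent (load_fasttext_format was removed):
# from gensim.models.fasttext import load_facebook_model
# model_f = load_facebook_model("zh.bin")
```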