bityangke closed this issue 8 years ago
@bityangke You can do the same job using NLTK's word2vec, see this. As long as there is a function word_embeddings that takes a word and returns its embedding vector, this will work.
@honnibal might be able to help you with the installation of spacy.en.
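To make the "any word_embeddings function will work" point concrete, here is a minimal sketch that builds such a function from plain-text vectors. It assumes one word per line followed by its float components (GloVe-style text). The names build_word_embeddings and skip_header, and the choice to return a zero vector for unknown words, are illustrative assumptions and do not come from the project.

```python
from typing import Callable, Dict, Iterable, List


def build_word_embeddings(
    lines: Iterable[str], skip_header: bool = False
) -> Callable[[str], List[float]]:
    """Build a word -> vector lookup from lines like "<word> <v1> ... <vN>".

    Set skip_header=True for word2vec text files, whose first line holds
    "<vocab_size> <dimension>" instead of a word vector.
    """
    table: Dict[str, List[float]] = {}
    for index, line in enumerate(lines):
        if skip_header and index == 0:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue  # ignore blank or malformed lines
        table[parts[0]] = [float(x) for x in parts[1:]]

    # Dimension of the stored vectors (0 if nothing was loaded).
    dim = len(next(iter(table.values()))) if table else 0

    def word_embeddings(word: str) -> List[float]:
        # Hypothetical policy: unknown words map to a zero vector.
        return table.get(word, [0.0] * dim)

    return word_embeddings
```

With a real vector file you would pass the open file object as lines; any other backend (gensim, spaCy) can be wrapped the same way as long as it exposes the same call signature.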
Maybe try again? Yesterday we moved hosts, so it's possible the DNS propagation interfered with your transfer.
Btw, I think that NLTK word2vec tutorial describes training the word vectors, not using them?
You are right, that tutorial describes training. To use pre-trained word vectors, all you need are the following two lines:
import gensim
model = gensim.models.Word2Vec.load_word2vec_format('./model/GoogleNews-vectors-negative300.bin', binary=True)
(There are multiple sources from which to download these pre-trained word vectors.)
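A note for readers on newer gensim releases: the word2vec-format loader moved to KeyedVectors, and the Word2Vec.load_word2vec_format entry point was later removed. The sketch below shows the equivalent call and wraps the result in a word_embeddings function. The helper names load_pretrained_vectors and make_word_embeddings are assumptions made for illustration. The path is whatever file you downloaded.

```python
def load_pretrained_vectors(path: str):
    """Load binary word2vec-format vectors with a recent gensim release."""
    # Imported lazily so defining this helper does not require gensim.
    from gensim.models import KeyedVectors

    return KeyedVectors.load_word2vec_format(path, binary=True)


def make_word_embeddings(vectors):
    """Wrap any mapping-like vector store in a word -> vector function."""

    def word_embeddings(word):
        # Indexing raises KeyError for out-of-vocabulary words.
        return vectors[word]

    return word_embeddings
```

Usage would look like word_embeddings = make_word_embeddings(load_pretrained_vectors('./model/GoogleNews-vectors-negative300.bin')), after which word_embeddings('cat') returns the vector for that word.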
Like @honnibal suggested, please try again. spaCy is a really good library.
Thanks very much, @iamaaditya and @honnibal, I have solved this problem. But when I run word_embeddings = spacy.load('en', vectors='en_glove_cc_300_1m_vectors') I get: RuntimeError: Language not supported: en_glove_cc_300_1m_vectors. I am now working on this problem.
Aaah, sadness! Sorry about this. This was a regression introduced in 0.101.0. It's fixed in 1.0 (out next week!)
The workaround is to add the following line when you import spaCy:
spacy.set_lang_class('en_glove_cc_300_1m_vectors', None)
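Put together with the load call quoted earlier in this thread, the workaround might look like the sketch below. It only combines the two calls exactly as they appear in this discussion and applies only to the affected spaCy releases (0.101.0 up to 1.0). The wrapper name load_english_with_glove is a hypothetical helper, and none of this has been checked against later spaCy versions.

```python
def load_english_with_glove():
    """Apply the set_lang_class workaround, then load English with GloVe vectors."""
    # Imported lazily so defining this helper does not require spaCy.
    import spacy

    # Workaround from this thread: register the vectors package name
    # so spacy.load does not reject it as an unsupported language.
    spacy.set_lang_class('en_glove_cc_300_1m_vectors', None)

    # Load call as quoted in the error report above.
    return spacy.load('en', vectors='en_glove_cc_300_1m_vectors')
```

The registration has to happen before spacy.load is called, which is why both calls are kept in one function.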
@honnibal Thanks very much!
It works!
At first, I tried to mirror how spacy.en is loaded:
spacy.set_lang_class(en.en_glove_cc_300_1m_vectors.lang, en.en_glove_cc_300_1m_vectors)
Haha!
@iamaaditya Thank you very much for your wonderful work! I have tried different pics and questions, and the results are perfect!
When I installed the spacy.en model using "sputnik --name spacy install en", the download was very slow and then failed, so I have not been able to install the model yet. Are there other ways I can do the same job?