How can I install the spacy.en model except the method provided?

iamaaditya / VQA_Demo

Visual Question Answering Demo on pretrained model

http://iamaaditya.github.io/2016/04/visual_question_answering_demo_notebook

MIT License


How can I install the spacy.en model except the method provided? #6

Closed: bityangke closed this issue 8 years ago

bityangke commented 8 years ago

When I tried to install the spacy.en model using "sputnik --name spacy install en", the download was very slow and then failed, so I still have not been able to install the model. Is there another way to do the same job?

iamaaditya commented 8 years ago

@bityangke You can do the same job using NLTK's word2vec (see this). As long as there is a function word_embeddings that takes a word and returns its embedding vector, this will work.
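As a rough illustration of that interface, here is a minimal sketch that wraps any word-to-vector table as a word_embeddings function. The names make_word_embeddings, toy_table, and embed_question, the tiny 3-dimensional vectors, and the zero-vector fallback for unknown words are all assumptions made for the example; they are not taken from the demo code.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def make_word_embeddings(
    table: Dict[str, Sequence[float]], dim: int
) -> Callable[[str], Vector]:
    """Wrap a word -> vector table as a word_embeddings(word) function."""

    def word_embeddings(word: str) -> Vector:
        # Lowercase the word before lookup (an assumption of this sketch).
        vec = table.get(word.lower())
        if vec is None:
            # Unknown words map to a zero vector of the right size.
            return [0.0] * dim
        return list(vec)

    return word_embeddings


# Toy table standing in for real pre-trained vectors (hypothetical values).
toy_table = {"what": [0.1, 0.2, 0.3], "color": [0.4, 0.5, 0.6]}
word_embeddings = make_word_embeddings(toy_table, dim=3)


def embed_question(question: str) -> List[Vector]:
    """Split on whitespace and embed each token in order."""
    return [word_embeddings(token) for token in question.split()]
```

Any backend (spaCy, gensim, or a plain dictionary) could sit behind the table argument, provided it can be queried by word.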

Maybe @honnibal might be able to help you with installation of spacy.en

honnibal commented 8 years ago

Maybe try again? Yesterday we moved hosts, so it's possible the DNS propagation interfered with your transfer.

Btw, I think that NLTK word2vec tutorial describes training the word vectors, not using them?

iamaaditya commented 8 years ago

You are right, that tutorial describes training. To use pre-trained word vectors, all you need are the following two lines:

import gensim

model = gensim.models.Word2Vec.load_word2vec_format('./model/GoogleNews-vectors-negative300.bin', binary=True)

(This pre-trained word vector file can be downloaded from multiple sources.)
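For anyone reading this later: in more recent gensim releases the same loader is exposed on KeyedVectors rather than on Word2Vec. The sketch below tries to cover both cases. The helper name load_pretrained_vectors is hypothetical, and it assumes gensim is installed and that the file exists at the path you pass in.

```python
def load_pretrained_vectors(path, binary=True):
    """Load word2vec-format vectors with whichever loader this gensim has."""
    # Imported lazily so that defining the helper does not require gensim.
    import gensim

    # Prefer KeyedVectors when the installed gensim provides it,
    # otherwise fall back to the older Word2Vec class method.
    keyed_vectors = getattr(gensim.models, "KeyedVectors", None)
    loader = keyed_vectors if keyed_vectors is not None else gensim.models.Word2Vec
    return loader.load_word2vec_format(path, binary=binary)
```

Usage would look like model = load_pretrained_vectors('./model/GoogleNews-vectors-negative300.bin'), after which model['word'] should return the vector for that word.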

As @honnibal suggested, please try again. spaCy is a really good library.

bityangke commented 8 years ago

Thanks very much, @iamaaditya and @honnibal, I have solved this problem. But when I run word_embeddings = spacy.load('en', vectors='en_glove_cc_300_1m_vectors'), I get: RuntimeError: Language not supported: en_glove_cc_300_1m_vectors. I am now working on this problem.

honnibal commented 8 years ago

Aaah, sadness! Sorry about this. This was a regression introduced in 0.101.0. It's fixed in 1.0 (out next week!)

The workaround is to add the following line when you import spaCy:

spacy.set_lang_class('en_glove_cc_300_1m_vectors', None)

bityangke commented 8 years ago

@honnibal Thanks very much! It works! At first, I tried something modeled on how spacy.en is loaded: spacy.set_lang_class(en.en_glove_cc_300_1m_vectors.lang, en.en_glove_cc_300_1m_vectors) Haha!

bityangke commented 8 years ago

@iamaaditya Thank you very much for your wonderful work! I have tried different pics and questions, and the results are perfect!