Closed varunnrao closed 6 years ago
I have the same issue.
Yes, the problem is that for each token spaCy returns a 384-dim vector instead of 300. One quick fix is to take only the first 300 values, like `question_tensor[0,j,:] = tokens[j].vector[:300]`, since the VQA model expects a 300-length vector as word_feature_size.
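As a minimal sketch of that truncation fix (pure NumPy, with a made-up 384-dim array standing in for `tokens[j].vector`, so it runs without spaCy):

```python
import numpy as np

# Stand-in for a token vector from a newer spaCy model (384-dim
# instead of the 300 the VQA model expects as word_feature_size).
fake_token_vector = np.arange(384, dtype=np.float32)

question_tensor = np.zeros((1, 30, 300))

# Slicing to the first 300 components avoids the broadcast error,
# but note it throws away 84 values per token, so it is lossy.
question_tensor[0, 0, :] = fake_token_vector[:300]
```

Keep in mind this is a quick workaround, not a proper fix: the truncated vectors are no longer the embeddings the model was trained with.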
I'm getting this error if I reduce the size:
UserWarning: Trying to unpickle estimator LabelEncoder from version pre-0.18 when using version 0.19.1. This might lead to breaking code or invalid results. Use at your own risk.
Does anyone know what this is? Urgent help needed
I solved the same issue with the vector dimension, and also the UserWarning about the newer scikit-learn version (it is necessary to re-pickle the file with joblib.dump). After these changes I am not able to replicate your results with the downloaded pretrained models. The test image and question give the best answer as "30 % - electricity" instead of "train". All "what" questions result in a number answer, and "where" questions result in a yes/no answer. Can you tell me whether the dimension reduction should result in such a distortion?
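For the re-pickling step mentioned above, here is a self-contained sketch. A freshly fitted LabelEncoder and a temp file stand in for the repo's trained encoder file; the point is just that dumping it again under the currently installed scikit-learn makes the version warning go away on subsequent loads:

```python
import os
import tempfile

import joblib
from sklearn.preprocessing import LabelEncoder

# Stand-in for the repo's trained label encoder; in practice you would
# joblib.load the existing .pkl (which emits the version warning once).
le = LabelEncoder().fit(["train", "yes", "no"])

path = os.path.join(tempfile.mkdtemp(), "labelencoder.pkl")
joblib.dump(le, path)            # re-dump under the installed sklearn version
le_reloaded = joblib.load(path)  # future loads no longer trigger the warning
```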
Yes, it will. How do we fix this? Can anyone please help? Why is the vector size 384 when it should be 300?
I tried the following:

```python
word_embedding = spacy.load('en', vectors='en_glove_cc_300_1m_vectors')
tokens = word_embedding(question)
word_embeddings = word_embedding.vocab.vectors.resize((1000000, 300))
question_tensor = np.zeros((1, 30, 300))
for j in range(len(tokens)):
    question_tensor[0, j, :] = tokens[j].vector
return question_tensor
```
Even after resizing the vectors, the error is removed but it gives wrong answers. No idea what to do :/ I tried really hard but couldn't find anything online either.
I will take a look at the code. Could you tell me which versions of Keras and TensorFlow you are using, so that I can test it correctly?
Thanks and Regards Adi
I'm using the following versions:
Keras = 2.0.5
TensorFlow = 1.2.0
The issue is that spaCy updated its pretrained word-embedding models. Do the following to fix it. Download the 300-dimensional vector package:

```
python -m spacy download en_vectors_web_lg
```

Then change

```python
word_embeddings = spacy.load('en', vectors='en_glove_cc_300_1m_vectors')
```

to

```python
word_embeddings = spacy.load('en_vectors_web_lg')
```
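For anyone following along, here is a minimal sketch of the feature-building step once the vectors are 300-dim. `build_question_tensor` is a helper name I made up, and the spaCy calls are left commented out so the snippet runs without downloading the model:

```python
import numpy as np

def build_question_tensor(token_vectors, max_len=30, dim=300):
    """Pack per-token vectors into the (1, max_len, dim) tensor the VQA
    model expects, checking each vector really has `dim` entries."""
    tensor = np.zeros((1, max_len, dim))
    for j, vec in enumerate(token_vectors[:max_len]):
        if vec.shape != (dim,):
            raise ValueError(f"expected {dim}-dim vector, got {vec.shape}")
        tensor[0, j, :] = vec
    return tensor

# With en_vectors_web_lg the vectors are already 300-dim, so no slicing
# is needed (assumed usage, not tested here):
# import spacy
# nlp = spacy.load('en_vectors_web_lg')
# tokens = nlp(question)
# question_tensor = build_question_tensor([t.vector for t in tokens])
```

The explicit shape check makes a dimension mismatch fail loudly at the offending token instead of surfacing as a cryptic broadcast error.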
When I execute the following:

```python
model = gensim.models.KeyedVectors.load_word2vec_format('./data/GoogleNews-vectors-negative300.bin.gz',
                                                        binary=True)
```

I get an error saying:
ValueError: could not broadcast input array from shape (75) into shape (300)
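To see what that ValueError means, here is a minimal NumPy reproduction: a length-75 array cannot be broadcast into a length-300 slot, which is what happens when an embedding's width doesn't match what the code expects. (On the gensim side you can inspect `model.vector_size` after loading; a value other than 300 would suggest a wrong or corrupted vectors file, though I can't verify the cause here.)

```python
import numpy as np

question_tensor = np.zeros((1, 30, 300))
short_vector = np.zeros(75)  # stand-in for a mismatched embedding

try:
    # Fails: a (75,) array cannot fill a (300,) slot.
    question_tensor[0, 0, :] = short_vector
except ValueError as e:
    message = str(e)
```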
Can anyone help me? Thanks in advance!
Is there an issue with the spacy tensor?