avisingh599 / visual-qa

[Reimplementation Antol et al 2015] Keras-based LSTM/CNN models for Visual Question Answering
https://avisingh599.github.io/deeplearning/visual-qa/
MIT License

How do I test with my own images? #7

Closed arushk1 closed 8 years ago

arushk1 commented 8 years ago

I have a VGG_feats.mat file obtained by running my own images through a VGGNet. I also have a txt file of question(s) about that image.

1) Do I need any more data to use your net to get answers? I don't need a separate word2vec net to calculate the features from my questions, right?
2) How do I use your net to get answers to my questions about the image?

avisingh599 commented 8 years ago

  1. No, you don't need any more data. Just use the same word vectors that you used while training.
  2. I suggest you try to understand the code in evaluateMLP.py (or evaluateLSTM.py). Once you understand that, the code can be trivially modified to test on your own image/question pairs. I plan to write a script to do this as soon as I get the time.

P.S. Do not use the default word vectors that come with spaCy. I suggest using Stanford's GloVe word vectors, since they give much better results. Here is a guide on replacing word vectors in spaCy: http://spacy.io/tutorials/load-new-word-vectors/

arushk1 commented 8 years ago

What do you mean by same word vectors? The same questions that are present in the training set?

avisingh599 commented 8 years ago

No. It means that the vector representing each word in the vocabulary should be the same in the train and the test set. Example: vector['what'] = [1.0 2.3 4.3 .... 0.9] -> this vector should be the same when training and when testing.
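
One way to check this requirement is to compare the vectors for a handful of words between the training and testing environments. The sketch below is only an illustration: the two dictionaries are hypothetical stand-ins for whatever word-to-vector lookup is loaded in each environment, and `mismatched_words` is not part of the repository.

```python
import numpy as np

def mismatched_words(train_vectors, test_vectors, words, atol=1e-6):
    """Return the words whose vectors differ between the two lookups."""
    mismatched = []
    for word in words:
        a = np.asarray(train_vectors[word], dtype=np.float64)
        b = np.asarray(test_vectors[word], dtype=np.float64)
        # Different dimensionality already means different vectors.
        if a.shape != b.shape or not np.allclose(a, b, atol=atol):
            mismatched.append(word)
    return mismatched

# Hypothetical toy lookups; real word vectors have far more dimensions.
train = {"what": [1.0, 2.3, 4.3, 0.9], "color": [0.1, 0.2, 0.3, 0.4]}
test_same = dict(train)
test_other = {"what": [1.0, 2.3, 4.3, 0.9], "color": [0.5, 0.2, 0.3, 0.4]}

print(mismatched_words(train, test_same, ["what", "color"]))   # []
print(mismatched_words(train, test_other, ["what", "color"]))  # ['color']
```

An empty list means the sampled words agree. Any listed word suggests that different vector files were loaded in the two environments.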

arushk1 commented 8 years ago

For that, I just need to include the statement nlp = English(), as in evaluateMLP.py, right?

avisingh599 commented 8 years ago

When you write that statement, spaCy loads whatever word vectors it has saved on the system. Just ensure that those vectors remain the same between training and testing.

avisingh599 commented 8 years ago

I have now released some pre-trained models and a demo script. However, the script only works with MS COCO images as of now.

arushk1 commented 8 years ago

In line model.load_weights(args.weights):

arush@arush:~/visual-qa/scripts$ python own_image.py
Ask a question about the image:Is there a ball?
Loading Word2vec
Loaded word2vec features
Traceback (most recent call last):
  File "own_image.py", line 47, in <module>
    main(q)
  File "own_image.py", line 28, in main
    model.load_weights(args.weights)
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 565, in load_weights
    self.layers[k].set_weights(weights)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 361, in set_weights
    self.layers[i].set_weights(weights[:nb_param])
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/containers.py", line 79, in set_weights
    self.layers[i].set_weights(weights[:nb_param])
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 122, in set_weights
    if p.eval().shape != w.shape:
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/graph.py", line 498, in eval
    self._fn_cache[inputs] = theano.function(inputs, self)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function.py", line 308, in function
    output_keys=output_keys)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 526, in pfunc
    output_keys=output_keys)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 1778, in orig_function
    defaults)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 1642, in create
    input_storage=input_storage_lists, storage_map=storage_map)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 690, in make_thunk
    storage_map=storage_map)[:3]
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/vm.py", line 1037, in make_all
    no_recycling))
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 918, in make_thunk
    no_recycling)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 836, in make_c_thunk
    output_storage=node_output_storage)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1209, in make_thunk
    keep_lock=keep_lock)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1147, in compile
    keep_lock=keep_lock)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1596, in cthunk_factory
    key = self.cmodule_key()
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1291, in cmodule_key
    compile_args=self.compile_args(),
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 959, in compile_args
    ret += c_compiler.compile_args()
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cmodule.py", line 1860, in compile_args
    native_lines = get_lines("%s -march=native -E -v -" % theano.config.cxx)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cmodule.py", line 1829, in get_lines
    shell=True)
  File "/usr/local/lib/python2.7/dist-packages/theano/misc/windows.py", line 36, in subprocess_Popen
    proc = subprocess.Popen(command, startupinfo=startupinfo, _params)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1223, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

avisingh599 commented 8 years ago

OSError: [Errno 12] Cannot allocate memory

You need more swap space or more RAM. I am posting your script here for reference.

import argparse
import random
from PIL import Image
import subprocess
from os import listdir
from os.path import isfile, join, expanduser

from keras.models import model_from_json

from spacy.en import English
import numpy as np
import scipy.io
from sklearn.externals import joblib

def main(img_name, question):

    parser = argparse.ArgumentParser()
    parser.add_argument('-model', type=str, default='../models/lstm_1_num_hidden_units_lstm_512_num_hidden_units_mlp_1024_num_hidden_layers_mlp_3.json')
    parser.add_argument('-weights', type=str, default='../models/lstm_1_num_hidden_units_lstm_512_num_hidden_units_mlp_1024_num_hidden_layers_mlp_3_epoch_070.hdf5')
    parser.add_argument('-sample_size', type=int, default=25)
    args = parser.parse_args()

    # Load the word vectors and the answer label encoder.
    nlp = English()
    print 'Loaded word2vec features'
    labelencoder = joblib.load('../models/labelencoder.pkl')

    # Rebuild the network from its JSON description, then load the weights.
    model = model_from_json(open(args.model).read())
    model.load_weights(args.weights)
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

    # loadmat does not expand '~', so expand it explicitly.
    vgg_model_path = expanduser('~/vgg_feats_' + img_name + '.mat')
    features_struct = scipy.io.loadmat(vgg_model_path)
    VGGfeatures = features_struct['feats']

    X_q = nlp(question)
    X_i = VGGfeatures

    X = [X_q, X_i]

    y_predict = model.predict_classes(X, verbose=0)
    return labelencoder.inverse_transform(y_predict)

arushk1 commented 8 years ago

So this script should work, right? I forgot to enable swap on my instance.

avisingh599 commented 8 years ago

I would think so. Just ensure all the dimensions are correct. And use GloVe word vectors if you are using my pre-trained models.

arushk1 commented 8 years ago

Yeah, GloVe vectors are loaded. They are embarrassingly slow to load, though, even on a relatively fast instance.

arushk1 commented 8 years ago

This is the code:

import argparse
import random
from PIL import Image
import subprocess
from os import listdir
from os.path import isfile, join

from keras.models import model_from_json

from spacy.en import English
import numpy as np
import scipy.io
from sklearn.externals import joblib

def main(question):

    parser = argparse.ArgumentParser()
    parser.add_argument('-model', type=str, default='../models/lstm_1_num_hidden_$
    parser.add_argument('-weights', type=str, default='../models/w.hdf5')
    parser.add_argument('-sample_size', type=int, default=25)
    args = parser.parse_args()
    print 'Loading Word2vec'
    nlp = English()
    print 'Loaded word2vec features'
    labelencoder = joblib.load('../models/labelencoder.pkl')
    print 'Loading Model'
    model = model_from_json(open(args.model).read())
    print 'Loading Weights'
    model.load_weights(args.weights)
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
    print 'Loading VGGfeats'
    vgg_model_path = '/home/arush/vgg_feats.mat'
    features_struct = scipy.io.loadmat(vgg_model_path)
    VGGfeatures = features_struct['feats']
    print "Loaded"

    X_q = nlp(question)
    X_i = VGGfeatures

    X = [X_q, X_i]

    y_predict = model.predict_classes(X, verbose=0)
    print labelencoder.inverse_transform(y_predict)

q = unicode(raw_input("Ask a question about the image:"))

main(q)

Here's the output:

Ask a question about the image:Is there a ball?
Loading Word2vec
Loaded word2vec features
Loading Model
Loading Weights
Loading VGGfeats
Loaded
Traceback (most recent call last):
  File "own_image.py", line 48, in <module>
    main(q)
  File "own_image.py", line 42, in main
    y_predict = model.predict_classes(X, verbose=0)
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 502, in predict_classes
    proba = self.predict(X, batch_size=batch_size, verbose=verbose)
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 493, in predict
    return self._predict_loop(self._predict, X, batch_size, verbose)[0]
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 249, in _predict_loop
    ins_batch = slice_X(ins, batch_ids)
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 55, in slice_X
    return [x[start] for x in X]
  File "spacy/tokens/doc.pyx", line 94, in spacy.tokens.doc.Doc.__getitem__ (spacy/tokens/doc.cpp:4758)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I can't figure out the problem.

avisingh599 commented 8 years ago

timesteps = len(nlp(q))
X_q = get_questions_tensor_timeseries([q], nlp, timesteps)
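
The two lines above replace the raw spaCy document with a numeric tensor. The traceback suggests that Keras was indexing directly into the `Doc` object, which is presumably why prediction failed. The following is a rough, self-contained sketch of the kind of tensor such a helper would build, with one row of word-vector values per token. It is not the repository's implementation: `questions_to_tensor`, the pre-tokenized input and the 3-dimensional toy lookup are assumptions made for illustration.

```python
import numpy as np

def questions_to_tensor(tokenized_questions, lookup, timesteps, dim):
    """Stack per-token vectors into shape (n_questions, timesteps, dim)."""
    tensor = np.zeros((len(tokenized_questions), timesteps, dim))
    for i, tokens in enumerate(tokenized_questions):
        # Tokens beyond `timesteps` are dropped; shorter questions stay zero-padded.
        for j, token in enumerate(tokens[:timesteps]):
            # Unknown tokens fall back to an all-zero vector.
            tensor[i, j, :] = lookup.get(token, np.zeros(dim))
    return tensor

# Hypothetical toy vectors; real word vectors are much wider.
toy_lookup = {
    "is": np.array([1.0, 0.0, 0.0]),
    "there": np.array([0.0, 1.0, 0.0]),
    "ball": np.array([0.0, 0.0, 1.0]),
}
question = ["is", "there", "a", "ball", "?"]

X_q = questions_to_tensor([question], toy_lookup, timesteps=len(question), dim=3)
print(X_q.shape)  # (1, 5, 3)
```

The leading axis of size 1 is the batch axis, so a single question still goes in as a batch of one.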

arushk1 commented 8 years ago

Wait, okay, that makes sense. But I don't need to call get_image_matrix for the VGG features, right? I have only one feature vector in that mat file.

avisingh599 commented 8 years ago

I guess you should not. However, before you run, ensure that the datatype and the dimensions are exactly the same as they are in the demo_batch.py file.

arushk1 commented 8 years ago

They are the same; I checked by manually inspecting the mat file. I get this error now:

Traceback (most recent call last):
  File "own_image.py", line 51, in <module>
    main(q)
  File "own_image.py", line 45, in main
    y_predict = model.predict_classes(X, verbose=0)
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 502, in predict_classes
    proba = self.predict(X, batch_size=batch_size, verbose=verbose)
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 493, in predict
    return self._predict_loop(self._predict, X, batch_size, verbose)[0]
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 251, in _predict_loop
    batch_outs = f(*ins_batch)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
ValueError: total size of new array must be unchanged
Apply node that caused the error: Reshape{2}(<TensorType(float64, matrix)>, MakeVector{dtype='int64'}.0)
Toposort index: 46
Inputs types: [TensorType(float64, matrix), TensorType(int64, vector)]
Inputs shapes: [(1, 1), (2,)]
Inputs strides: [(8, 8), (8,)]
Inputs values: [array([[ 0.]]), array([ 1, 4096])]
Outputs clients: [[Join(TensorConstant{1}, Subtensor{int64}.0, Reshape{2}.0)]]

Backtrace when the node is created:
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 439, in get_output
    return theano.tensor.reshape(X, new_shape)

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

avisingh599 commented 8 years ago

X_i = np.reshape(VGGfeatures, (1, 4096))
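
A small NumPy sketch of why this reshape would help, assuming the .mat file stored the features as a single 4096 x 1 column (an inference from the `(1, 1)` input shape in the traceback, not something stated in the thread). Keras slices its inputs along the first axis to form batches, so a column vector yields a batch holding a single number, whereas a 1 x 4096 row yields one full feature vector.

```python
import numpy as np

# Stand-in for features_struct['feats']: one image stored as a 4096 x 1 column.
feats = np.arange(4096, dtype=np.float64).reshape(4096, 1)

# Slicing the first (batch) axis of the column gives one scalar per "sample".
print(feats[0:1].shape)  # (1, 1)

# After the reshape, the first axis indexes images and the second holds features.
X_i = np.reshape(feats, (1, 4096))
print(X_i[0:1].shape)    # (1, 4096)
```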

arushk1 commented 8 years ago

It works with that. Thanks a lot!

arushk1 commented 8 years ago

I'm finalising the own-image script for a PR. It doesn't work with the latest version of the Keras API. Is there a fix?

Ask a question: What color is the ground?
Traceback (most recent call last):
  File "own-image.py", line 62, in <module>
    main()
  File "own-image.py", line 55, in main
    X_i = np.reshape(VGGfeatures, (1, 4096))
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 218, in reshape
    return reshape(newshape, order=order)
ValueError: total size of new array must be unchanged

avisingh599 commented 8 years ago

I will upgrade to Keras 0.3 soon and figure it out.

avisingh599 commented 8 years ago

BTW, the above seems like a numpy error, unrelated to Keras versions.
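
`np.reshape` raises this error whenever the array does not hold exactly as many values as the target shape, here 1 x 4096 = 4096. The thread does not say what the loaded array actually contained, so the helper below is only a hypothetical guard: it accepts a single 4096-value vector, picks one column from a 4096 x N matrix, and otherwise fails with a message that reports the offending shape. `single_image_features` and `FEATURE_DIM` are illustrative names, not part of the repository.

```python
import numpy as np

FEATURE_DIM = 4096  # feature length used in the reshape above

def single_image_features(feats, column=0):
    """Return a (1, FEATURE_DIM) array for one image, or raise a clear error."""
    feats = np.asarray(feats)
    if feats.size == FEATURE_DIM:
        # Exactly one feature vector, in whatever orientation it was stored.
        return feats.reshape(1, FEATURE_DIM)
    if feats.ndim == 2 and feats.shape[0] == FEATURE_DIM:
        # Assumed layout: one column per image; select the requested image.
        return feats[:, column].reshape(1, FEATURE_DIM)
    raise ValueError("expected %d values per image, got array of shape %s"
                     % (FEATURE_DIM, feats.shape))

print(single_image_features(np.zeros((4096, 1))).shape)            # (1, 4096)
print(single_image_features(np.zeros((4096, 3)), column=2).shape)  # (1, 4096)
```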

arushk1 commented 8 years ago

Yeah, I fixed it and submitted a PR.

avisingh599 commented 8 years ago

Can't see the PR.

arushk1 commented 8 years ago

Now?

avisingh599 commented 8 years ago

Merged.