hengyuan-hu / bottom-up-attention-vqa

An efficient PyTorch implementation of the winning entry of the 2017 VQA Challenge.
GNU General Public License v3.0

Run trained model on a single image #13

Closed brandonjabr closed 6 years ago

brandonjabr commented 6 years ago

I've successfully trained the model and can load the state dict from the .pth model into a new instance. Is there any way I can now test it on a new image/question, and see the response?

Thank you!

zengxianyu commented 6 years ago

@brandonjabr I think you'll need to run ResNet and Faster R-CNN yourself to extract the features for an image, then feed those features and your question into their model to get the result.
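To illustrate the flow described above (detector features plus a tokenized question in, answer logits out), here is a minimal sketch. The `ToyVQA` module is a hypothetical stand-in for the trained model loaded from the .pth file, and the shapes (36 regions × 2048-d features, 14-token questions, 3129 candidate answers) are assumptions based on the repo's defaults, not a reproduction of its API:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the trained VQA model; the real one is built
# in base_model.py and loaded from the saved .pth state dict.
class ToyVQA(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(20000, 300)    # word embeddings (toy vocab size)
        self.fc = nn.Linear(2048 + 300, 3129)    # fused features -> answer logits

    def forward(self, v, q):
        v_mean = v.mean(dim=1)                   # pool the region features
        q_mean = self.embed(q).mean(dim=1)       # pool the question embeddings
        return self.fc(torch.cat([v_mean, q_mean], dim=1))

model = ToyVQA().eval()
features = torch.randn(1, 36, 2048)              # Faster R-CNN features for one image
question = torch.randint(0, 20000, (1, 14))      # tokenized, padded question ids
with torch.no_grad():
    logits = model(features, question)           # answer scores for this image/question
print(logits.shape)                              # torch.Size([1, 3129])
```

The real model fuses the two modalities with attention rather than mean pooling; the point here is only the shape of the inputs and output.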

brandonjabr commented 6 years ago

@zengxianyu Thanks for your help. Unfortunately I'm still a bit stuck: I tried making a dataset with just one image and corresponding .json files containing a few questions for it (in the COCO .json format). So far I've been able to feed the trained model questions from these custom files, and it returns predictions as [1x3129] torch tensors.

How can I convert these tensors into the actual answers they represent?

jnhwkim commented 6 years ago

@brandonjabr see the generated Python dictionary in data/cache/trainval_label2ans.pkl.
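A minimal sketch of that lookup, with toy data standing in for the real pickle (as I understand it, `trainval_label2ans.pkl` maps each of the 3129 label indices to its answer string; the four-entry mapping below is purely illustrative):

```python
import pickle

# Toy stand-in for data/cache/trainval_label2ans.pkl: label index -> answer string.
label2ans = {0: "yes", 1: "no", 2: "2", 3: "white"}
payload = pickle.dumps(label2ans)
label2ans = pickle.loads(payload)     # in practice: pickle.load(open(path, 'rb'))

# `scores` stands in for one row of the model's [1x3129] prediction tensor;
# with a torch tensor, idx = pred.argmax(dim=1).item() does the same thing.
scores = [0.1, 0.3, 2.4, 0.2]
idx = max(range(len(scores)), key=scores.__getitem__)   # index of the top score
print(label2ans[idx])                                   # prints "2"
```

So the predicted answer is simply the entry at the argmax of the prediction tensor.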