williamcfrancis / Visual-Question-Answering-using-Stacked-Attention-Networks

PyTorch implementation of VQA using Stacked Attention Networks: a multimodal architecture that encodes the image with a CNN and the question with an LSTM, then applies stacked attention layers for improved accuracy (54.82%). Includes visualization of the attention layers. Contributions welcome. Uses the VQA v2.0 dataset.
Apache License 2.0

Trained model availability? #2

Open shruum opened 2 months ago

shruum commented 2 months ago

Hi, thanks for the implementation. Do you have a trained model available for testing? If so, please let me know where I can download it. Thank you!

williamcfrancis commented 2 months ago

I don't have the trained models anymore. You'll have to train it yourself.

aditya-patil-00 commented 2 months ago

FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/MyDrive/vqa/dataset/Resized_Images/train2014/COCO_train2014_000000402655.jpg'

What images do I need to add here? The error occurs at data_loader['train'].

williamcfrancis commented 2 months ago

You'll need to set up the dataset. Follow the instructions in this repository: https://github.com/tbmoon/basic_vqa
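
For reference, the missing files are the resized MS COCO train2014 images that the data loader expects under `Resized_Images/train2014`. Below is a minimal sketch of the resizing step, assuming a 224x224 target size (the size used by the resize script in tbmoon/basic_vqa); the directory names and constants here are illustrative, not the repository's exact paths.

```python
import os
from PIL import Image

# Illustrative paths -- adjust to match your own dataset layout.
INPUT_DIR = "dataset/Images/train2014"            # original MS COCO train2014 images
OUTPUT_DIR = "dataset/Resized_Images/train2014"   # directory the data loader looks in
SIZE = (224, 224)                                  # assumed target size

def resize_images(input_dir, output_dir, size):
    """Resize every image in input_dir and save it under output_dir with the same filename."""
    os.makedirs(output_dir, exist_ok=True)
    for name in os.listdir(input_dir):
        if not name.lower().endswith((".jpg", ".jpeg", ".png")):
            continue
        with Image.open(os.path.join(input_dir, name)) as img:
            img.convert("RGB").resize(size, Image.LANCZOS).save(
                os.path.join(output_dir, name))

if __name__ == "__main__":
    resize_images(INPUT_DIR, OUTPUT_DIR, SIZE)
```

Once the resized images exist at the path the loader uses (e.g. `.../vqa/dataset/Resized_Images/train2014/`), the FileNotFoundError at `data_loader['train']` should go away.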