MILVLG / openvqa

A lightweight, scalable, and general framework for visual question answering research
Apache License 2.0

From 72.80% to 70.82% accuracy for single model (VQA-v2) #58

jbdel closed this issue 4 years ago

jbdel commented 4 years ago

Hello.

First of all, thank you for this repo.

For the VQA Challenge 2018, you managed to reach 72.80% accuracy on the test-dev split. In the pretrained models section, only a model with 70.82% accuracy is available.

What do you think differs between the two models that accounts for the roughly 2% drop?

Thank you very much in advance.

MIL-VLG commented 4 years ago

Hi,

The model that achieved 72.80% was trained with more powerful and complex visual features than the original bottom-up-attention features. Because these visual features are very large (up to about 250 GB) and therefore inconvenient to download, we do not provide them online. The features are described in our slides on the VQA workshop website.