jbdel closed this issue 4 years ago
Hi,
the model that achieved 72.80% was trained with more powerful and complex visual features than the original bottom-up-attention features. Because these visual features are very large (up to about 250 GB) and therefore inconvenient to download, we do not provide them online. The features we used are described in our slides on the VQA workshop website.
Hello.
First of all, thank you for this repo.
In the VQA Challenge 2018, you managed to reach 72.80% on the test-dev split. In the pretrained section, only a model with 70.82% accuracy is available.
What do you think differs between the two models that accounts for the roughly 2% gap?
Thank you very much in advance.