About Table 2 BUTD+bal accuracy from which datasets

gqa-ood / GQA-OOD

GQA-OOD is a new dataset and benchmark for evaluating VQA models in out-of-distribution (OOD) settings.

About Table 2 BUTD+bal accuracy from which datasets #1

alice-cool closed this issue 2 years ago

alice-cool commented 3 years ago

Dear scholar, I enjoyed reading your paper, but I have a question about the results shown in Table 2.

The accuracy of about 60% seems too high to me. Which dataset was it computed on?

In the LXMERT paper, I found that the BUTD method reaches at most 52%.

CorentinKervadec commented 2 years ago

Dear @alice-cool,

Thank you for your interest in our paper 😃

The score difference comes from the two tables using different GQA splits. The scores in our Table 2 were computed on the GQA validation split, while the scores in Table 3 of the LXMERT paper were computed on the GQA testdev split.