Hi Dr. Kim,
It seems the released VQA checkpoint is not the correct one.
I ran the evaluation with the provided checkpoint weights on the NLVR and VQA datasets. The evaluation score on NLVR matches the result in the paper, but the VQA scores on both test-dev and test-standard were around 15 (the paper reports 87.44). So we suspect that the wrong checkpoint file may have been uploaded for VQA.
Thank you