dandelin / ViLT

Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
Apache License 2.0

Checkpoint file for VQA might be wrong #56

Open jia2lin3yuan1 opened 2 years ago

jia2lin3yuan1 commented 2 years ago

Hi Dr. Kim,

The released VQA checkpoint does not appear to be the correct one.

I ran the evaluation with the provided checkpoint weights on the NLVR and VQA datasets. The NLVR score matches the result in the paper, but the VQA scores on both test-dev and test-standard were around 15 (the paper's result is 87.44). We therefore suspect that the wrong checkpoint file was uploaded for VQA.
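
For anyone who wants to check this on their side, here is a minimal sketch of one way to sanity-check a checkpoint: list its parameter names and see whether the expected task head is present. The helper names, the example key list, and the head prefixes `vqa_classifier` and `nlvr2_classifier` are assumptions made for illustration, not something confirmed from the uploaded files.

```python
from collections import Counter


def top_level_prefixes(keys):
    """Count parameter names by their first dotted component."""
    return Counter(k.split(".", 1)[0] for k in keys)


def has_head(keys, head_prefix):
    """Return True if any parameter name belongs to the given head prefix."""
    return any(k == head_prefix or k.startswith(head_prefix + ".") for k in keys)


# Hypothetical parameter names, made up for illustration only.
example_keys = [
    "text_embeddings.word_embeddings.weight",
    "transformer.blocks.0.attn.qkv.weight",
    "pooler.dense.weight",
    "nlvr2_classifier.0.weight",
]

# Show which top-level modules the (made-up) checkpoint contains.
print(top_level_prefixes(example_keys))

# The assumed VQA head prefix is absent from this made-up list.
print(has_head(example_keys, "vqa_classifier"))  # False
```

With a real file, the key list would presumably come from something like `torch.load(path, map_location="cpu")`, possibly under a `"state_dict"` entry if the file is a PyTorch Lightning checkpoint (also an assumption). A missing or differently shaped VQA head would be consistent with the low score, though it would not prove the cause.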

Thank you