Reproduce of checkpoints

Dear authors:

I download the checkpoints Model checkpoints (~17G) and evaluate the model using the following code:

python tools/run.py --tasks vqa --datasets m4c_textvqa --model m4c_split --config configs/vqa/m4c_textvqa/tap_refine.yml --save_dir save/m4c_split_refine_test --run_type val --resume_file save/finetuned/textvqa_tap_base_best.ckpt

I got the following results:

2022-03-24T11:13:42 INFO: m4c_textvqa: full val:, 41000/24000, val/total_loss: 7.9873, val/m4c_textvqa/m4c_decoding_bce_with_mask: 7.9873, val/m4c_textvqa/textvqa_accuracy: 0.4413

And I found an error prompt during the evaluation:

Token indices sequence length is longer than the specified maximum sequence length for this model (599 > 512). Running this sequence through the model will result in indexing errors

In my opinion, the accuracy should be 0.4991 as shown in the following table:

What's wrong with my operations? Is there something to do with the error I encounter?

By the way, when I use the OCR-CC checkpoints: save/finetuned/textvqa_tap_ocrcc_best.ckpt, the accuracy is 0.4934 (which should be 0.5471), and I found the same error as mentioned above.

The GPU and PyTorch version is as following:

2022-03-24T11:09:34 INFO: CUDA Device 0 is: Tesla V100-SXM2-16GB 2022-03-24T11:09:37 INFO: Torch version is: 1.4.0

Hope to get your response

Thanks

microsoft / TAP

Reproduce of checkpoints #15