microsoft / TAP

TAP: Text-Aware Pre-training for Text-VQA and Text-Caption, CVPR 2021 (Oral)

Different --model for using `textvqa_tap_ocrcc_best.ckpt`? #14

Open priyamtejaswin opened 2 years ago

priyamtejaswin commented 2 years ago

Hello devs,

Thank you for publishing this work, and for sharing these resources!

I was trying to run the TextVQA evaluation code mentioned in the README. I can successfully run the following command using textvqa_tap_base_best.ckpt:

python tools/run.py --tasks vqa --datasets m4c_textvqa --model m4c_split --config configs/vqa/m4c_textvqa/tap_refine.yml --save_dir save/m4c_split_refine_test --run_type val --resume_file save/finetuned/textvqa_tap_base_best.ckpt

I believe this returns the results without the additional (OCR-CC) data. I think the checkpoint trained with OCR-CC is saved under save/finetuned/textvqa_tap_ocrcc_best.ckpt.

However, when I use the OCR-CC checkpoint, it fails while loading the checkpoint:

2022-03-10T21:56:35 INFO: Loading datasets
2022-03-10T21:56:37 INFO: Fetching fastText model for OCR processing
2022-03-10T21:56:37 INFO: Loading fasttext model now from /usr1/home/ptejaswi/TAP/pythia/.vector_cache/wiki.en.bin
2022-03-10T21:56:47 INFO: Finished loading fasttext model
2022-03-10T21:56:50 INFO: CUDA Device 0 is: GeForce GTX TITAN X
2022-03-10T21:56:54 INFO: Torch version is: 1.8.1+cu101
2022-03-10T21:56:54 INFO: Loading checkpoint
2022-03-10T21:56:55 ERROR: Error(s) in loading state_dict for M4C:
    Missing key(s) in state_dict: "text_bert.encoder.layer.0.attention.self.query.weight", "text_bert.encoder.layer.0.attention.self.query.bias", "text_bert.encoder.layer.0.attention.self.key.weight", "text_bert.encoder.layer.0.attention.self.key.bias", "text_bert.encoder.layer.0.attention.self.value.weight", "text_bert.encoder.layer.0.attention.self.value.bias", "text_bert.encoder.layer.0.attention.output.dense.weight", "text_bert.encoder.layer.0.attention.output.dense.bias", "text_bert.encoder.layer.0.attention.output.LayerNorm.weight", "text_bert.encoder.layer.0.attention.output.LayerNorm.bias", "text_bert.encoder.layer.0.intermediate.dense.weight", "text_bert.encoder.layer.0.intermediate.dense.bias", "text_bert.encoder.layer.0.output.dense.weight", "text_bert.encoder.layer.0.output.dense.bias", "text_bert.encoder.layer.0.output.LayerNorm.weight", "text_bert.encoder.layer.0.output.LayerNorm.bias", "text_bert.encoder.layer.1.attention.self.query.weight", "text_bert.encoder.layer.1.attention.self.query.bias", "text_bert.encoder.layer.1.attention.self.key.weight", "text_bert.encoder.layer.1.attention.self.key.bias", "text_bert.encoder.layer.1.attention.self.value.weight", "text_bert.encoder.layer.1.attention.self.value.bias", "text_bert.encoder.layer.1.attention.output.dense.weight", "text_bert.encoder.layer.1.attention.output.dense.bias", "text_bert.encoder.layer.1.attention.output.LayerNorm.weight", "text_bert.encoder.layer.1.attention.output.LayerNorm.bias", "text_bert.encoder.layer.1.intermediate.dense.weight", "text_bert.encoder.layer.1.intermediate.dense.bias", "text_bert.encoder.layer.1.output.dense.weight", "text_bert.encoder.layer.1.output.dense.bias", "text_bert.encoder.layer.1.output.LayerNorm.weight", "text_bert.encoder.layer.1.output.LayerNorm.bias", "text_bert.encoder.layer.2.attention.self.query.weight", "text_bert.encoder.layer.2.attention.self.query.bias", "text_bert.encoder.layer.2.attention.self.key.weight", "text_bert.encoder.layer.2.attention.self.key.bias", "text_bert.encoder.layer.2.attention.self.value.weight", "text_bert.encoder.layer.2.attention.self.value.bias", "text_bert.encoder.layer.2.attention.output.dense.weight", "text_bert.encoder.layer.2.attention.output.dense.bias", "text_bert.encoder.layer.2.attention.output.LayerNorm.weight", "text_bert.encoder.layer.2.attention.output.LayerNorm.bias", "text_bert.encoder.layer.2.intermediate.dense.weight", "text_bert.encoder.layer.2.intermediate.dense.bias", "text_bert.encoder.layer.2.output.dense.weight", "text_bert.encoder.layer.2.output.dense.bias", "text_bert.encoder.layer.2.output.LayerNorm.weight", "text_bert.encoder.layer.2.output.LayerNorm.bias".

Do I need to change the --model argument passed to run.py? At the moment it is --model m4c_split. This is the command to reproduce the above error:

python tools/run.py --tasks vqa --datasets m4c_textvqa --model m4c_split --config configs/vqa/m4c_textvqa/tap_refine.yml --save_dir save/m4c_orcc_refine_test --run_type val --resume_file save/finetuned/textvqa_tap_ocrcc_best.ckpt
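
For context on the traceback above: this is the standard strict load_state_dict mismatch, where the model built from the config declares TextBERT encoder layers that the checkpoint does not contain. A minimal, self-contained sketch of the same failure mode (not TAP code, just an illustration with a stand-in module name):

# Minimal sketch (not TAP code): a checkpoint saved from a model with 0 encoder
# layers cannot be strictly loaded into a model that the config builds with 3
# layers -- PyTorch reports the extra layers as missing keys.
import torch.nn as nn

def build_text_bert(num_layers):
    # Stand-in for TextBERT: a named stack of identical layers.
    return nn.ModuleDict(
        {"text_bert": nn.Sequential(*[nn.Linear(8, 8) for _ in range(num_layers)])}
    )

ckpt = build_text_bert(0).state_dict()   # checkpoint trained with 0 layers (TAP-OCRCC style)
model = build_text_bert(3)               # config still asks for 3 layers

try:
    model.load_state_dict(ckpt)          # strict=True by default
except RuntimeError as err:
    print(err)                           # Missing key(s) in state_dict: "text_bert.0.weight", ...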
zyang-ur commented 2 years ago

Hi @priyamtejaswin ,

Sorry for the confusion. TAP-OCRCC uses 0/12 layers instead of 3/4 layers (detailed in Table 5 of the paper), so the layer numbers in the config file need to be updated. I'll try to update the config file; in the meantime, you could change the layer numbers to 0/12 (3->0, 4->12) at the lines below and see if that solves the problem.

https://github.com/microsoft/TAP/blob/352891f93c75ac5d6b9ba141bbe831477dcdd807/configs/vqa/m4c_textvqa/tap_refine.yml#L57
https://github.com/microsoft/TAP/blob/352891f93c75ac5d6b9ba141bbe831477dcdd807/configs/vqa/m4c_textvqa/tap_refine.yml#L66
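
As a quick sanity check before editing the config, one can inspect the checkpoint itself to confirm which layers it was trained with. A minimal sketch, assuming a pythia-style checkpoint that stores the weights under a "model" key and module names text_bert / mmt as in M4C; adjust if the structure differs:

# Hypothetical diagnostic, not part of the TAP repo: list which TextBERT and
# MMT encoder layers are present in the OCR-CC checkpoint. For TAP-OCRCC we
# would expect no text_bert.encoder layers (0) and twelve mmt.encoder layers.
import torch

ckpt = torch.load("save/finetuned/textvqa_tap_ocrcc_best.ckpt", map_location="cpu")
state_dict = ckpt.get("model", ckpt)  # assumed pythia checkpoint layout

def layer_ids(prefix):
    # Collect the layer indices appearing under e.g. "text_bert.encoder.layer.<i>."
    return sorted({int(k[len(prefix):].split(".")[0])
                   for k in state_dict if k.startswith(prefix)})

print("text_bert layers:", layer_ids("text_bert.encoder.layer."))
print("mmt layers:      ", layer_ids("mmt.encoder.layer."))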