Closed: MachinicGlitch closed this issue 2 years ago
Hi. First, what do you mean by finetuning VQA_X? We finetune the image captioning model on VQA_X. If I understand you correctly, you want to further finetune the model that was already finetuned on VQA_X?
vqaX_test_annot_full.json and vqaX_test_annot_exp.json are only meant for evaluating the explanations with the COCO captioning toolkit, as this is the format it expects. In the Train, Val and Test data loaders, the files that are loaded are vqaX_train.json, vqaX_val.json and vqaX_test.json. These are the main files loaded during training and validation.
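For reference, here is a minimal sketch of how a training-style entry might be written out, assuming the fields hinted at in this thread (question, answers, explanation, image_id, image_name) and that vqaX_train.json is a dict keyed by question id; please check the released vqaX_train.json for the exact schema before relying on this:

```python
import json

# Hypothetical example entry; field names are assumptions based on this
# thread, not the authoritative schema of vqaX_train.json.
train_entry = {
    "question": "What is the person doing?",
    "answers": [{"answer": "skateboarding"}],
    "explanation": ["he is riding a skateboard on a ramp"],
    "image_id": 123456,
    "image_name": "COCO_train2014_000000123456.jpg",
}

# Assumed layout: a dict mapping question id -> entry.
with open("my_vqaX_train.json", "w") as f:
    json.dump({"1": train_entry}, f, indent=2)

# The *_annot_*.json files follow the COCO captioning toolkit layout
# ({"annotations": [{"image_id": ..., "caption": ...}], ...}) and are
# only needed at evaluation time, not for training.
```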
Feel free to open this issue again if you have further questions!
Hello!
I am attempting to finetune the VQA_X model and am running into some confusion about the data required.
I currently have a dataset of images and captions prepared and formatted similarly to vqaX_test_annot_full.json and vqaX_test_annot_exp.json, with one-to-one image/annotation pairs, along with the file path of the JPEG file for each image.
Do I also need to prepare an additional set of data formatted like vqaX_val.json and vqaX_test.json, with answers, explanations, and the image_id and name, in order to finetune the model, or can I do so with only the dataset mentioned above?
Thanks