ylsung / VL_adapter

PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)
MIT License

How did you feed each VQA v2 question and its answer list to VL-T5 or VL-BART? #17

Open sanyalsunny111 opened 1 year ago

sanyalsunny111 commented 1 year ago

Hey Authors,

Thank you for the repo, @ylsung.

Can you please explain how you send a question and its paired answers to the model? Each question has multiple answers, but from the code it seems you are selecting one random answer (https://github.com/ylsung/VL_adapter/blob/545fcbbdbbaec4c442de35567f6ae477ff4e8265/VL-T5/src/vqa_data.py#L229). Can you elaborate on your approach a bit?

ylsung commented 1 year ago

To let the model learn to generate all plausible answers for a question, we randomly select one answer from the answer list at each training step, for simplicity. Over the course of training, this also lets us make full use of all the labels.
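
For readers skimming the thread, here is a minimal sketch of what that sampling could look like inside a dataset's `__getitem__`. This is illustrative only, not the repo's code: the `vqa:` task prefix and uniform sampling are assumptions (the actual logic is at the `vqa_data.py` line linked above, and may weight answers by their annotation scores).

```python
import random

def build_training_pair(question: str, answers: list[str]) -> tuple[str, str]:
    """Sample one ground-truth answer as the generation target.

    VQA v2 provides multiple annotator answers per question. Sampling a
    different answer each time the example is drawn exposes the model to
    every plausible answer over the course of training.
    """
    # NOTE: hypothetical helper for illustration; see VL-T5/src/vqa_data.py
    # for the repo's actual implementation.
    source = f"vqa: {question}"      # text-to-text style input (assumed prefix)
    target = random.choice(answers)  # one answer sampled uniformly per draw
    return source, target

# Example: the same question can yield different targets across epochs.
src, tgt = build_training_pair(
    "What color is the cat?",
    ["black", "black", "dark", "black and white"],
)
print(src, "->", tgt)
```

Because frequent answers appear more often in the per-question list, sampling from the raw list already biases the targets toward higher-agreement answers.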