zengyan-97 / X-VLM

X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
BSD 3-Clause "New" or "Revised" License

VQA: Limitations in questions and answers #25

Open fizahkhalid opened 1 year ago

fizahkhalid commented 1 year ago

I want my fine-tuned VQA model to be able to answer questions it was not trained on, and to provide answers that do not exist in the original answer list (the answers listed in the test JSON file).

Is there a limitation on the kinds of questions I can ask the model? If yes, how can I tweak the code to meet my needs?

zengyan-97 commented 1 year ago

Hi,

You need to modify the inference process of the VQA model.

Do not rank the candidate answers, as is done here: https://github.com/zengyan-97/X-VLM/blob/master/models/model_vqa.py#L144

Instead, make inference a real generation step. For example, you can refer to: https://github.com/zengyan-97/X-VLM/blob/master/models/model_captioning.py#L75
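
To illustrate the difference between the two inference modes, here is a minimal toy sketch. It is not X-VLM code: `TOY_DECODER`, `next_token_probs`, `rank_candidates`, and `greedy_generate` are hypothetical names, and the lookup table stands in for a real answer decoder. Ranking can only return an entry from the fixed answer list, while greedy generation builds the answer token by token and so can return a string that is not in the list.

```python
import math

# Hypothetical stand-in for an answer decoder: maps a prefix (tuple of tokens)
# to a distribution over the next token. A real model would compute this
# from the image and the question.
TOY_DECODER = {
    (): {"a": 0.6, "two": 0.3, "red": 0.1},
    ("a",): {"dog": 0.7, "cat": 0.2, "<eos>": 0.1},
    ("a", "dog"): {"<eos>": 0.9, "dog": 0.1},
    ("a", "cat"): {"<eos>": 1.0},
    ("two",): {"<eos>": 0.8, "dogs": 0.2},
    ("two", "dogs"): {"<eos>": 1.0},
    ("red",): {"<eos>": 1.0},
}


def next_token_probs(prefix):
    """Return next-token probabilities for a prefix (end the answer if unknown)."""
    return TOY_DECODER.get(tuple(prefix), {"<eos>": 1.0})


def sequence_log_prob(tokens):
    """Log-probability of a complete answer, including the end-of-sequence token."""
    total = 0.0
    prefix = []
    for tok in list(tokens) + ["<eos>"]:
        p = next_token_probs(prefix).get(tok, 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
        prefix.append(tok)
    return total


def rank_candidates(candidates):
    """Closed-set inference: pick the highest-scoring answer from a fixed list."""
    return max(candidates, key=lambda ans: sequence_log_prob(ans.split()))


def greedy_generate(max_len=10):
    """Open-ended inference: emit the most likely token at each step until <eos>."""
    prefix = []
    for _ in range(max_len):
        probs = next_token_probs(prefix)
        tok = max(probs, key=probs.get)
        if tok == "<eos>":
            break
        prefix.append(tok)
    return " ".join(prefix)


answer_list = ["two", "red"]
ranked = rank_candidates(answer_list)   # always one of answer_list
generated = greedy_generate()           # may fall outside answer_list
print(ranked, "|", generated)
```

In this toy setup the generated answer is not in `answer_list`, which is the behaviour asked about above. Generated answers are still limited by the decoder's vocabulary and by what it learned during fine-tuning.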