zengyan-97 / X2-VLM

All-In-One VLM: Image + Video + Transfer to Other Languages / Domains (TPAMI 2023)
BSD 3-Clause "New" or "Revised" License

VQA generate form #11

Open mabravo641 opened 10 months ago

mabravo641 commented 10 months ago

Hi, thanks for your work and for publicly releasing the code.

I have checked your code and could not find a generate function for the VQA model. I want to input new questions and generate answers that are independent of the ground truth; the current VQA evaluation instead models the task as ranking at inference. I also checked the captioning task, but it does not take a question as input when generating the caption. So it is not clear to me where to feed in the question, or the merged question+image embedding, for the generative task.
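To make the distinction concrete, here is a minimal sketch of ranking versus free-form generation. It is not taken from the X2-VLM code base: the toy log-probability table and the names `next_token_logprobs`, `score_answer`, `rank_answers`, and `greedy_generate` are all hypothetical, and the table stands in for a text decoder conditioned on the fused question+image features.

```python
import math

# Hypothetical toy "decoder": log-probabilities of the next token given the
# tokens produced so far. A real model would compute these from a text decoder
# conditioned on the question+image representation.
TOY_NEXT_TOKEN_LOGPROBS = {
    (): {"two": math.log(0.6), "red": math.log(0.3), "<eos>": math.log(0.1)},
    ("two",): {"dogs": math.log(0.7), "cats": math.log(0.2), "<eos>": math.log(0.1)},
    ("two", "dogs"): {"<eos>": 0.0},
    ("two", "cats"): {"<eos>": 0.0},
    ("red",): {"<eos>": math.log(0.9), "car": math.log(0.1)},
    ("red", "car"): {"<eos>": 0.0},
}


def next_token_logprobs(prefix):
    """Return the toy next-token distribution for a prefix (assumed interface)."""
    return TOY_NEXT_TOKEN_LOGPROBS.get(tuple(prefix), {"<eos>": 0.0})


def score_answer(tokens):
    """Sum the log-probabilities of a full answer, including the end token."""
    total = 0.0
    prefix = []
    for tok in list(tokens) + ["<eos>"]:
        total += next_token_logprobs(prefix).get(tok, float("-inf"))
        prefix.append(tok)
    return total


def rank_answers(candidates):
    """Ranking-style inference: pick the best answer from a fixed candidate list."""
    return max(candidates, key=lambda ans: score_answer(ans.split()))


def greedy_generate(max_len=10):
    """Generative inference: decode token by token, with no candidate list."""
    prefix = []
    for _ in range(max_len):
        dist = next_token_logprobs(prefix)
        tok = max(dist, key=dist.get)
        if tok == "<eos>":
            break
        prefix.append(tok)
    return " ".join(prefix)


if __name__ == "__main__":
    print(rank_answers(["red", "two cats"]))  # restricted to the given candidates
    print(greedy_generate())                  # free-form answer
```

In this toy setup, ranking can only return one of the supplied candidates, while greedy decoding can produce an answer that appears in no candidate list, which is the behavior I am asking about.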

Could you please provide a generative version of the VQA model's inference function (using text_encoder and text_decoder)?

Thank you, MA

zengyan-97 commented 8 months ago

Hi, do you still need it? I have the code, but I did not release it since I only use it for some private tasks.

JusQD commented 3 months ago

> Hi, do you still need it? I have the code, but I did not release it since I only use it for some private tasks.

Do you have instructions for evaluating your model on other datasets? Thanks.