salesforce / LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence
BSD 3-Clause "New" or "Revised" License

What is the prompt template used by InstructBlip_flant5xl on msvd_qa and msrvtt_qa? #333

Open tgyy1995 opened 1 year ago

tgyy1995 commented 1 year ago

I had some trouble reproducing the InstructBLIP results on the msvd_qa and msrvtt_qa datasets. Could you please tell me which prompt template and hyperparameters were used for these datasets? It would also help if you could share the prompt templates for all datasets. Thank you.

guozix commented 1 year ago

I'm trying to replicate the InstructBLIP results on MSVD-QA too. I noticed that Appendix E of the InstructBLIP paper provides a rather brief prompt for MSVD and MSRVTT: "Question: {} Short answer:"
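For anyone trying this, here is a minimal sketch of how that template could be filled in for a single question. The helper name `build_prompt` and the sample question are made up for illustration; this is not a LAVIS function, and it is not confirmed to be how the authors built their prompts.

```python
# Template as quoted from Appendix E of the InstructBLIP paper (see above).
PROMPT_TEMPLATE = "Question: {} Short answer:"


def build_prompt(question: str) -> str:
    """Fill the template with one question.

    Hypothetical helper for illustration only, not part of LAVIS.
    """
    # Strip surrounding whitespace so the prompt has single spaces only.
    return PROMPT_TEMPLATE.format(question.strip())


# Made-up example question, not taken from the dataset.
prompt = build_prompt("what is the man doing?")
print(prompt)
```

The resulting string would then be passed as the text prompt alongside the video frames; how the frames are sampled and fed to the model is not covered by this sketch.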

@tgyy1995 By the way, how should the results on MSVD be evaluated? The MSVD label seems to be one of the 2423 options in qa_ans2label.json. Is the LLM output counted as correct only when it predicts that exact single word?
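To make the question concrete, this is a sketch of the strict exact-match scoring I have in mind. It is only an assumption about the protocol, not the confirmed InstructBLIP evaluation: the normalization steps (lowercasing, stripping punctuation) and the function names are my own choices, and the sample predictions are made up.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower().strip()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def exact_match_accuracy(predictions, answers):
    """Fraction of predictions equal to the ground-truth answer after normalization.

    Assumed scoring rule for illustration; the official protocol may differ.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not predictions:
        return 0.0
    correct = sum(normalize(p) == normalize(a) for p, a in zip(predictions, answers))
    return correct / len(predictions)


# Made-up outputs: "Dog." matches "dog" after normalization, "a cat" does not match "cat".
print(exact_match_accuracy(["Dog.", "a cat"], ["dog", "cat"]))
```

Under this rule a prediction like "a cat" is counted wrong even though it is semantically right, which is exactly why I am asking whether the single-word match is required.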

DoggyLu commented 1 year ago

Hello, could you tell me how to reproduce the InstructBLIP results on the msrvtt_qa dataset? For example, could you share the eval settings YAML file?

DarkLighter97 commented 1 year ago

> hello,could you tell me how to reproduce InstructBlip model results on the msrvtt_qa datasets, for example,could you show me the eval setting yaml file

I have the same question about how to reproduce InstructBLIP on the msrvtt_qa dataset. Have you solved it?