You can just add your options to the prompt and use the model in an open-ended generation style. We will release mPLUG-Owl-2 soon; it is a better foundation model and supports multiple-choice questions better.
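As a rough illustration of the "options in the prompt" approach, here is a minimal sketch. The helper names `build_mc_prompt` and `parse_choice` are hypothetical and not part of mPLUG-Owl, and the prompt wording is only an assumption. The resulting prompt string would be passed, together with the image, to whatever generation call you already use.

```python
import re
from typing import List, Optional


def build_mc_prompt(question: str, choices: List[str]) -> str:
    """Append lettered options (A, B, C, ...) to the question text."""
    lines = [question, "Options:"]
    for i, choice in enumerate(choices):
        lines.append(f"{chr(ord('A') + i)}. {choice}")
    lines.append("Answer with the letter of the correct option.")
    return "\n".join(lines)


def parse_choice(generated: str, choices: List[str]) -> Optional[str]:
    """Map free-form generated text back to one of the choices, or None."""
    text = generated.strip()
    valid = "".join(chr(ord("A") + i) for i in range(len(choices)))
    # First try a leading option letter such as "B", "B.", "(B)" or "B)".
    match = re.match(rf"\(?([{valid}])\)?(?:[.):,]|\s*$)", text)
    if match:
        return choices[ord(match.group(1)) - ord("A")]
    # Otherwise fall back to finding the text of a choice in the output.
    lowered = text.lower()
    for choice in choices:
        if choice.lower() in lowered:
            return choice
    return None
```

The parsing is only a heuristic: open-ended generation does not guarantee that the model answers with one of the listed options, so `parse_choice` returns `None` when nothing matches.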
Hi, thanks for your great work.
I am curious about the analysis of mPLUG-Owl on multiple-choice VQA tasks. Specifically, I am looking for an API that takes an image, a prompt, and a list of choices (List[str]) and outputs the choice with the highest probability.
Like
Is there a good way to achieve the same functionality?
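For concreteness, a minimal sketch of the kind of interface described above. Everything here is an assumption: `rank_choices` and `score_fn` are hypothetical names, not part of mPLUG-Owl. `score_fn` stands in for whatever would return the log-likelihood of a choice given the image and prompt; the sketch only shows how such scores could be normalized and ranked.

```python
import math
from typing import Callable, List, Tuple


def rank_choices(
    score_fn: Callable[[str], float], choices: List[str]
) -> List[Tuple[str, float]]:
    """Return (choice, probability) pairs, highest probability first.

    score_fn is a hypothetical callable that returns the log-likelihood of
    a choice as the answer, already conditioned on the image and prompt.
    """
    log_scores = [score_fn(choice) for choice in choices]
    # Softmax over the choices, shifted by the max for numerical stability.
    top = max(log_scores)
    weights = [math.exp(score - top) for score in log_scores]
    total = sum(weights)
    probs = [weight / total for weight in weights]
    return sorted(zip(choices, probs), key=lambda pair: pair[1], reverse=True)
```

With a real model, `score_fn` would typically sum the token log-probabilities of the choice text, which tends to favor shorter choices unless the sum is length-normalized.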