Aleph-Alpha / magma

MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multilingual models from Aleph Alpha, check out our website.
MIT License
469 stars, 55 forks

How are the n-shot VQA examples selected? #48

Open ys-zong opened 10 months ago

ys-zong commented 10 months ago

Hi, thanks for the nice work! For the n-shot VQA experiments, how did you select the support set examples? More specifically, did you use the same fixed set of n demonstration examples for all queries, or did you select a different set of n demonstration examples for each query question? Many thanks!
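
To make the two alternatives in the question concrete, here is a minimal sketch of both selection strategies. It is not taken from the MAGMA codebase and does not claim to show how the authors ran their experiments; the function names, the `pool` of (question, answer) pairs, and the seeding scheme are all hypothetical, chosen only for illustration.

```python
import random

# Hypothetical pool of candidate demonstrations, e.g. drawn from a VQA training split.
pool = [(f"question {i}", f"answer {i}") for i in range(100)]


def fixed_support(pool, n, seed=0):
    """Option A: sample one support set and reuse it for every query."""
    rng = random.Random(seed)
    return rng.sample(pool, n)


def per_query_support(pool, n, query_index, seed=0):
    """Option B: sample a separate support set for each query.

    Seeding with the query index keeps each query's draw reproducible.
    """
    rng = random.Random(seed * 1_000_003 + query_index)
    return rng.sample(pool, n)


# Option A: the same 4 demonstrations are prepended to every query.
shared_shots = fixed_support(pool, n=4)

# Option B: each query gets its own 4 demonstrations.
shots_for_query_0 = per_query_support(pool, n=4, query_index=0)
shots_for_query_1 = per_query_support(pool, n=4, query_index=1)
```

With a fixed set, reported accuracy depends on one particular draw of demonstrations; with per-query sampling, that dependence is averaged over many draws. Which of the two applies to the reported numbers is exactly what this issue asks.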