MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multilingual models from Aleph Alpha check out our website https://app.aleph-alpha.com
Hi, thanks for the nice work! For the n-shot VQA experiments, how did you select the support set examples? More specifically, did you use the same fixed set of n demonstration examples for all queries, or did you select a different set of n demonstration examples for each query question? Many thanks!
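To make the two alternatives in the question concrete, here is a minimal sketch of both selection strategies. All names (`fixed_support_set`, `per_query_support_set`, the `pool` format) are hypothetical illustrations, not MAGMA's actual implementation:

```python
import random

def fixed_support_set(pool, n, seed=0):
    # Strategy A: sample one fixed set of n demonstrations
    # and reuse it for every query.
    rng = random.Random(seed)
    shots = rng.sample(pool, n)
    return lambda query: shots

def per_query_support_set(pool, n, seed=0):
    # Strategy B: resample n demonstrations independently
    # for each incoming query.
    rng = random.Random(seed)
    return lambda query: rng.sample(pool, n)

# Hypothetical pool of (image, question, answer) demonstration triples.
pool = [(f"image_{i}", f"q_{i}", f"a_{i}") for i in range(100)]

fixed = fixed_support_set(pool, n=4)
per_query = per_query_support_set(pool, n=4)

# Strategy A yields identical shots for different queries;
# Strategy B generally yields different shots per query.
assert fixed("query_1") == fixed("query_2")
```

The distinction matters for reporting: with per-query sampling, results average over many random support sets, whereas a fixed set can bias scores up or down depending on which demonstrations happen to be chosen.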