microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

How to use Kosmos-2 for RAVEN-IQ #1370

Open ZjjConan opened 10 months ago

ZjjConan commented 10 months ago

Describe Model: I am using Kosmos-2 for the RAVEN-IQ evaluation described in Kosmos-1.

Questions:

  1. How should an instruction with multiple input images be used for IQ evaluation?
  2. Is there a detailed description or example showing how the prompt is constructed?
  3. I use the prompt presented in Kosmos-1, with the following details:

     Here are three images:[image]/exp-4/query-1.png[image]/exp-4/query-2.png[image]/exp-4/query-3.pngThe following image is:[image]/exp-4/answer-f.pngIt is correct?Yes

     I didn't get suitable results with this template, and I am not sure it is the correct way to combine an instruction with multiple input images.
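For reference, the template quoted above can be assembled programmatically. This is a minimal sketch that only reproduces the string from this post. It is not a confirmed input format for Kosmos-2. The function names `build_raven_prompt` and `build_candidate_prompts`, the `NUMBER_WORDS` table, and the idea of producing one prompt per candidate image are assumptions made for illustration. The `[image]` marker and the wording are copied from the prompt as posted.

```python
from typing import List

# Marker copied from the prompt in this post; whether Kosmos-2 expects
# this exact marker is not confirmed here.
IMAGE_TAG = "[image]"

# Hypothetical lookup for the number word in the instruction text.
NUMBER_WORDS = {3: "three", 4: "four", 8: "eight"}


def build_raven_prompt(query_paths: List[str], candidate_path: str, answer: str = "") -> str:
    """Concatenate the instruction, query images, candidate image and answer."""
    count = len(query_paths)
    word = NUMBER_WORDS.get(count, str(count))
    # Each image is written as the marker immediately followed by its path.
    context = "".join(f"{IMAGE_TAG}{path}" for path in query_paths)
    return (
        f"Here are {word} images:{context}"
        f"The following image is:{IMAGE_TAG}{candidate_path}"
        f"It is correct?{answer}"
    )


def build_candidate_prompts(query_paths: List[str], candidate_paths: List[str]) -> List[str]:
    """Build one prompt per candidate image, each ending in the answer "Yes"."""
    return [build_raven_prompt(query_paths, c, "Yes") for c in candidate_paths]


if __name__ == "__main__":
    queries = [f"/exp-4/query-{i}.png" for i in (1, 2, 3)]
    print(build_raven_prompt(queries, "/exp-4/answer-f.png", "Yes"))
```

Keeping the string construction in one function makes it easy to compare variants (for example, with or without spaces around the image markers) when checking which layout the model responds to. How each prompt is then scored by the model is outside this sketch.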

Thanks a lot.

jl-hk commented 6 months ago

I am looking into this as well and would like to know the right prompt format for multiple input images.