About the evaluation code.

Q-Future / Q-Instruct

②[CVPR 2024] Low-level visual instruction tuning, with a 200K dataset and a model zoo for fine-tuned checkpoints.

https://q-future.github.io/Q-Instruct/

Closed: Yangr116 closed this issue 11 months ago

Yangr116 commented 11 months ago

Hi, thanks for your insightful work. I would like to know why you use the last logits to calculate the score.

teowu commented 11 months ago

Hi Rui,

This is the logic as proposed by Q-Bench.

For Q-Instruct, since IQA is not what it was pre-trained for, we follow the same evaluation strategy.

Yangr116 commented 11 months ago

Thanks for your quick reply. According to this design, does that mean the score token equals the eos token?

teowu commented 11 months ago

Not quite. It is actually the first token the MLLM generates in its response after the pre-set starting words.
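To illustrate the point: in a causal language model, the logits at the last input position are the prediction for the next token, which here is the first response token after the pre-set starting words, not the EOS token. The snippet below is a minimal sketch of that idea, not the repository's actual code. It assumes the score is a softmax over the logits of two candidate tokens (called "good" and "poor" here); the function name `quality_score`, the token ids, and the toy logits are all hypothetical.

```python
import math

def quality_score(logits_per_position, good_id, poor_id):
    """Softmax over two candidate-token logits at the last input position.

    logits_per_position: list of per-position logit lists, shape [seq_len][vocab].
    good_id / poor_id: hypothetical vocabulary ids of the two rating tokens.
    """
    # The last position predicts the first token after the prompt.
    last_logits = logits_per_position[-1]
    good, poor = last_logits[good_id], last_logits[poor_id]
    # Subtract the max for numerical stability before exponentiating.
    m = max(good, poor)
    e_good = math.exp(good - m)
    e_poor = math.exp(poor - m)
    return e_good / (e_good + e_poor)

# Toy logits for a 2-token prompt over a 4-token vocabulary (made-up numbers).
toy_logits = [
    [0.1, 0.2, 0.3, 0.4],
    [0.0, 2.0, 0.0, 1.0],
]
score = quality_score(toy_logits, good_id=1, poor_id=3)
print(round(score, 3))  # prints 0.731
```

With a real model, `logits_per_position` would come from a single forward pass over the prompt plus the starting words, so no generation loop is needed.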

Yangr116 commented 11 months ago

Got it. Thanks!