How to convert the MLLM response to Y/N labels?

BradyFU / Awesome-Multimodal-Large-Language-Models

Latest Advances on Multimodal Large Language Models


How to convert the MLLM response to Y/N labels? #86

Closed llyx97 closed 7 months ago

llyx97 commented 7 months ago

Hi, thanks for sharing this great work! I have a question about how MLLM responses are converted into Y/N labels. Could you please provide more details on how this conversion is implemented in MME?

BradyFU commented 7 months ago

Thanks for your interest in our work. We use the instruction "Please answer yes or no."

llyx97 commented 7 months ago

Thanks for the response.

However, MLLMs may not strictly follow the instruction to answer "yes" or "no". For example, an MLLM may respond "The photo is taken inside a greenhouse, ..." (taken from Fig. 4 in the MME paper). Could you explain in more detail how such responses are converted into Y/N labels?

Best regards,

BradyFU commented 7 months ago

In such a case, the model has failed to follow the simple instruction, so we judge that it has delivered a wrong answer. Thank you.
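For readers who want to reproduce this rule, here is a minimal sketch in Python. It is not the official MME evaluation script. The function names `response_to_label` and `is_correct` are hypothetical, and the choice to look only at the first word of the response is an assumption about what "following the instruction" means. Any response that does not start with "yes" or "no" is labelled "other" and scored as wrong.

```python
import re


def response_to_label(response: str) -> str:
    """Map a free-form model response to 'yes', 'no', or 'other'.

    Assumption: a response counts as following the instruction only if
    its first word is "yes" or "no" (case-insensitive).
    """
    text = response.strip().lower()
    # Take the leading run of letters, so "Yes," and "No." still match.
    match = re.match(r"[a-z]+", text)
    first_word = match.group(0) if match else ""
    return first_word if first_word in ("yes", "no") else "other"


def is_correct(response: str, ground_truth: str) -> bool:
    """Score a response; anything labelled 'other' is counted as wrong."""
    label = response_to_label(response)
    return label in ("yes", "no") and label == ground_truth.strip().lower()
```

Under this sketch, `is_correct("The photo is taken inside a greenhouse, ...", "yes")` is `False`, which matches the rule described above: a response that ignores the yes/no instruction is treated as a wrong answer regardless of its content.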

llyx97 commented 7 months ago

Got it. Many thanks.