Closed — schopra8 closed this issue 3 weeks ago
Good question!
Actually, it's a mistake since I overlooked how Mistral adds the system prompt, but I think it has only a minor impact.
Based on my previous experiments, different prompts can work after instruction tuning. For example, I tried adding `<Image></Image>` in different locations, like

`[INST]<Image></Image>[/INST] [INST]Question[/INST]`

or

`[INST]<Image></Image> Question[/INST]`

The interesting thing is that the second prompt works better for conversation, but the first prompt works better for most QA evaluations 🤦
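For concreteness, the two layouts can be sketched as plain string templates. This is a hedged sketch, not the repo's actual templating code: the function names are hypothetical, and the real code may differ in whitespace and special tokens.

```python
# Sketch of the two [INST] layouts discussed above (hypothetical helpers;
# the actual VideoChat2/Mistral templating code may differ).
IMG = "<Image></Image>"

def layout_separate(question: str) -> str:
    # Variant 1: image placeholder and question in separate [INST] blocks.
    return f"[INST]{IMG}[/INST] [INST]{question}[/INST]"

def layout_joint(question: str) -> str:
    # Variant 2: image placeholder and question share one [INST] block.
    return f"[INST]{IMG} {question}[/INST]"
```

Calling `layout_joint("Question")` yields the second template above; the only difference between the variants is whether the question gets its own `[INST]` wrapper.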
Good to know -- thank you for the clarification!
It seems like "instructions" from VidChat2's dataset are used as "system prompts" for the LLM, while the "questions" from the dataset are used as "instructions" for the LLM.

For Mistral, system prompts are handled the same as other instructions. Concretely, the system prompt is typically included within the `[INST] ... [/INST]` tags: `[INST] {{system.prompt}} {{ prompt }} [/INST]`.

But looking at the codebase, it seems like VidChat2 places the "instructions" from the dataset outside the `[INST] ... [/INST]` tags: `{{system.prompt}} [INST]<Image></Image>[/INST] {{answer}}</s>`.

What's the intuition behind this?
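To make the contrast concrete, here is a minimal sketch of the two placements as I understand them. The function names are my own hypothetical labels, and the exact whitespace/token handling in the real code may differ.

```python
# Hedged sketch of the two system-prompt placements being compared
# (hypothetical helpers; not the repo's actual templating code).
def mistral_style(system_prompt: str, prompt: str) -> str:
    # Standard Mistral: the system prompt is folded into the first
    # [INST] block, alongside the user prompt.
    return f"[INST] {system_prompt} {prompt} [/INST]"

def vidchat2_style(system_prompt: str, image: str, answer: str) -> str:
    # VidChat2 (as read from the codebase): the instruction text sits
    # outside the [INST] tags, which wrap only the image placeholder.
    return f"{system_prompt} [INST]{image}[/INST] {answer}</s>"
```

In the first helper the model sees the system prompt inside the instruction span; in the second it sees it as bare prefix text before the `[INST]` block.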