Luodian / Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
https://otter-ntu.github.io/
MIT License
3.52k stars 241 forks source link

about the specific prompt strategies from llava-1.5 (otter-hd-8b) #307

Closed peiliu0408 closed 8 months ago

peiliu0408 commented 8 months ago

As mentioned in the paper, the specific prompt is inspired by LLaVA-1.5. I wonder if a specific prompt is appended to the end of the question, similar to the one used in the VQA task, such as "Answer the question using a single word or phrase."

Luodian commented 8 months ago

Yes it's the same as llava's strategy for academic data.

peiliu0408 commented 8 months ago

Thanks, but I discovered through the fine-tuning code that the data from the same dataset is not organized in a conversation format. Is there any ablation study on this aspect?