THUDM / CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B
Apache License 2.0

How do I change the task from chat to caption? #115

Open kukaiN opened 1 week ago

kukaiN commented 1 week ago

I read the fine-tuning docs, which led me to this dataset: https://huggingface.co/datasets/THUDM/CogVLM-SFT-311K

That page has a section titled "Caption format for image description", so captioning seems to have been part of the dataset.

Is it possible to change the mode/task of the model to caption? Or am I limited to sending caption instructions using template_version='chat'?
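
For context, the chat-template workaround would look roughly like the sketch below. It assumes the `build_conversation_input_ids` call used in the repo's demo scripts; `build_caption_inputs` and `CAPTION_PROMPT` are hypothetical names made up for illustration, and the prompt text is not taken from the docs. Whether a dedicated caption template exists is exactly the open question here.

```python
from typing import Any

# Hypothetical instruction text, chosen for illustration only.
CAPTION_PROMPT = "Describe this image in detail."


def build_caption_inputs(model: Any, tokenizer: Any, image: Any,
                         prompt: str = CAPTION_PROMPT) -> Any:
    """Wrap a caption instruction in a single-turn chat request.

    `model` is assumed to expose `build_conversation_input_ids`, as the
    CogVLM2 demo scripts do; `image` is assumed to be an RGB PIL image.
    """
    return model.build_conversation_input_ids(
        tokenizer,
        query=prompt,
        history=[],               # no prior turns: caption each image independently
        images=[image],
        template_version="chat",  # the only template I know works for this
    )
```

The returned inputs would then be batched and passed to `model.generate` the same way the chat demo does.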