Closed lucasjinreal closed 1 month ago
What you mention has been supported https://github.com/BAAI-DCAI/Bunny/commit/21483e81a18fe593de24df9b6d6cbcb63a320479
We are super busy these days but the training strategy is scheduled to be released. Stay tuned.
Does the vit opened both in pretrain and sft?
The strategy only differs in the visual instruction tuning stage. And the vision tower was frozen under pre-training stage.
Does the llama3 instruct chat format support?
Bunny-Llama-3-8B-V was trained under "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: <image>\n{prompt} ASSISTANT:
" template.
You can try it to see whether it works well under the origin template.
Oh, that's violated the llama8b-instruct's template, doesn't it will harm original language ability?
We use the same template for all bunny models for consistency and convenience. The performance is also acceptable.
Close the issue for now if there's no further discussions. Feel free to reopen it if there's any other questions.
https://huggingface.co/BAAI/Bunny-Llama-3-8B-V/blob/4e0fcff4232b8d70f4f6737406327733af1186b2/config.json#L39
Looks like the internal codebase didn't actually opensource.