BAAI-DCAI / Bunny

A family of lightweight multimodal models.
Apache License 2.0
799 stars 61 forks source link

Can u opensource the training script which can reproduce llama3 result? #70

Closed lucasjinreal closed 1 month ago

lucasjinreal commented 2 months ago

https://huggingface.co/BAAI/Bunny-Llama-3-8B-V/blob/4e0fcff4232b8d70f4f6737406327733af1186b2/config.json#L39

Looks like the internal codebase didn't actually opensource.

Isaachhh commented 2 months ago

What you mention has been supported https://github.com/BAAI-DCAI/Bunny/commit/21483e81a18fe593de24df9b6d6cbcb63a320479

We are super busy these days but the training strategy is scheduled to be released. Stay tuned.

lucasjinreal commented 2 months ago

Does the vit opened both in pretrain and sft?

Isaachhh commented 2 months ago

The strategy only differs in the visual instruction tuning stage. And the vision tower was frozen under pre-training stage.

lucasjinreal commented 2 months ago

Does the llama3 instruct chat format support?

Isaachhh commented 2 months ago

Bunny-Llama-3-8B-V was trained under "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: <image>\n{prompt} ASSISTANT:" template.

You can try it to see whether it works well under the origin template.

lucasjinreal commented 2 months ago

Oh, that's violated the llama8b-instruct's template, doesn't it will harm original language ability?

Isaachhh commented 2 months ago

We use the same template for all bunny models for consistency and convenience. The performance is also acceptable.

Isaachhh commented 1 month ago

Close the issue for now if there's no further discussions. Feel free to reopen it if there's any other questions.