FlagOpen / FlagScale

FlagScale is a large model toolkit based on open-sourced projects.

[Model] Add LLaVA model #183

Closed Caozhou1995 closed 2 months ago

Caozhou1995 commented 3 months ago

This PR adds a LLaVA-1.5 YAML config. To start a LLaVA training job, run: python run.py --config-path ./examples/llava/conf --config-name config

The learning-rate settings and model parameters match those in the LLaVA-1.5 paper: https://arxiv.org/pdf/2310.03744.
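For readers unfamiliar with what "learning-rate settings" usually cover in this kind of config, here is a minimal sketch of a linear-warmup plus cosine-decay schedule. It is not FlagScale's or Megatron's implementation; the function name `cosine_lr_with_warmup`, its parameters, and the numbers in the usage lines are hypothetical placeholders, not values taken from this PR's YAML or from the paper.

```python
import math


def cosine_lr_with_warmup(step, total_steps, peak_lr, warmup_steps, min_lr=0.0):
    """Return the learning rate at a 0-indexed step.

    Hypothetical helper for illustration only: linear warmup up to peak_lr
    over warmup_steps, then cosine decay from peak_lr down to min_lr.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0 <= warmup_steps <= total_steps:
        raise ValueError("warmup_steps must be in [0, total_steps]")

    # Warmup phase: ramp linearly so the last warmup step reaches peak_lr.
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps

    # Decay phase: progress runs from 0 (end of warmup) to 1 (end of training).
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Placeholder values, chosen only to exercise the function.
schedule = [cosine_lr_with_warmup(s, 1000, 1e-3, 30) for s in range(1000)]
```

In this sketch the rate rises during the first 30 steps, peaks at `peak_lr`, and then decreases monotonically toward `min_lr`. The actual schedule used by the job is whatever the YAML under ./examples/llava/conf specifies.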

The remaining training hyperparameters follow the Megatron multimodal example rather than the paper, because the paper trains with DeepSpeed ZeRO stage 2, which Megatron does not support. The Megatron tutorial is at https://github.com/FlagOpen/FlagScale/tree/main/megatron/examples/multimodal
