This PR adds the LLaVA-1.5 YAML config; users can run the following command to start a LLaVA training job:
python run.py --config-path ./examples/llava/conf --config-name config
The learning-rate settings and model parameters are consistent with the LLaVA-1.5 paper (https://arxiv.org/pdf/2310.03744).
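For reference, a minimal sketch of what the learning-rate portion of such a config could look like. The key names below follow Megatron argument conventions and the nesting is hypothetical (the actual structure lives in ./examples/llava/conf); the values are the instruction-tuning settings reported in the LLaVA-1.5 paper:

```yaml
# Hypothetical excerpt -- key names/nesting are assumptions, not the actual
# file; values are the finetuning settings reported in the LLaVA-1.5 paper.
training:
  global_batch_size: 128        # paper: batch size 128 for instruction tuning
  optimizer:
    lr: 2.0e-5                  # paper: 2e-5 for the instruction-tuning stage
    lr_decay_style: cosine      # cosine decay schedule, as in the paper
    lr_warmup_fraction: 0.03    # 3% warmup ratio
    weight_decay: 0.0           # no weight decay, as in the paper
```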
The other training hyperparameters follow the Megatron multimodal example, because the paper trains with DeepSpeed ZeRO Stage 2, which Megatron does not support. The Megatron tutorial is at https://github.com/FlagOpen/FlagScale/tree/main/megatron/examples/multimodal