FlagOpen / FlagScale

FlagScale is a large model toolkit based on open-sourced projects.

[Model] Add llava mlp converter and update three models config #197

Closed — Caozhou1995 closed 3 weeks ago

Caozhou1995 commented 3 weeks ago

This PR adds MLP checkpoint conversion from Hugging Face (HF) to Megatron format for the LLaVA pretrain phase. A TP=2 example of the checkpoint conversion for the LLaVA Pretrain phase is shown below:
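Conceptually, converting the MLP projector for tensor parallelism means splitting each weight across TP ranks. The sketch below is a hypothetical illustration in pure Python (the function name, the row-wise split, and the nested-list "weights" are all assumptions for clarity, not FlagScale's actual converter, which operates on real tensors and also handles row-parallel layers):

```python
# Hypothetical sketch: split HF MLP (vision_projection) weights across
# TP ranks, mimicking a column-parallel linear layer that shards the
# output dimension (rows). Illustrative only; not FlagScale's code.

def split_mlp_for_tp(hf_state, tp_size):
    """Split each weight row-wise into tp_size shards, one per rank."""
    shards = [{} for _ in range(tp_size)]
    for name, weight in hf_state.items():
        rows = len(weight)
        assert rows % tp_size == 0, f"{name}: {rows} rows not divisible by tp={tp_size}"
        chunk = rows // tp_size
        for rank in range(tp_size):
            shards[rank][name] = weight[rank * chunk:(rank + 1) * chunk]
    return shards

# Example: a 4x2 "weight" split across 2 ranks -> two 2x2 shards
hf_state = {"mlp.fc1.weight": [[1, 2], [3, 4], [5, 6], [7, 8]]}
tp0, tp1 = split_mlp_for_tp(hf_state, tp_size=2)
print(tp0["mlp.fc1.weight"])  # [[1, 2], [3, 4]]
print(tp1["mlp.fc1.weight"])  # [[5, 6], [7, 8]]
```

Each shard would then be saved as one `state_dict_tp_<rank>.pt` file per TP rank.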

Then combine the language model, vision model, and MLP projector checkpoints into a single multimodal checkpoint:

```shell
cd FlagScale/megatron && \
PYTHONPATH=FlagScale/megatron:FlagScale/ python examples/multimodal/combine_state_dicts.py \
  --input <vicuna_megatron_dir>/iter_0000001/mp_rank_00/model_optim_rng.pt \
          <clip_megatron_dir>/state_dict_tp_0.pt \
          <mlp_megatron_dir>/state_dict_tp_0.pt \
          <vicuna_megatron_dir>/iter_0000001/mp_rank_01/model_optim_rng.pt \
          <clip_megatron_dir>/state_dict_tp_1.pt \
          <mlp_megatron_dir>/state_dict_tp_1.pt \
  --prefixes language_model vision_model vision_projection \
             language_model vision_model vision_projection \
  --output <output_dir>/iter_0000001/mp_rank_00/model_optim_rng.pt \
           <output_dir>/iter_0000001/mp_rank_01/model_optim_rng.pt && \
cd <output_dir> && \
echo "1" > latest_checkpointed_iteration.txt
```
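The core idea of combining under prefixes can be sketched as follows. This is a minimal, hypothetical illustration of the key-namespacing step (the function name and string "weights" are assumptions; the real `combine_state_dicts.py` loads and saves actual Megatron checkpoint files):

```python
# Hypothetical sketch: merge per-component state dicts into one checkpoint,
# namespacing each component's keys under its prefix. Illustrative only.

def combine_under_prefixes(state_dicts, prefixes):
    """Merge flat state dicts, prefixing keys (e.g. 'language_model.')."""
    combined = {}
    for sd, prefix in zip(state_dicts, prefixes):
        for key, value in sd.items():
            combined[f"{prefix}.{key}"] = value
    return combined

# One (state_dict, prefix) pair per component of the same TP rank
lm = {"embedding.weight": "lm-emb"}
vit = {"patch_embed.weight": "vit-patch"}
proj = {"fc1.weight": "proj-fc1"}
merged = combine_under_prefixes(
    [lm, vit, proj],
    ["language_model", "vision_model", "vision_projection"],
)
print(sorted(merged))
# ['language_model.embedding.weight', 'vision_model.patch_embed.weight',
#  'vision_projection.fc1.weight']
```

This mirrors why the `--prefixes` list above repeats `language_model vision_model vision_projection` once per TP rank: each rank's three component files are merged into one `model_optim_rng.pt`.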

This PR also updates the configs of three models.