OpenBMB / BMTrain

Efficient Training (including pre-training and fine-tuning) for Big Models
Apache License 2.0

Can BMTrain work with Megatron-LM? #97

Closed marscrazy closed 1 year ago

marscrazy commented 1 year ago

We want to train a large language model (>30B parameters). Are there any examples of how to do that?

Achazwl commented 1 year ago

ModelCenter, which is built on BMTrain, supports LLaMA. First convert the checkpoint files using https://github.com/OpenBMB/ModelCenter/blob/main/transfer/hugLLaMa_bmtrainLLaMa.py, then run inference (https://github.com/OpenBMB/ModelCenter/blob/main/tests/test_llama.py) or fine-tune (https://github.com/OpenBMB/ModelCenter/blob/main/examples/llama/finetune_llama.py).
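At its core, a conversion script like this remaps HuggingFace parameter names onto the target model's naming scheme. A minimal sketch of that idea, assuming hypothetical key names on both sides (the real mapping is defined in `transfer/hugLLaMa_bmtrainLLaMa.py`, not here):

```python
import re

# Hypothetical rename rules for illustration only; the actual
# correspondence between HuggingFace LLaMA keys and ModelCenter
# keys is defined by the conversion script in the repo.
RENAME_RULES = [
    (r"^model\.embed_tokens\.weight$", "input_embedding.weight"),
    (r"^model\.layers\.(\d+)\.self_attn\.q_proj\.weight$",
     r"encoder.layers.\1.self_att.project_q.weight"),
    (r"^model\.norm\.weight$", "encoder.output_layernorm.weight"),
]

def convert_state_dict(hf_state):
    """Return a new state dict with keys renamed per RENAME_RULES.

    Keys that match no rule are carried over unchanged.
    """
    converted = {}
    for key, tensor in hf_state.items():
        new_key = key
        for pattern, repl in RENAME_RULES:
            if re.match(pattern, key):
                new_key = re.sub(pattern, repl, key)
                break
        converted[new_key] = tensor
    return converted
```

In practice the converted dict would then be saved with `torch.save` and loaded into the ModelCenter LLaMA implementation.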

If you want pre-training, you can write a new model config and a new model built from the layers provided in ModelCenter.
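A model config in this style is essentially a set of architecture hyperparameters. A hypothetical sketch of what one for a 30B-class LLaMA might contain (the field names here are assumptions; check ModelCenter's existing config classes for the actual schema, and note the dimensions below are the published LLaMA-30B shapes):

```python
# Hypothetical config fragment; field names are illustrative,
# not ModelCenter's actual config schema.
llama_30b_config = {
    "vocab_size": 32000,   # LLaMA tokenizer vocabulary
    "dim_model": 6656,     # hidden size
    "num_heads": 52,       # attention heads
    "dim_head": 128,       # per-head dimension (6656 / 52)
    "dim_ff": 17920,       # feed-forward inner dimension
    "num_layers": 60,      # transformer blocks
}
```

The new model class would then instantiate ModelCenter's embedding, attention, and feed-forward layers from these values.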