HuangLK / transpeeder

train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism
Apache License 2.0

Model loading #18

Closed zhhao1 closed 1 year ago

zhhao1 commented 1 year ago

Hello, I noticed that you added self.activation_checkpointing = activation_checkpointing in ParallelTransformerLayerPipe, but this parameter does not exist in the llama model — won't loading the llama model raise an error? I see that in the updated code, the hf-format checkpoint is first converted to the deepspeed format and then loaded with engine.load_checkpoint(model_args.init_ckpt, load_module_only=True). Is this extra attribute simply skipped by default during loading?
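For reference, a minimal sketch of the two-step flow the question describes: convert the hf checkpoint offline, then restore only the module weights at startup. The placeholder model, config, and checkpoint path are assumptions for illustration, not the repo's exact code; in transpeeder these come from the training script.

```python
import deepspeed
import torch

# Placeholder stand-ins (assumed): in transpeeder the model is a
# PipelineModule and ds_config is the full DeepSpeed JSON config.
model = torch.nn.Linear(8, 8)
ds_config = {"train_batch_size": 1}

engine, _, _, _ = deepspeed.initialize(
    model=model, config=ds_config, model_parameters=model.parameters()
)

# load_module_only=True restores only the module weights from the
# converted checkpoint directory; optimizer and LR-scheduler states are
# not expected, which is why a checkpoint converted from hf format
# loads cleanly.
engine.load_checkpoint("path/to/converted_ckpt", load_module_only=True)
```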

HuangLK commented 1 year ago

self.activation_checkpointing is used internally by ParallelTransformerLayerPipe and is never passed to the hf llama.
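A minimal sketch (assumed, not the repo's exact code) of the pattern this reply describes: the pipeline-stage layer keeps activation_checkpointing as its own attribute and passes only config to the hf layer's constructor, so the flag never reaches the hf llama code.

```python
import torch
from torch.utils.checkpoint import checkpoint
from transformers.models.llama.modeling_llama import LlamaDecoderLayer

class ParallelTransformerLayerPipe(LlamaDecoderLayer):
    def __init__(self, config, activation_checkpointing=False):
        super().__init__(config)  # the hf layer only ever sees `config`
        # Stored on the subclass itself; hf code never reads it.
        self.activation_checkpointing = activation_checkpointing

    def forward(self, hidden_states, attention_mask=None):
        if self.activation_checkpointing:
            # Recompute activations during backward to save memory.
            return checkpoint(super().forward, hidden_states, attention_mask)
        return super().forward(hidden_states, attention_mask)
```

Because the flag lives on the subclass and is not forwarded into any hf call, it also never appears in the state dict, so load_checkpoint(..., load_module_only=True) is unaffected by it.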