HuangLK / transpeeder

train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism
Apache License 2.0

Model loading #18

Closed zhhao1 closed 1 year ago

zhhao1 commented 1 year ago

Hello, I noticed that you added self.activation_checkpointing = activation_checkpointing in ParallelTransformerLayerPipe, but this parameter does not exist in the llama model — won't loading the llama model raise an error? I see that in the updated code, the hf-format checkpoint is first converted to the deepspeed format and then loaded with engine.load_checkpoint(model_args.init_ckpt, load_module_only=True). Is this extra attribute simply skipped by default during loading?
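For reference, a minimal sketch of the two-step flow the question describes: convert the hf checkpoint offline, then restore only the module weights at startup. The placeholder model, config, and checkpoint path are assumptions for illustration, not the repo's exact code; in transpeeder these come from the training script.

```python
import deepspeed
import torch

# Placeholder stand-ins (assumed): in transpeeder the model is a
# PipelineModule and ds_config is the full DeepSpeed JSON config.
model = torch.nn.Linear(8, 8)
ds_config = {"train_batch_size": 1}

engine, _, _, _ = deepspeed.initialize(
    model=model, config=ds_config, model_parameters=model.parameters()
)

# load_module_only=True restores only the module weights from the
# converted checkpoint directory; optimizer and LR-scheduler states are
# not expected, which is why a checkpoint converted from hf format
# loads cleanly.
engine.load_checkpoint("path/to/converted_ckpt", load_module_only=True)
```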

HuangLK commented 1 year ago

self.activation_checkpointing is used internally by ParallelTransformerLayerPipe and is never passed to the hf llama.
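A minimal sketch (assumed, not the repo's exact code) of the pattern this reply describes: the pipeline-stage layer keeps activation_checkpointing as its own attribute and passes only config to the hf layer's constructor, so the flag never reaches the hf llama code.

```python
import torch
from torch.utils.checkpoint import checkpoint
from transformers.models.llama.modeling_llama import LlamaDecoderLayer

class ParallelTransformerLayerPipe(LlamaDecoderLayer):
    def __init__(self, config, activation_checkpointing=False):
        super().__init__(config)  # the hf layer only ever sees `config`
        # Stored on the subclass itself; hf code never reads it.
        self.activation_checkpointing = activation_checkpointing

    def forward(self, hidden_states, attention_mask=None):
        if self.activation_checkpointing:
            # Recompute activations during backward to save memory.
            return checkpoint(super().forward, hidden_states, attention_mask)
        return super().forward(hidden_states, attention_mask)
```

Because the flag lives on the subclass and is not forwarded into any hf call, it also never appears in the state dict, so load_checkpoint(..., load_module_only=True) is unaffected by it.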