hiyouga / LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
https://arxiv.org/abs/2403.13372
Apache License 2.0
31.82k stars 3.91k forks

Training on Ascend cards does not support offload #4146

Closed wangbing35 closed 2 months ago

wangbing35 commented 3 months ago

Reminder

System Info

When training a large-parameter model on Ascend cards with DeepSpeed stage 3 + offload, training fails with: RuntimeError: inplace tensor self must be NPU-Tensor

Reproduction

ds_z3_offload_config.json
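The contents of ds_z3_offload_config.json are not included in the report. As a rough sketch only, a ZeRO stage 3 configuration with CPU offload typically contains a zero_optimization section like the one built below. The helper name build_z3_offload_config is hypothetical, and the actual file used in this issue may contain other settings.

```python
import json


def build_z3_offload_config():
    """Return a minimal ZeRO-3 + CPU offload config (illustrative only).

    The keys under "zero_optimization" are standard DeepSpeed config keys;
    the exact values in the reporter's file are unknown and assumed here.
    """
    return {
        "zero_optimization": {
            "stage": 3,
            # Move optimizer states to host memory.
            "offload_optimizer": {"device": "cpu", "pin_memory": True},
            # Move model parameters to host memory.
            "offload_param": {"device": "cpu", "pin_memory": True},
        }
    }


config = build_z3_offload_config()
config_json = json.dumps(config, indent=2)
print(config_json)
```

It is these offload sections that this issue reports as failing on Ascend NPUs.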

Expected behavior

No response

Others

No response

huyz-git commented 3 months ago

See https://github.com/microsoft/DeepSpeed/issues/5585. Either wait for the next DeepSpeed release, or build DeepSpeed yourself from the master branch.

wangbing35 commented 2 months ago

After upgrading DeepSpeed to 0.14.3, offload is now supported.
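To confirm that an environment has at least the 0.14.3 release mentioned above, a small check like the following may help. It is a minimal sketch: the helper names (parse_version, meets_minimum, installed_deepspeed_version) are hypothetical, and the threshold is taken only from this thread, not from DeepSpeed's release notes.

```python
import re
from importlib import metadata

# Minimum version reported in this thread as working with offload on Ascend.
MIN_VERSION = (0, 14, 3)


def parse_version(text):
    """Parse the leading numeric part of a version string into a 3-tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    nums = [int(p) for p in match.group(0).split(".")][:3]
    nums += [0] * (3 - len(nums))  # pad "0.15" to (0, 15, 0)
    return tuple(nums)


def meets_minimum(version_text, minimum=MIN_VERSION):
    """Return True if version_text is at least the minimum version."""
    return parse_version(version_text) >= minimum


def installed_deepspeed_version():
    """Return the installed deepspeed version string, or None if absent."""
    try:
        return metadata.version("deepspeed")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_deepspeed_version()
    if found is None:
        print("deepspeed is not installed")
    elif meets_minimum(found):
        print(f"deepspeed {found} meets the minimum from this thread")
    else:
        print(f"deepspeed {found} is older than 0.14.3; consider upgrading")
```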