Closed · taishan1994 closed this issue 5 months ago
https://github.com/QwenLM/Qwen/tree/main/recipes/finetune/deepspeed#settings-and-gpu-requirements
Not possible with 4 × 24GB GPUs.
It's 8 GPUs, not 4. The linked page shows that Q-LoRA on a single 80GB GPU with sequence length 4096 needs 68.0GB, so I wondered whether Q-LoRA could finetune qwen-72b-chat at sequence length 4096 on an 8 × 24GB machine. I tried multi-GPU finetuning with Q-LoRA + ZeRO-2, but it OOMs while the model is still loading. With ZeRO-3 and both model offload and optimizer offload enabled, the model loads fine, but Q-LoRA and ZeRO-3 cannot be used together. Is there any way to do what I'm trying to do?
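For reference, the ZeRO-3 setup described above (parameter and optimizer offload to CPU) corresponds to a config roughly like the following sketch. The `zero_optimization` key names follow DeepSpeed's documented schema; the batch sizes and bf16 setting are placeholder assumptions, not values from this thread.

```python
import json

# Sketch of a DeepSpeed ZeRO-3 config with parameter and optimizer
# offload to CPU, matching the setup described in the comment above.
# Batch sizes and precision are placeholder assumptions.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,   # assumption
    "gradient_accumulation_steps": 16,     # assumption
    "bf16": {"enabled": True},             # assumption
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
}

if __name__ == "__main__":
    print(json.dumps(ds_config, indent=2))
```

Note that, as the reply below explains, this config alone does not resolve the issue, since Q-LoRA is not supported together with ZeRO-3.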
For ZeRO Stage 2, having each GPU capable of holding the entire model is a bare minimum requirement. Given that Qwen-72B-Chat-Int4 exceeds 40GB, trying to finetune a model of this scale using Q-LoRA with GPUs that only have 24GB (or even 48GB) of memory simply won't cut it.
Understood. Thank you for your answer.
Is there an existing issue / discussion for this?
Is there an existing answer for this in FAQ?
Current Behavior
No response
Expected Behavior
No response
Steps To Reproduce
No response
Environment
Anything else?
No response