chaoyi-wu / Finetune_LLAMA

A simple, easy-to-understand guide to fine-tuning LLaMA.

How to implement multi-node FSDP #10

Open boyue-jiang opened 10 months ago

boyue-jiang commented 10 months ago

Hello, I saw in the paper that you used 32 GPUs for the pretraining stage. How can I do multi-node training with the Trainer's FSDP support? For example, if I want to train on 16 A100s across 2 nodes, how should I set this up with the Trainer, and will the model be sharded across all 16 GPUs?
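Not the repo author, but as a general sketch: the HF Trainer itself does not spawn multi-node processes; you launch the same training script on every node with `torchrun` and pass the Trainer's `--fsdp` flag. `train.py`, the IP address, and the script arguments below are placeholders; with `full_shard`, FSDP shards the model parameters across all 16 ranks.

```shell
# Hypothetical 2-node x 8-GPU launch. Run the same command on each node,
# changing only --node_rank. --master_addr must be reachable from both nodes.

# Node 0 (hosts the rendezvous endpoint):
torchrun --nnodes=2 --nproc_per_node=8 --node_rank=0 \
    --master_addr=10.0.0.1 --master_port=29500 \
    train.py \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap LlamaDecoderLayer

# Node 1:
torchrun --nnodes=2 --nproc_per_node=8 --node_rank=1 \
    --master_addr=10.0.0.1 --master_port=29500 \
    train.py \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap LlamaDecoderLayer
```

`torchrun` sets `RANK`, `WORLD_SIZE`, `MASTER_ADDR`, and `MASTER_PORT` for each worker, and the Trainer picks them up automatically, so no extra code changes should be needed inside the script.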

fwyc0573 commented 8 months ago

I ran into a similar problem. I want to use PyTorch's FSDP to train across multiple nodes, but the process hangs. Is there any configuration or example I can follow?
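Also not an official answer, but a hang like this usually happens inside `init_process_group` when the nodes cannot reach each other on the rendezvous port, or when the ranks disagree on the world size. A minimal sketch for isolating it (function and variable names are my own; setting `NCCL_DEBUG=INFO` before launching also helps):

```python
import os
from datetime import timedelta

import torch.distributed as dist


def init_distributed(backend: str = "nccl", timeout_minutes: int = 30) -> int:
    """Initialize the default process group from torchrun's environment.

    torchrun sets RANK, WORLD_SIZE, MASTER_ADDR, and MASTER_PORT for every
    worker. An explicit timeout turns a silent hang into a clear error
    instead of blocking forever.
    """
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    dist.init_process_group(
        backend=backend,
        rank=rank,
        world_size=world_size,
        timeout=timedelta(minutes=timeout_minutes),
    )
    return rank


if __name__ == "__main__":
    # Single-process smoke test: fake the torchrun variables and use the
    # CPU-friendly "gloo" backend so this runs without GPUs or a cluster.
    os.environ.setdefault("RANK", "0")
    os.environ.setdefault("WORLD_SIZE", "1")
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    rank = init_distributed(backend="gloo")
    print(f"rank {rank} initialized, world size {dist.get_world_size()}")
    dist.destroy_process_group()
```

If this initializes on one node but blocks across two, the usual suspects are a firewall on `MASTER_PORT`, mismatched `WORLD_SIZE` between nodes, or NCCL picking the wrong network interface (controllable via `NCCL_SOCKET_IFNAME`).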