aws-neuron / aws-neuron-sdk

Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services
https://aws.amazon.com/machine-learning/neuron/

[torch-neuronx] FSDP support - Distributed Training on Trn1 #502

Open aws-rxgupta opened 2 years ago

aws-rxgupta commented 2 years ago

[torch-neuronx] FSDP support - Distributed Training on Trn1

gilinachum commented 1 year ago

Is there an ETA for this?

AWSNB commented 1 year ago

Gili,

It is THE top priority on the roadmap, coming in the next few weeks.

On Thursday, November 10, 2022 at 7:08 AM, Gili Nachum wrote:

> Is there an ETA for this?


ngnatk commented 1 year ago

Hi, may I check if there are any updates on this?