gordicaleksa / Open-NLLB

Effort to open-source NLLB checkpoints.
MIT License

Reduce peak memory when using FSDP on 2+ GPUs #2

Open gordicaleksa opened 1 year ago

gordicaleksa commented 1 year ago

Figure out the peak memory issue with FSDP when running the 615M-parameter model on 2 GPUs, which I linked here:

https://github.com/facebookresearch/fairseq/issues/5318
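
As a rough starting point for the investigation, the sketch below estimates per-GPU memory for model state when a 615M-parameter model is fully sharded across 2 GPUs. Everything except the parameter count and GPU count is an assumption, not something taken from the linked issue: fp32 parameters and gradients, an Adam-style optimizer with two fp32 states per parameter, an even split across ranks, and a hypothetical `largest_unit_fraction` knob for how much of the model sits in the largest FSDP-wrapped unit. Activations, unsharded gradients, communication buffers, and allocator fragmentation are ignored, so a measured peak will likely be higher.

```python
def fsdp_state_memory_gib(
    num_params,
    world_size,
    bytes_per_param=4,             # assumption: fp32 everywhere
    optimizer_states_per_param=2,  # assumption: Adam-style (two states)
    largest_unit_fraction=1.0,     # hypothetical: share of params in the largest wrapped unit
):
    """Back-of-the-envelope per-GPU memory (GiB) for model state under full sharding."""
    gib = 1024 ** 3

    # Parameters + gradients + optimizer states, before sharding.
    full_state_bytes = num_params * bytes_per_param * (2 + optimizer_states_per_param)

    # Full sharding keeps 1/world_size of that state resident on each rank.
    steady_bytes = full_state_bytes / world_size

    # While a wrapped unit runs, its full parameters are gathered on every rank.
    # The local shard is already counted above, so only the remainder is extra.
    gathered_bytes = num_params * largest_unit_fraction * bytes_per_param
    transient_bytes = gathered_bytes * (1 - 1 / world_size)

    return {
        "steady_gib": steady_bytes / gib,
        "peak_gib": (steady_bytes + transient_bytes) / gib,
    }


if __name__ == "__main__":
    # Whole model wrapped as a single unit vs. a finer-grained (hypothetical) wrapping.
    for fraction in (1.0, 0.1):
        estimate = fsdp_state_memory_gib(615_000_000, 2, largest_unit_fraction=fraction)
        print(fraction, estimate)
```

If the whole model is wrapped as one unit (`largest_unit_fraction=1.0`), the transient term is at its largest, which is one possible reason sharding over 2 GPUs saves less peak memory than expected. Comparing this estimate against a measured value such as `torch.cuda.max_memory_allocated()` would show how much of the peak comes from model state versus activations and buffers.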