Open gordicaleksa opened 1 year ago
Figure out the peak memory issue with FSDP when running the 615M-parameter model on 2 GPUs, as described in the issue I linked here:
https://github.com/facebookresearch/fairseq/issues/5318