alidadsetan closed this issue 2 weeks ago
python will do, but torchrun may also work. When this code was developed, FSDP was not yet mature, and it is very similar to ZeRO-3 in most respects.
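For readers comparing the two launch methods, here is a minimal sketch (not taken from the library or its examples) of how a script can tell which way it was started. It relies on the fact that torchrun exports RANK, LOCAL_RANK and WORLD_SIZE to each worker it spawns; the helper name launch_info is hypothetical, and other launchers that set the same variables would be reported the same way.

```python
import os


def launch_info(env=None):
    """Describe how this process appears to have been launched.

    Hypothetical helper for illustration; not part of any library.
    """
    env = os.environ if env is None else env
    # torchrun sets these three variables for every worker process.
    # Other launchers may set them too, so this is only a heuristic.
    distributed_keys = ("RANK", "LOCAL_RANK", "WORLD_SIZE")
    torchrun_style = all(key in env for key in distributed_keys)
    return {
        "launcher": "torchrun-style" if torchrun_style else "python",
        # Fall back to single-process defaults under plain python.
        "rank": int(env.get("RANK", 0)),
        "local_rank": int(env.get("LOCAL_RANK", 0)),
        "world_size": int(env.get("WORLD_SIZE", 1)),
    }


if __name__ == "__main__":
    print(launch_info())
```

Started with plain python, the helper falls back to a single-process view (rank 0, world size 1). Started through torchrun, each worker reports the rank and world size it was given.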
Thank you for the reply. I was not able to run it with either of these methods, so I am now trying to convert my codebase to DeepSpeed. Thanks again for the library; I hope I will be able to use it.
I am trying to adapt the FSDP extending module for a research project, because the rest of the project is written in FSDP and I want to keep changes to the current project minimal. Could you please provide a script for running the examples/image_classification/ZERO_examples/CIFAR_TIMM_FSDP_extending.py file? Do you run it with torchrun, or simply with python? Also, is there a reason why you do not use FSDP in your language model examples? I also want to do private training of language models, and I would appreciate it if you could provide some examples of language modeling with FSDP. Many thanks for your help, and also for open-sourcing this great library!