I am using distributed training with FastDP and have questions about its integration with DeepSpeed. This is my first time using DeepSpeed, so I apologize if some of these questions are trivial:
Are all three stages of DeepSpeed ZeRO necessary for distributed DPSGD training?
The image classification examples have two Python files, one for stage 1 and one for stages 2 and 3. Are both of them differentially private?
For stage 1, privacyengine.attach() is not recommended. How is dp_step() called, then?
The requirements file lists older versions of torch and deepspeed. Have you tested with any newer versions?
Thank you for using FastDP! To answer your questions:
You only need one of the three stages for distributed DPSGD. As you move up the stages, you trade communication time for memory efficiency (i.e., training is slower, but you can train larger models).
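For reference, the ZeRO stage is selected in the DeepSpeed config you pass to deepspeed.initialize(). A minimal sketch is below; the batch size and fp16 settings are placeholders for illustration, not FastDP defaults:

```python
import deepspeed

# Placeholder config; only the "stage" field matters for choosing ZeRO 1/2/3.
ds_config = {
    "train_micro_batch_size_per_gpu": 32,
    "zero_optimization": {
        "stage": 1,  # 1, 2, or 3: higher stages save more memory but add communication
    },
    "fp16": {"enabled": False},
}

# model is assumed to be defined elsewhere.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```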
Yes, both are differentially private and should give you similar results.
For all stages, you don't need .attach() or dp_step(). The modification is applied to the gradients, not to the optimizer, so the regular step() works directly.
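A minimal sketch of the resulting training loop, assuming the FastDP privacy engine has already registered backward hooks on the model so gradients are clipped and noised before the optimizer sees them (the constructor arguments shown here are placeholders and may differ from the library's actual API; see the repo's image classification examples for the exact setup):

```python
import torch
import deepspeed
from fastDP import PrivacyEngine  # import path assumed; check the repo's examples

# Assumed to exist: model, train_loader, ds_config (with the desired ZeRO stage).
# The privacy engine hooks the model's backward pass, so no .attach() or dp_step()
# is needed -- DeepSpeed's regular step() sees already-privatized gradients.
privacy_engine = PrivacyEngine(
    model,
    batch_size=32,        # placeholder values for illustration
    sample_size=50_000,
    epochs=3,
    target_epsilon=2.0,
)

model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

for inputs, labels in train_loader:
    inputs = inputs.to(model_engine.device)
    labels = labels.to(model_engine.device)
    loss = torch.nn.functional.cross_entropy(model_engine(inputs), labels)
    model_engine.backward(loss)  # backward hooks modify the gradients here
    model_engine.step()          # regular step(); no dp_step() required
```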
I haven't tested on newer DeepSpeed versions. It should work on torch>=2.2.0.
Thank you!