about training memory optimization

guoqincode / Open-AnimateAnyone

Unofficial Implementation of Animate Anyone


about training memory optimization #41

Closed zhangvia closed 8 months ago

zhangvia commented 8 months ago

In the README, you mentioned that you would optimize the training code using DeepSpeed and Accelerate. However, as far as I know, the DeepSpeed functionality integrated into the Accelerate library does not support multi-model training. Do you have any suggestions?

zhangvia commented 8 months ago

Also, in my tests with the resolution set to 512, first-stage training does not fit on a 40GB A100 even with a batch size of 1. Is this normal?

guoqincode commented 8 months ago

This is normal. 40GB is too small for training.
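
For readers wondering why 40GB can be tight, here is a rough back-of-the-envelope sketch. These are not the maintainer's numbers. It assumes full-precision Adam training, which needs about 16 bytes per trainable parameter (4 for the weight, 4 for the gradient, 8 for the two optimizer moments). It also assumes two trainable UNets of about 860M parameters each. Both figures are illustrative assumptions, and activation memory (which grows with resolution) is not counted.

```python
def training_state_gib(num_params: int, bytes_per_param: int = 16) -> float:
    """Memory for weights, gradients, and Adam moments in fp32, in GiB.

    16 bytes/param = 4 (weight) + 4 (gradient) + 8 (Adam first and second moments).
    Activations, buffers, and framework overhead are NOT included.
    """
    return num_params * bytes_per_param / 2**30


# Assumed values for illustration only; they are not taken from this thread.
ASSUMED_PARAMS_PER_UNET = 860_000_000
ASSUMED_TRAINABLE_UNETS = 2

estimate = training_state_gib(ASSUMED_PARAMS_PER_UNET * ASSUMED_TRAINABLE_UNETS)
print(f"~{estimate:.1f} GiB of training state before any activations")
```

Under these assumptions, well over half of a 40GB card is used before a single activation is stored, so running out of memory at batch size 1 is plausible.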

zhangvia commented 8 months ago

> This is normal. 40GB is too small for training.

What about the first question?
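
On the multi-model limitation raised in the first question, one workaround sometimes used is to wrap every trainable sub-model in a single `torch.nn.Module`, so the training framework sees only one model and one optimizer. The sketch below is a minimal illustration of that idea and not this repository's code. `CombinedNet` and the tiny `nn.Linear` stand-ins are hypothetical, and whether this approach fits the real networks here is an assumption.

```python
import torch
from torch import nn


class CombinedNet(nn.Module):
    """Hypothetical wrapper that exposes several sub-models as one module."""

    def __init__(self, reference_net: nn.Module, denoising_net: nn.Module):
        super().__init__()
        # Registering sub-models as attributes makes their parameters
        # visible through combined.parameters().
        self.reference_net = reference_net
        self.denoising_net = denoising_net

    def forward(self, reference: torch.Tensor, noisy: torch.Tensor) -> torch.Tensor:
        ref_features = self.reference_net(reference)
        # Toy fusion for illustration; a real model would inject features differently.
        return self.denoising_net(noisy + ref_features)


# Tiny stand-ins for the real networks (assumption: any nn.Module works here).
reference_net = nn.Linear(8, 8)
denoising_net = nn.Linear(8, 8)
combined = CombinedNet(reference_net, denoising_net)

# A single optimizer covers the parameters of both sub-models.
optimizer = torch.optim.AdamW(combined.parameters(), lr=1e-5)

out = combined(torch.randn(2, 8), torch.randn(2, 8))
loss = out.pow(2).mean()
loss.backward()
optimizer.step()
```

With Accelerate, the wrapper and its optimizer would then go through a single `accelerator.prepare(combined, optimizer)` call. Whether the DeepSpeed ZeRO stages reduce memory enough for a 40GB card is a separate question that would need testing.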