guoqincode / Open-AnimateAnyone

Unofficial Implementation of Animate Anyone

About multi-GPU training #31

Closed by vesdas 9 months ago

vesdas commented 9 months ago

Thank you for your contributions! I have two questions:

  1. I have observed that training with two RTX 6000 Ada GPUs takes longer than training with a single GPU. Is this expected?
  2. I am encountering gradient explosion during training.
guoqincode commented 9 months ago

  1. To improve GPU utilization, you can convert the dataset to LMDB format or increase the resolution to 512×768 (see the LMDB sketch below).
  2. I have not encountered gradient explosion myself. Try the latest training code; if the problem still appears, I suggest reducing the learning rate (a clipping sketch follows as well).
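A minimal sketch of the LMDB conversion suggested in point 1, so the DataLoader reads from one memory-mapped file instead of many small image files. The paths, key scheme, and `map_size` here are illustrative assumptions, not part of this repo's code:

```python
# Sketch: pack a folder of images into a single LMDB environment.
# Paths, key scheme, and map_size are illustrative assumptions.
import os
import lmdb

src_dir = "data/images"  # hypothetical source folder
env = lmdb.open("data/images.lmdb", map_size=1 << 40)  # 1 TiB virtual map; shrinks to actual use

with env.begin(write=True) as txn:
    for i, name in enumerate(sorted(os.listdir(src_dir))):
        with open(os.path.join(src_dir, name), "rb") as f:
            # Store raw encoded bytes; decode (e.g. with PIL) at load time.
            txn.put(f"{i:08d}".encode("ascii"), f.read())
env.close()
```

At load time, the Dataset would typically open the environment once with `lmdb.open(path, readonly=True, lock=False)` and decode the stored bytes per key.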
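Beyond lowering the learning rate, gradient-norm clipping is a common additional safeguard against explosion (not something the author prescribes here). A self-contained PyTorch sketch; the toy model, optimizer, and data are placeholders for the repo's real training objects:

```python
import torch

# Toy model/optimizer so the sketch runs standalone; in practice these
# are the repo's training objects. Names and values are placeholders.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # lowered LR, per the suggestion

x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
# Cap the global gradient norm before stepping to damp loss spikes.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
optimizer.zero_grad()
```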