guoqincode / Open-AnimateAnyone

Unofficial Implementation of Animate Anyone

About multi-GPU training #31

Closed by vesdas 9 months ago

vesdas commented 9 months ago

Thank you for your contributions! I have two questions:

  1. I have observed that training with two RTX 6000 Ada GPUs takes longer than training with a single GPU. Is this expected?
  2. I am encountering gradient explosion during training.
guoqincode commented 9 months ago

  1. To improve GPU utilization, you can convert the dataset to LMDB format or increase the resolution to 512×768 (see the LMDB sketch below).
  2. I have not encountered gradient explosion myself. Try the latest training code; if the problem still appears, I suggest reducing the learning rate (a clipping sketch follows as well).
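A minimal sketch of the LMDB conversion suggested in point 1, so the DataLoader reads from one memory-mapped file instead of many small image files. The paths, key scheme, and `map_size` here are illustrative assumptions, not part of this repo's code:

```python
# Sketch: pack a folder of images into a single LMDB environment.
# Paths, key scheme, and map_size are illustrative assumptions.
import os
import lmdb

src_dir = "data/images"  # hypothetical source folder
env = lmdb.open("data/images.lmdb", map_size=1 << 40)  # 1 TiB virtual map; shrinks to actual use

with env.begin(write=True) as txn:
    for i, name in enumerate(sorted(os.listdir(src_dir))):
        with open(os.path.join(src_dir, name), "rb") as f:
            # Store raw encoded bytes; decode (e.g. with PIL) at load time.
            txn.put(f"{i:08d}".encode("ascii"), f.read())
env.close()
```

At load time, the Dataset would typically open the environment once with `lmdb.open(path, readonly=True, lock=False)` and decode the stored bytes per key.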
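Beyond lowering the learning rate, gradient-norm clipping is a common additional safeguard against explosion (not something the author prescribes here). A self-contained PyTorch sketch; the toy model, optimizer, and data are placeholders for the repo's real training objects:

```python
import torch

# Toy model/optimizer so the sketch runs standalone; in practice these
# are the repo's training objects. Names and values are placeholders.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # lowered LR, per the suggestion

x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
# Cap the global gradient norm before stepping to damp loss spikes.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
optimizer.zero_grad()
```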