Open lbwang2006 opened 8 months ago
Hi, the released weights were trained in a single stage. As noted in the MagicAnimate paper, we would need more images to train the temporal module in a second stage to avoid degrading image fidelity.
As for the training data, we crawled some dance videos from Bilibili, and I am afraid we do not own the rights to distribute them.
With DeepSpeed fp16 training, the loss overflows easily. Have you faced this problem?
So the model was trained in one stage, not two? By the way, could you share the 2000-video dance dataset? Thanks.