Tencent / MimicMotion

High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
https://tencent.github.io/MimicMotion/

About training the 72-frame models #28

Closed jiangzhengkai closed 4 months ago

jiangzhengkai commented 4 months ago

Hi, in my case, the GPU memory usage already reaches 66GB with 16 frames at 1024 x 576 resolution and all convolutional parts checkpointed. What tricks do you use for training with 72 frames?

Adding gradient checkpointing to TransformerTemporalModel and TransformerSpatioTemporalModel as well helps a lot.
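The comment above can be sketched in code. This is a minimal, hypothetical helper (not from the MimicMotion repo): it recursively walks a model and wraps any submodule whose class name matches the named transformer modules in `torch.utils.checkpoint`, so their activations are recomputed during backward instead of stored. The class names are taken from the comment; the wrapper itself is an assumption about how one might apply the trick.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class CheckpointedBlock(nn.Module):
    """Recompute the wrapped block's activations in backward
    instead of storing them, trading compute for GPU memory."""

    def __init__(self, block: nn.Module):
        super().__init__()
        self.block = block

    def forward(self, *args, **kwargs):
        # use_reentrant=False is the recommended non-reentrant API.
        return checkpoint(self.block, *args, use_reentrant=False, **kwargs)


def checkpoint_named_modules(model: nn.Module,
                             class_names=("TransformerTemporalModel",
                                          "TransformerSpatioTemporalModel")):
    """Hypothetical helper: replace matching submodules in-place
    with checkpointed wrappers, recursing into everything else."""
    for name, child in model.named_children():
        if type(child).__name__ in class_names:
            setattr(model, name, CheckpointedBlock(child))
        else:
            checkpoint_named_modules(child, class_names)
    return model
```

With `diffusers` UNets the simpler first step is `unet.enable_gradient_checkpointing()`, which covers the standard blocks; a manual wrapper like this is only needed for submodules that route around that flag.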

icedCoffe001 commented 4 months ago

Hi, where did you find the training code?

jiangzhengkai commented 4 months ago

@YuzhiChen001 I implemented it in my own training code; you can refer to svd_xtend.

bruinxiong commented 4 months ago

@jiangzhengkai Could you give more detailed information about the training data? Is there a download link?

jiangzhengkai commented 4 months ago

@jiangzhengkai You need to collect videos yourself, e.g. from video platforms. As far as I know, there are not many public high-quality dancing videos.