SlimeVRX opened this issue 4 months ago
Yes, using a single 4090 GPU it's the same for me. I changed 72 to 32, and the result was even worse than with the old checkpoint. Maybe I did something wrong.
@SlimeVRX @ak01user The sample video we provide is 35 seconds long. You can try testing with a shorter video to obtain the expected results within an acceptable time.
You can verify whether the model is utilizing the GPU effectively. The expected inference speed of a 72-frame denoising step is less than 5 seconds per iteration on an A100 GPU. However, your reported speed of 251 seconds per iteration appears unusually slow for a 3090 GPU.
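One quick sanity check is to confirm that CUDA is actually visible before launching inference; a minimal sketch assuming PyTorch is installed (the pipeline attribute mentioned in the trailing comment is hypothetical):

```python
import torch

# Quick sanity check before launching inference.py: is CUDA visible, and which
# GPU will be used? If this fails, inference silently falls back to CPU speeds.
assert torch.cuda.is_available(), "CUDA not available - inference would run on CPU"
print("GPU:", torch.cuda.get_device_name(0))
print("Total VRAM (GB):", torch.cuda.get_device_properties(0).total_memory / 1024**3)

# Hypothetical: once the pipeline object from inference.py is constructed, its
# weights should report a cuda device, e.g. next(pipe.unet.parameters()).device
```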
If the VRAM requirement is too high, I hope this will help: https://github.com/Tencent/MimicMotion/issues/21#issuecomment-2213978415
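Separately from the linked comment, standard diffusers pipelines expose a couple of memory-saving toggles that trade speed for VRAM. Whether MimicMotion's own pipeline object exposes them is an assumption, so the sketch below only applies them if they exist:

```python
def reduce_vram(pipe):
    """Apply common diffusers memory-saving toggles if the pipeline has them.

    Sketch only: `pipe` stands in for whatever pipeline object inference.py
    constructs; the project's own VRAM fix is the one in the linked comment.
    """
    if hasattr(pipe, "enable_vae_slicing"):
        pipe.enable_vae_slicing()        # decode latent frames one at a time
    if hasattr(pipe, "enable_model_cpu_offload"):
        pipe.enable_model_cpu_offload()  # keep idle submodules in system RAM
    return pipe
```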
@gujiaxi Sorry, I just tested a 15-second-long video and it will take 3h30m. That is too long.
I tried the old model (MimicMotion_1) again and identified that the cause is not the model but the number of frames, which is 72 (RTX 3090 24GB).
Done Pre-process data!
2%|████▌ | 1/50 [03:47<3:05:29, 227.14s/it]
2%|████▌ | 1/50 [04:26<3:37:50, 266.75s/it]
Are you sure you're not spilling into system RAM? What resolution & number of frames are you doing?
Yes, shared GPU memory is at 11.2/15.9 GB; this may be the cause of the slowness.
ckpt_path: models/MimicMotion_1-1.pth num_frames: 72 resolution: 576
Updated the last commit!
To prevent this from happening, go to the NVIDIA Control Panel; under "Manage 3D Settings", change "CUDA - System fallback policy" to "Prefer no sysmem fallback". That way you'll just get out-of-memory errors instead of waiting 5 hours.
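To see the fallback happening (rather than just blocking it), you can watch reported GPU memory while a run is in progress; a small sketch using nvidia-smi's query mode (polling count and interval are arbitrary):

```python
import subprocess
import time

# Poll nvidia-smi while inference.py runs in another terminal. If memory.used
# pins at the card's capacity and speed collapses, the driver is spilling into
# shared system memory (the sysmem fallback discussed above).
for _ in range(12):
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    print(result.stdout.strip())
    time.sleep(5)
```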
This issue is indeed caused by spilling into system RAM (or by out-of-memory errors). I am actively working on minimizing the VRAM requirements. I anticipate that the 72-frame model will perform efficiently on a 4090.
After syncing with the latest PR #32 from last night, I also got a long wait time and large VRAM usage on a 4090, running python inference.py --inference_config configs/test.yaml
@Minamiyama This seems strange. One difference is that I tested on Ubuntu, but I don't have a Windows machine with a 4090 GPU for testing. Could you try the following setting?
To prevent this from happening, go to the NVIDIA Control Panel; under "Manage 3D Settings", change "CUDA - System fallback policy" to "Prefer no sysmem fallback". That way you'll just get out-of-memory errors instead of waiting 5 hours.
Also, is there anyone else on Windows or Ubuntu who can (or cannot) run the 72-frame model with 16 GB of VRAM?
Yes, as you mentioned, I then moved it into Docker and it ran totally fine, taking just 17 minutes 😄. Thanks very much!
Yes, I can run 72 frames successfully with the new model, but it gives the same results as the old model and has not improved.
| 40/1325 [04:06<2:12:39, 6.19s/it]
It will take 2h16m on my 3060. That is too long.
@zyayoung Version 1.1 gives worse results than version 1.
@akk-123 You need to set num_frames to 72 for the 1.1 model. The only difference between 1 and 1.1 is the number of frames per segment used during training. If you find that version 1 performs better for your needs, you can use it instead.
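For anyone unsure where to change it, here is a minimal sketch that writes a config variant with num_frames set to 72 for the 1.1 checkpoint; the exact key names inside configs/test.yaml are assumptions, so adjust them to the real layout:

```python
import yaml

# Load the stock config, switch to the 1.1 checkpoint and 72-frame segments,
# and write a variant config. Key names are assumed, not verified.
with open("configs/test.yaml") as f:
    cfg = yaml.safe_load(f)

cfg["ckpt_path"] = "models/MimicMotion_1-1.pth"   # assumed key
cfg["num_frames"] = 72                            # assumed key

with open("configs/test_72frames.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
```

Then run python inference.py --inference_config configs/test_72frames.yaml as in the earlier comment.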
I am inferring a video consisting of 72 frames with default parameters, but the processing time has increased significantly!
It takes 3h17m for 72 frames (RTX 3090 24GB).