magic-research / magic-animate

[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
https://showlab.github.io/magicanimate/
BSD 3-Clause "New" or "Revised" License

Inference Time way higher with custom DensePose videos #62

Open NolenBrolen opened 10 months ago

NolenBrolen commented 10 months ago

I created DensePose videos with Detectron2 and vid2Densepose. The videos I made are 512x512 and about 4 seconds long, just like the DensePose videos provided in the repo. However, the inference times are about 5x longer. Could anyone clarify why this might be the case?

chen-rn commented 10 months ago

Are you duplicating the huggingface space or running it locally?

NolenBrolen commented 10 months ago

I'm running it locally. It turns out I was accidentally using a 1024x1024 video after all. However, I noticed that there is also a huge difference in inference time between a 4 second and a 5 second video. I managed to render a 4 second video in 5 minutes, but a 5 second video took 15 minutes.

I have an NVIDIA GeForce RTX 3060 (12 GB VRAM).
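
To put the resolution mix-up in numbers, here is a minimal sketch that compares the pixel count per frame of the two sizes mentioned above. The helper name `pixel_ratio` is made up for illustration and is not part of MagicAnimate. How inference time actually scales with pixel count depends on the model and is not measured here.

```python
def pixel_ratio(width_a: int, height_a: int, width_b: int, height_b: int) -> float:
    """Return how many times more pixels frame A has than frame B.

    Hypothetical helper for illustration only; not a MagicAnimate function.
    """
    return (width_a * height_a) / (width_b * height_b)


# 1024x1024 (the accidental input) versus 512x512 (the repo's demo size).
ratio = pixel_ratio(1024, 1024, 512, 512)
print(f"A 1024x1024 frame has {ratio:g}x the pixels of a 512x512 frame")
```

Doubling both sides quadruples the pixels per frame, so the larger input is a plausible contributor to the slowdown first reported.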

maocaixia commented 10 months ago

Different FPS? @NolenBrolen

NolenBrolen commented 10 months ago

Actually, good point. I checked, and the 4 second and 5 second videos I made have the same FPS (both are 29.97 FPS). But I now noticed that the demo videos they provide are 25 FPS. I doubt this roughly 5 FPS difference would account for such a big difference in rendering time, though, since the rendering times for my 4 second video and the 4 second demo videos are about the same.
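
For reference, the frame counts implied by the durations and frame rates quoted above can be worked out as follows. This is a rough sketch: `frame_count` is a hypothetical helper, the durations are the approximate ones from this thread, and real files may differ by a frame or two.

```python
def frame_count(duration_s: float, fps: float) -> int:
    """Approximate number of frames in a clip (hypothetical helper)."""
    return round(duration_s * fps)


# Durations and frame rates as reported in this thread.
clips = {
    "custom, 4 s @ 29.97 FPS": frame_count(4, 29.97),
    "custom, 5 s @ 29.97 FPS": frame_count(5, 29.97),
    "demo,   4 s @ 25 FPS": frame_count(4, 25),
}

for label, frames in clips.items():
    print(f"{label}: ~{frames} frames")

# Relative growth in frames from the 4 s to the 5 s custom clip.
growth = clips["custom, 5 s @ 29.97 FPS"] / clips["custom, 4 s @ 29.97 FPS"]
print(f"5 s clip has {growth:.2f}x the frames of the 4 s clip")
```

The 5 second clip has only about 1.25x as many frames as the 4 second one, while the reported render time went from 5 to 15 minutes (3x), so frame count alone does not explain the jump.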

FurkanGozukara commented 10 months ago

Yes, as the duration of the video increases, inference time increases dramatically, and VRAM usage also increases.