Question about the training code for content diffusion model

NVlabs / CMD

Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition (ICLR 2024)

https://sihyun.me/CMD

Question about the training code for content diffusion model #2

Closed Dorniwang closed 1 month ago

Dorniwang commented 2 months ago

Hi Sihyun,

This is amazing work!

I have a question about the training code for your content diffusion model. It seems that no pretrained image diffusion model is used during training, and that the code trains the content diffusion model from scratch. Is that right?

sihyun-yu commented 1 month ago

Hi, the current codebase is built on a pixel-level autoencoder and diffusion models trained on UCF-101, for which we do not use any pretrained models. For the text-to-video (t2v) experiments, however, we fine-tuned Stable Diffusion as the content diffusion model, and you can definitely follow that strategy as well.