AILab-CVC / CV-VAE

[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
https://ailab-cvc.github.io/cvvae/index.html

Compatibility #1

Open inferno46n2 opened 5 months ago

inferno46n2 commented 5 months ago

Hello,

So does this require a specially trained video model to run? Did your team also fine-tune an SVD model to use with this?

I'm confused about what type of video diffusion models accept these compressed latents.

yzhang2016 commented 5 months ago

The tuned SVD model will be released soon. The original SVD can also be used, but its performance is worse than the tuned one.

frankchieng commented 5 months ago

How can I generate more frames with SVD and CV-VAE?

sijeh commented 5 months ago

> Hello,
>
> So does this require a specially trained video model to run? Did your team also fine-tune an SVD model to use with this?
>
> I'm confused about what type of video diffusion models accept these compressed latents.

Any downstream model derived from SD1.5 or SD2.1 can be used with CV-VAE. This includes the various community image diffusion models as well as video models such as SVD, VideoCrafter, and AnimateDiff. In addition, compatibility can be further improved with a small amount of fine-tuning of the diffusion model.
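
As a rough illustration of what a compatible latent layout means (this is not code from the CV-VAE repository), the sketch below compares the latent tensor shape of an SD-style image VAE, which downsamples 8x spatially into 4 latent channels, with that of a video VAE that additionally compresses time. The helper names, the temporal factor of 4, and the ceiling-based frame count are assumptions chosen for the example and are not taken from this thread.

```python
def latent_shape(frames, height, width,
                 spatial_factor=8, temporal_factor=1, latent_channels=4):
    """Return (latent_frames, channels, latent_height, latent_width).

    Hypothetical helper: spatial_factor=8 and latent_channels=4 match the
    SD1.5/SD2.1 image VAE; temporal_factor is an assumed parameter.
    """
    if height % spatial_factor or width % spatial_factor:
        raise ValueError("height and width must be divisible by spatial_factor")
    # Ceiling division: an assumed convention for leftover frames.
    latent_frames = -(-frames // temporal_factor)
    return (latent_frames, latent_channels,
            height // spatial_factor, width // spatial_factor)


def same_per_frame_layout(a, b):
    """True if two latent shapes agree on channels and spatial size."""
    return a[1:] == b[1:]


# A per-frame image VAE versus a video VAE with assumed 4x temporal compression.
image_vae = latent_shape(16, 576, 1024)                     # (16, 4, 72, 128)
video_vae = latent_shape(16, 576, 1024, temporal_factor=4)  # (4, 4, 72, 128)
print(image_vae, video_vae, same_per_frame_layout(image_vae, video_vae))
```

Matching shapes are only a necessary condition. The latent distributions also have to agree, which is presumably why a small amount of fine-tuning still helps.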

sijeh commented 5 months ago

> The tuned SVD model will be released soon. The original SVD can also be used, but its performance is worse than the tuned one.

The inference code and model weights for SVD are coming soon.

radna0 commented 4 months ago

Is it possible to fine-tune Open-Sora or Open-Sora-Plan with CV-VAE? Has your team tried comparing this with SVD? @sijeh

sijeh commented 4 months ago

> Is it possible to fine-tune Open-Sora or Open-Sora-Plan with CV-VAE? Has your team tried comparing this with SVD? @sijeh

Open-Sora and Open-Sora-Plan were initialized from PixArt-alpha, which also uses the SD2.1 VAE. However, they later trained their own video VAEs, so their latent spaces are no longer compatible with SD2.1.