AILab-CVC / CV-VAE

[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
https://ailab-cvc.github.io/cvvae/index.html

CV-VAE: A Compatible Video VAE for Latent Generative Video Models

[Sijie Zhao](https://scholar.google.com/citations?user=tZ3dS3MAAAAJ) · [Yong Zhang*](https://yzhang2016.github.io/) · [Xiaodong Cun](https://vinthony.github.io/academic/) · [Shaoshu Yang]() · [Muyao Niu]() · [Xiaoyu Li](https://xiaoyu258.github.io/) · [Wenbo Hu](https://wbhu.github.io/) · [Ying Shan](https://scholar.google.com/citations?user=4oXBp9UAAAAJ&hl=en)

*Corresponding Authors

TL;DR: A video VAE for latent generative video models that is compatible with pretrained image and video models, e.g., SD 2.1 and SVD.

News

Usage

Dependencies

Video reconstruction

Download the model weights from Hugging Face, then run the reconstruction script:

python3 cvvae_inference_video.py \
  --vae_path MODEL_PATH \
  --video_path INPUT_VIDEO_PATH \
  --save_path VIDEO_SAVE_PATH \
  --height HEIGHT \
  --width WIDTH 
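To reconstruct several videos in a row, it can be convenient to build the same command from Python. The following is a minimal sketch, assuming only the five flags shown above; the helper name `build_cvvae_command`, the file names, and the 576x1024 resolution are illustrative placeholders, not part of the repository.

```python
import shlex


def build_cvvae_command(vae_path, video_path, save_path, height, width,
                        script="cvvae_inference_video.py"):
    """Return the argument list for the reconstruction script.

    Hypothetical helper: it only mirrors the flags shown in the
    command above and does not call into the repository's code.
    """
    return [
        "python3", script,
        "--vae_path", str(vae_path),
        "--video_path", str(video_path),
        "--save_path", str(save_path),
        "--height", str(height),
        "--width", str(width),
    ]


if __name__ == "__main__":
    # Placeholder paths and sizes: replace with real values before running.
    cmd = build_cvvae_command("MODEL_PATH", "input.mp4", "recon.mp4", 576, 1024)
    # Print the shell-quoted command so it can be inspected first.
    print(shlex.join(cmd))
    # To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems when paths contain spaces.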

😉 Citation

@article{zhao2024cvvae,
  title={CV-VAE: A Compatible Video VAE for Latent Generative Video Models},
  author={Zhao, Sijie and Zhang, Yong and Cun, Xiaodong and Yang, Shaoshu and Niu, Muyao and Li, Xiaoyu and Hu, Wenbo and Shan, Ying},
  journal={arXiv preprint arXiv:2405.20279},
  year={2024}
}