AILab-CVC / CV-VAE

[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
https://ailab-cvc.github.io/cvvae/index.html

No scaling factor? #4

Open Tord-Zhang opened 5 months ago

Tord-Zhang commented 5 months ago

When using the encoder, is there no need to apply a scaling factor, as is done with the SD 1.5 VAE? The current usage is:

latent = vae3d.encode(video).latent_dist.sample()

In SD 1.5, it would be:

latent = vae3d.encode(video).latent_dist.sample().mul_(scaling_factor)

sijeh commented 5 months ago

The scale factor of CV-VAE is the same as that of SD2.1: both are 0.18215. The scale factor applies only at the UNet boundary: latents are multiplied by it before being fed to the UNet, and divided by it before the UNet output is decoded. It is not needed when you are only encoding and decoding images and videos.
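
For anyone wiring CV-VAE into a latent diffusion pipeline, here is a minimal sketch of where the factor would go. The helper names `scale_for_unet` and `unscale_for_decoder` are hypothetical and not part of the CV-VAE repo. The multiply-on-encode, divide-on-decode convention is assumed from the SD 1.5 snippet in the question. The helpers only use `*` and `/`, so they should work on torch tensors as well as plain floats.

```python
# Value stated in this thread for both CV-VAE and SD2.1.
SCALING_FACTOR = 0.18215


def scale_for_unet(latent, scaling_factor=SCALING_FACTOR):
    """Scale a freshly encoded latent before it is passed to the UNet."""
    return latent * scaling_factor


def unscale_for_decoder(latent, scaling_factor=SCALING_FACTOR):
    """Undo the scaling on a UNet-side latent before it is passed to the VAE decoder."""
    return latent / scaling_factor


# Hypothetical usage with the objects from the question (not executed here):
#   latent = vae3d.encode(video).latent_dist.sample()
#   unet_input = scale_for_unet(latent)
#   ... run the diffusion model to obtain `denoised` ...
#   decoder_input = unscale_for_decoder(denoised)
#
# For a plain encode -> decode round trip, skip both helpers and pass the
# sampled latent straight to the decoder.
```

The two helpers are inverses of each other, so applying both in a pure reconstruction path leaves the latent unchanged, which is why the factor can be omitted there.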