Closed betterze closed 9 months ago
The original latent diffusion model utilized VQ (Vector Quantization) regularization. In contrast, the updated version of Stable Diffusion employs KL (Kullback-Leibler) regularization. Both VQ and KL autoencoders differ mainly in their quantization methods, yet share the same decoding mechanism. Therefore, for the purposes of our methodology, they are functionally equivalent.
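To make the distinction concrete, here is a minimal, purely illustrative sketch (plain Python, not the actual LDM/SD implementation): KL-reg penalizes the encoder's posterior toward a standard normal, while VQ-reg snaps each latent to its nearest codebook entry. In both cases the decoder receives a latent of the same shape, which is why the two are interchangeable downstream. All function and variable names here are made up for illustration.

```python
import math

def kl_regularize(mean, logvar):
    """KL( N(mean, exp(logvar)) || N(0, 1) ) summed over latent dims (KL-reg)."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mean, logvar))

def vq_regularize(z, codebook):
    """Snap latent vector z to its nearest codebook entry (VQ-reg)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    z_q = min(codebook, key=lambda c: sqdist(z, c))
    commitment = sqdist(z, z_q)  # term the encoder is trained to shrink
    return z_q, commitment

# a 2-D latent with a tiny codebook, purely illustrative
z = [0.9, -0.2]
codebook = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
z_q, commit = vq_regularize(z, codebook)
# either regularizer hands the decoder a latent of the same dimensionality
assert len(z_q) == len(z)
```

The key point the sketch shows: only the bottleneck penalty differs; the decoder's input interface is identical, so a method built on one variant transfers to the other.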
We initially used the original latent diffusion model with VQ-reg as our baseline. However, Stable Diffusion performs better, so we ultimately adopted the KL-reg-based SD as our baseline.
Thanks a lot for your explanation. I really appreciate it.
Dear Buxiangzhiren,
Thank you for sharing this great repo, I really enjoy your work.
If I understand correctly, according to section 3.1 of the LDM paper, KL-reg and VQ-reg are two different ways to regularize the autoencoder, and in section 4 the authors use a VQ-reg autoencoder.
Your method improves the VQ-reg autoencoder, but in the readme you mention that you use the KL-regularized autoencoder. The KL and VQ autoencoders are not the same thing, right? In the diffusers codebase for SD, they also use the KL autoencoder. Could you clarify this?
Thank you for your help.
Best Wishes,
Zongze