Dumb Question about DM vs LDM

lucidrains / denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

MIT License


Dumb Question about DM vs LDM #205

Open fujistoo opened 1 year ago

fujistoo commented 1 year ago

Do they differ only in the use of a VAE to encode the inputs into an embedding (plus the conditional input part)? So if I wanted to do this in latent space, I'd wrap this whole thing within the VAE?
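To make the "wrap it in a VAE" idea concrete, here is a minimal sketch in plain Python. It is not this repo's API: `encode` and `decode` are hypothetical toy stand-ins for a pretrained, frozen VAE, and the linear beta schedule values are assumed for illustration. The point is only that the forward noising step is the same as in pixel-space diffusion, applied to `z0 = encode(x)` instead of `x`.

```python
import math
import random

# Linear beta schedule (values assumed for illustration) and cumulative
# product alpha_bar_t = prod_{s <= t} (1 - beta_s).
T = 1000
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
alpha_bar = []
prod = 1.0
for b in betas:
    prod *= 1.0 - b
    alpha_bar.append(prod)

def encode(x):
    """Hypothetical frozen VAE encoder (toy: average pairs, 2x downsample)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]

def decode(z):
    """Hypothetical frozen VAE decoder (toy: repeat each value, 2x upsample)."""
    return [v for v in z for _ in range(2)]

def q_sample(z0, t, noise):
    """Forward process: z_t = sqrt(abar_t) * z0 + sqrt(1 - abar_t) * noise."""
    a = math.sqrt(alpha_bar[t])
    s = math.sqrt(1.0 - alpha_bar[t])
    return [a * z + s * n for z, n in zip(z0, noise)]

def latent_training_pair(x, t, rng):
    """Return (noisy latent, target noise) for one denoiser training step."""
    z0 = encode(x)  # the only step that differs from pixel-space diffusion
    noise = [rng.gauss(0.0, 1.0) for _ in z0]
    return q_sample(z0, t, noise), noise

rng = random.Random(0)
x = [0.1, 0.3, -0.2, 0.4, 0.0, 0.5, -0.1, 0.2]  # stand-in for an image
z_t, target = latent_training_pair(x, t=500, rng=rng)
x_hat = decode(encode(x))  # sampled latents would be decoded the same way
```

In a real setup the denoiser would be trained to predict `target` from `z_t`, and sampled latents would go through the VAE decoder to get back to image space.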

lucidrains commented 1 year ago

@sztoo yea i'm trying to figure that out too, but i don't see why it should not work out of the box for VAE latent embedding space, as it is zero mean and unit variance

i'm currently looking at NaturalSpeech 2 and i'm confused why diffusion is done post-quantization there, as opposed to the Rombach et al. latent diffusion paper, where they do it before quantization.
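On the zero-mean, unit-variance point above: a trained VAE's latents are only pulled toward that prior, so it may be worth measuring them before training the diffusion model. The snippet below is a rough sketch under that assumption. The helper names (`latent_stats`, `standardize`, `unstandardize`) are made up for illustration, and the fake latents only simulate an encoder whose outputs are off-scale.

```python
import random
import statistics

def latent_stats(latents):
    """Global mean and (population) std over all latent values."""
    flat = [v for z in latents for v in z]
    return statistics.fmean(flat), statistics.pstdev(flat)

def standardize(latents, mean, std):
    """Shift and scale latents so the diffusion model sees ~N(0, 1) inputs."""
    return [[(v - mean) / std for v in z] for z in latents]

def unstandardize(latents, mean, std):
    """Undo the scaling before handing sampled latents to the decoder."""
    return [[v * std + mean for v in z] for z in latents]

# Fake latents with mean 3 and std 5, standing in for encoder outputs.
rng = random.Random(0)
latents = [[rng.gauss(3.0, 5.0) for _ in range(4)] for _ in range(200)]

mean, std = latent_stats(latents)
scaled = standardize(latents, mean, std)
restored = unstandardize(scaled, mean, std)
```

If the measured statistics are already close to 0 and 1, the rescaling is a no-op and diffusion on the raw latents should behave as described.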

jS5t3r commented 1 year ago

It is briefly explained here: https://theaisummer.com/diffusion-models/