[Feature request] DiVAE

lucidrains / denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

MIT License

[Feature request] DiVAE #151

Open jordiae opened 1 year ago

jordiae commented 1 year ago

DiVAE [1] uses a VQ encoder and a diffusion decoder. Unfortunately, there's no public implementation. It would also be nice to combine that with diffusion Transformers [2].

Anyway, many thanks for all your work!

[1] https://arxiv.org/abs/2206.00386
[2] https://arxiv.org/abs/2212.09748

lucidrains commented 1 year ago

@jordiae i think SOTA for diffusion transformers would be Muse

i'll take a look at DiVAE this weekend, thanks!

jordiae commented 1 year ago

The main difference is that in DiVAE the decoder of the image "tokenizer" is a diffusion model. Thanks!

Edit: This should be better than VQGAN (see Table 1 in https://arxiv.org/pdf/2206.00386.pdf)
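
For anyone skimming the thread, here is a rough sketch of the data flow described above: a VQ encoder turns the image into discrete tokens, and the decoder is a denoiser conditioned on the quantized codes. It is written in plain Python so it runs without dependencies; a real version would use PyTorch tensors and learned networks. Everything here is an assumption for illustration and not taken from the paper: the names `quantize`, `alpha_bars`, `q_sample`, `training_example`, and `toy_encode` are hypothetical, the linear beta schedule and the epsilon-prediction target are just the standard DDPM choices, and the toy encoder stands in for a learned one.

```python
import math
import random


def quantize(z, codebook):
    """Return (index, code) of the codebook vector nearest to z (squared L2)."""
    dists = [sum((a - b) ** 2 for a, b in zip(z, c)) for c in codebook]
    idx = min(range(len(codebook)), key=dists.__getitem__)
    return idx, codebook[idx]


def alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    out, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0, t, abar, rng):
    """Forward diffusion: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a, s = math.sqrt(abar[t]), math.sqrt(1.0 - abar[t])
    return [a * x + s * e for x, e in zip(x0, eps)], eps


def training_example(x0, encode, codebook, abar, rng):
    """Build one training pair for a denoiser conditioned on the VQ code."""
    token, code = quantize(encode(x0), codebook)
    t = rng.randrange(len(abar))
    x_t, eps = q_sample(x0, t, abar, rng)
    # A diffusion decoder would be trained so that decoder(x_t, t, code)
    # approximates eps (assumed objective, standard DDPM noise prediction).
    return {"x_t": x_t, "t": t, "cond": code, "token": token, "target_eps": eps}


def toy_encode(x):
    """Hypothetical stand-in for a learned encoder: average each half of x."""
    half = len(x) // 2
    return [sum(x[:half]) / half, sum(x[half:]) / (len(x) - half)]


rng = random.Random(0)
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
abar = alpha_bars(1000)
example = training_example([0.9, 1.1, 0.8, 1.2], toy_encode, codebook, abar, rng)
print(example["token"], example["t"])
```

At sampling time the same conditioning would be used in reverse: start from noise and denoise step by step while feeding in the codes looked up from the tokens, which is what replaces the usual feed-forward VQGAN-style decoder.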

lucidrains commented 1 year ago

oh my, it is like a frankenstein haha