lucidrains / DALLE-pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
MIT License

About the version of VQ-VAE #44

Open Eddie-Hwang opened 3 years ago

Eddie-Hwang commented 3 years ago

Hello, I'm wondering which VQ-VAE model you are using. Is it VQ-VAE-1 or VQ-VAE-2? Thanks in advance.

lucidrains commented 3 years ago

@Eddie-Hwang So OpenAI actually did not use VQ-VAE (although I'll probably bring that into the repo eventually, given the results in https://compvis.github.io/taming-transformers/).

They used a discrete VAE with relaxation, which is hopefully what I have :)
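
In case it helps to see the idea concretely: below is a minimal sketch of a relaxed discrete codebook lookup using PyTorch's built-in Gumbel-softmax (the same relaxation as the paper linked further down in this thread). The class and argument names are illustrative, not this repo's actual `DiscreteVAE`.

```python
import torch
import torch.nn.functional as F
from torch import nn

class RelaxedCodebook(nn.Module):
    """Illustrative Gumbel-softmax relaxation of a discrete codebook lookup."""

    def __init__(self, num_tokens=8192, codebook_dim=512):
        super().__init__()
        self.codebook = nn.Embedding(num_tokens, codebook_dim)

    def forward(self, logits, temperature=1.0):
        # logits: (batch, num_tokens, height, width) from an image encoder
        # soft one-hot sample over the codebook, differentiable w.r.t. logits
        soft_one_hot = F.gumbel_softmax(logits, tau=temperature, hard=False, dim=1)
        # weighted sum of codebook vectors replaces the non-differentiable argmax lookup
        return torch.einsum('b n h w, n d -> b d h w', soft_one_hot, self.codebook.weight)
```

As the temperature is annealed toward zero the soft one-hot approaches a hard selection, so at inference time the argmax code indices can be used directly.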

lucidrains commented 3 years ago

I'll build in integration with VQ-VAE (https://github.com/lucidrains/vector-quantize-pytorch) by the end of the week.

lucidrains commented 3 years ago

@Eddie-Hwang I added VQ-VAE support in 0.1.8 if you are interested in trying it!
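
For contrast with the relaxed codebook above, the VQ-VAE path quantizes with a hard nearest-neighbour lookup plus a straight-through gradient rather than a softmax relaxation. Below is a minimal, self-contained sketch of that idea in plain PyTorch; the actual integration goes through the vector-quantize-pytorch package linked above, and the class and parameter names here are illustrative only.

```python
import torch
from torch import nn

class SimpleVectorQuantizer(nn.Module):
    """Illustrative VQ-VAE-style quantizer with a straight-through estimator."""

    def __init__(self, num_tokens=8192, codebook_dim=512, commitment_weight=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_tokens, codebook_dim)
        self.commitment_weight = commitment_weight

    def forward(self, z):
        # z: (batch, seq_len, codebook_dim) continuous encoder outputs
        codes = self.codebook.weight
        # squared Euclidean distance from every encoder vector to every code
        dist = (z.pow(2).sum(-1, keepdim=True)
                - 2 * z @ codes.t()
                + codes.pow(2).sum(-1))
        indices = dist.argmin(dim=-1)            # nearest-code index per position
        quantized = self.codebook(indices)       # hard codebook lookup
        # commitment loss pulls encoder outputs toward their chosen codes
        commit_loss = self.commitment_weight * (z - quantized.detach()).pow(2).mean()
        # straight-through estimator: copy gradients through the quantization step
        quantized = z + (quantized - z).detach()
        return quantized, indices, commit_loss
```

The commitment term and the straight-through trick are what let gradients reach the encoder despite the discrete lookup, which is the key difference from the Gumbel-softmax relaxation.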

fomalhautb commented 3 years ago

> @Eddie-Hwang So OpenAI actually did not use VQ-VAE (although I'll probably bring that into the repo eventually, given the results in https://compvis.github.io/taming-transformers/).
>
> They used a discrete VAE with relaxation, which is hopefully what I have :)

Is the "discrete VAE" you mentioned in one of these papers?

lucidrains commented 3 years ago

@FomalhautB I'm actually not sure where it originated, but I assume it started with this paper: https://arxiv.org/pdf/1611.01144.pdf (Categorical Reparameterization with Gumbel-Softmax).