Thanks for the great question. We don't have a conclusive answer as to which is better. A few comments:
We use the AE for dimension reduction: it compresses the complicated data structures in the high-dimensional non-Euclidean space into a low-dimensional Euclidean latent space.
We believe the generative model can learn more easily in this "simpler" low-dimensional latent space.
Under our motivation, a VAE does not serve this dimension-reduction purpose.
A VAE might still work from another perspective: conceptually, composing a VAE with a diffusion model can be viewed as a deep hierarchical VAE, which still captures the distribution in the original space, only with a larger generative model.
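To make the hierarchical-VAE view a bit more concrete, here is a rough sketch. The notation is only illustrative and is an assumption, not taken from the paper: x is the data, z_0 is the latent produced by an encoder q_\phi(z_0 | x), z_1, ..., z_T are the noised latents of a diffusion process, p_\psi(x | z_0) is the decoder, and p_\theta is the learned reverse diffusion.

```latex
% Generative model: diffusion prior over z_0, followed by the decoder
p(x, z_{0:T}) = p_\psi(x \mid z_0)\, p(z_T) \prod_{t=1}^{T} p_\theta(z_{t-1} \mid z_t)

% Inference model: encoder, followed by the fixed forward diffusion
q(z_{0:T} \mid x) = q_\phi(z_0 \mid x) \prod_{t=1}^{T} q(z_t \mid z_{t-1})

% Resulting evidence lower bound on the data log-likelihood
\log p(x) \ge \mathbb{E}_{q(z_{0:T} \mid x)}\Big[
    \log p_\psi(x \mid z_0) + \log p(z_T)
    + \sum_{t=1}^{T} \log p_\theta(z_{t-1} \mid z_t)
    - \log q_\phi(z_0 \mid x)
    - \sum_{t=1}^{T} \log q(z_t \mid z_{t-1})
\Big]
```

Read this way, the combined model is a VAE with T+1 layers of latent variables, so it still models the distribution of x in the original space. With a deterministic AE, by contrast, the diffusion model is simply fit to the encoded latents.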
Hi, Yuning,
Thanks for your excellent paper.
Have you tried using a VAE instead of the AE? Could a VAE replace GSSL and generate better molecules?
Best,