Closed Darius-H closed 2 weeks ago
Hello! Were you able to understand the author's motivation for using a Transformer-based autoencoder?
Thanks in advance!
I tested this autoencoder and it performs very poorly: the reconstruction quality is very low.
Q1: Latent diffusion uses a VAE, so why did you change the structure to an autoencoder? Is it because the VAE performed poorly?
Q2: Why design a bottleneck structure here? https://github.com/sihyun-yu/PVDM/blob/17699659148423469c0d1ccdca5e466933b943e1/models/autoencoder/autoencoder_vit.py#L180C1-L190C34