CompVis / taming-transformers

Taming Transformers for High-Resolution Image Synthesis
https://arxiv.org/abs/2012.09841
MIT License

Confusion about the dimension of latent vectors #249


LiJiahao-Alex commented 6 months ago

Hi.

The Decoder class contains this output:

`print("Working with z of shape {} = {} dimensions.".format(self.z_shape, np.prod(self.z_shape)))`

Here the letter `z` seems, by convention, to denote the latent vector. But in the Encoder class, `h` is the name used for the latent representation. What I don't understand is: why is `z` the bottleneck variable in the Decoder, while `h` serves as the latent vector in the Encoder?
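To make the naming concrete, here is a minimal, self-contained sketch of the data flow I mean. This is not the repo's actual code: the tiny conv layers stand in for the real Encoder, VectorQuantizer, and Decoder modules, and `quant_conv` / `post_quant_conv` follow the repo's naming conventions only for illustration.

```python
import torch
import torch.nn as nn

class TinyVQAutoencoder(nn.Module):
    """Simplified VQGAN-style pipeline, illustrating the `h` vs. `z` naming.

    `h` is the encoder's continuous feature map; `z` is the (quantized)
    latent that the decoder receives at its bottleneck.
    """

    def __init__(self, in_ch=3, z_channels=256, n_embed=1024):
        super().__init__()
        # stand-in for the real Encoder (downsamples by 2)
        self.encoder = nn.Conv2d(in_ch, z_channels, 3, stride=2, padding=1)
        self.quant_conv = nn.Conv2d(z_channels, z_channels, 1)
        # stand-in for the VectorQuantizer's codebook
        self.codebook = nn.Embedding(n_embed, z_channels)
        self.post_quant_conv = nn.Conv2d(z_channels, z_channels, 1)
        # stand-in for the real Decoder (upsamples by 2)
        self.decoder = nn.ConvTranspose2d(z_channels, in_ch, 4, stride=2, padding=1)

    def encode(self, x):
        h = self.encoder(x)        # `h`: continuous hidden features
        h = self.quant_conv(h)
        # simplified vector quantization: nearest codebook entry per position
        b, c, hh, ww = h.shape
        flat = h.permute(0, 2, 3, 1).reshape(-1, c)
        dists = torch.cdist(flat, self.codebook.weight)
        idx = dists.argmin(dim=1)
        z = self.codebook(idx).reshape(b, hh, ww, c).permute(0, 3, 1, 2)
        return z                   # `z`: the quantized bottleneck latent

    def decode(self, z):
        z = self.post_quant_conv(z)
        return self.decoder(z)     # the Decoder consumes `z`, hence z_shape

x = torch.randn(1, 3, 32, 32)
model = TinyVQAutoencoder()
z = model.encode(x)
print("z shape:", z.shape)         # the shape the Decoder's printout refers to
x_rec = model.decode(z)
```

In this sketch, `h` only ever exists inside `encode()`, while `z` is what crosses the bottleneck into `decode()`, which is exactly the distinction I am asking about.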

Looking forward to your response.

Best