GlassyWing / nvae

An unofficial toy implementation of NVAE, "A Deep Hierarchical Variational Autoencoder"
Apache License 2.0

What is 'grid' in the decoder? #4

Closed uestcwangxiao closed 3 years ago

uestcwangxiao commented 3 years ago

I see that in decoder.py you concatenate the grid with the latent representation z_rep in each layer of the decoder network. I don't quite understand what its function is, or how the grid is generated. Is it a positional encoding? Would it have any effect if it were removed?

island99 commented 3 years ago

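Although I don't understand how or why the author generates the grid, I get the feeling that the grid corresponds to the trainable parameter h in the paper?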

uestcwangxiao commented 3 years ago

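Although I don't understand how or why the author generates the grid, I get the feeling that the grid corresponds to the trainable parameter h in the paper?

But it seems this grid is added in every layer, whereas in the paper h only appears in the first layer of the decoder.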

GlassyWing commented 3 years ago

That's right, it is just a positional embedding. Because z was a single vector, it had to be expanded to the same size as the feature map. I have now removed the average pooling layer, so the shape of z changes to (b, z_dim, map_h, map_w) and there is no longer any need to add a positional embedding. Pull the latest code to see it.
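
For readers following the thread, here is a minimal sketch of the idea described above: a per-sample latent vector is tiled across the feature map and concatenated with a coordinate grid that acts as a simple positional embedding. It is written in NumPy rather than the repository's framework so it runs on its own, and it is not the code from decoder.py. The helper names `make_grid` and `expand_z_with_grid`, the normalized [-1, 1] coordinate scheme, and the example sizes are all assumptions made for illustration.

```python
import numpy as np


def make_grid(batch, height, width):
    """Hypothetical helper: normalized (y, x) coordinates in [-1, 1].

    Returns an array of shape (batch, 2, height, width). This is one common
    way to build a coordinate grid; the repository may have done it differently.
    """
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")  # each (height, width)
    grid = np.stack([yy, xx], axis=0)            # (2, height, width)
    return np.broadcast_to(grid, (batch, 2, height, width)).copy()


def expand_z_with_grid(z, height, width):
    """Tile a latent vector over the feature map and append the grid.

    z has shape (batch, z_dim); the result has shape
    (batch, z_dim + 2, height, width).
    """
    batch, z_dim = z.shape
    # Every spatial location receives the same copy of z ...
    z_rep = np.broadcast_to(z[:, :, None, None], (batch, z_dim, height, width))
    # ... so the grid channels are the only thing that tells locations apart.
    grid = make_grid(batch, height, width)
    return np.concatenate([z_rep, grid], axis=1)


rng = np.random.default_rng(0)

# Case 1: z is one vector per sample, so it is tiled and given a grid.
z_vec = rng.standard_normal((4, 8))
out = expand_z_with_grid(z_vec, 16, 16)
print(out.shape)  # (4, 10, 16, 16)

# Case 2: z is already a spatial map of shape (b, z_dim, map_h, map_w),
# so it varies per location and can be used without a grid.
z_map = rng.standard_normal((4, 8, 16, 16))
print(z_map.shape)  # (4, 8, 16, 16)
```

In the first case the tiled copy of z is identical at every location, which is why some positional signal is useful. In the second case the latent already differs from location to location, which matches the reasoning given above for dropping the grid.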