Closed darongliu closed 5 years ago
@darongliu Sorry about the delay, I was on my Chinese New Year vacation. I followed the OpenAI TensorFlow implementation. I think it's a trick for stable training.
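To make the trick concrete, here is a minimal sketch (not the repo's exact code; class and argument names are illustrative) of how LinearZeros/Conv2dZeros combine zero initialization with the learned log-scale term, following the pattern in the OpenAI Glow code, where `logscale_factor` is typically 3.

```python
import torch
import torch.nn as nn

class Conv2dZerosSketch(nn.Module):
    """Illustrative sketch of a zero-initialized conv with the logscale trick."""
    def __init__(self, in_channels, out_channels, logscale_factor=3.0):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        # Zero init: the layer outputs exactly 0 at the start of training.
        self.conv.weight.data.zero_()
        self.conv.bias.data.zero_()
        # Learned per-channel log-scale, also initialized to zero.
        self.logs = nn.Parameter(torch.zeros(out_channels, 1, 1))
        self.logscale_factor = logscale_factor

    def forward(self, x):
        # At init, exp(0 * factor) = 1, so the zero output is preserved.
        # Multiplying logs by logscale_factor amplifies its gradient, so the
        # effective scale exp(logs * factor) can move away from 1 quickly,
        # which is the "stable training" trick inherited from the TF code.
        return self.conv(x) * torch.exp(self.logs * self.logscale_factor)
```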
Hi,
The code is nice. I have question about the Conv2dZeros implementation as well.
Initialization:
self.weight.data.zero_()
self.bias.data.zero_()
make the output of Conv2dZeros always zero. Why is a zero output wanted, if training would not seem to change the zero initialization?
Thanks.
@FirstHandScientist The weight and bias are updated normally during training. Conv2dZeros is just a regular Conv2d layer with zero initialization, as described in Section 3.3 of the paper:
We initialize the last convolution of each NN() with zeros, such that each affine coupling layer initially performs an identity function; we found that this helps training very deep networks.
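A small toy check (assumed shapes, not the repo's code) shows both points: a zero-initialized conv outputs zero, so the affine coupling layer starts as the identity, yet its weight gradients are nonzero because the gradient depends on the input, not on the current weight values.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(4, 8, kernel_size=3, padding=1)
conv.weight.data.zero_()
conv.bias.data.zero_()

x = torch.randn(2, 4, 16, 16)
out = conv(x)
print(out.abs().max())                   # tensor(0.)   -> identity coupling at init

out.sum().backward()
print(conv.weight.grad.abs().max() > 0)  # tensor(True) -> weights still get updated
```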
Thanks... I got it.
Hi, thanks for the nice PyTorch implementation. In the LinearZeros and Conv2dZeros modules, an additional scale term
torch.exp(self.logs*self.logscale_factor)
is multiplied in. I found the same thing in the OpenAI TensorFlow implementation, but I am not sure why. Thank you.