Closed darongliu closed 5 years ago
@darongliu Sorry about the delay, I was on my Chinese New Year vacation. I followed the OpenAI TensorFlow implementation. I think it's a trick for stable training.
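To make the trick concrete, here is a minimal sketch (not the repo's exact code; class and argument names are illustrative) of how LinearZeros/Conv2dZeros combine zero initialization with the learned log-scale term, following the pattern in the OpenAI Glow code, where `logscale_factor` is typically 3.

```python
import torch
import torch.nn as nn

class Conv2dZerosSketch(nn.Module):
    """Illustrative sketch of a zero-initialized conv with the logscale trick."""
    def __init__(self, in_channels, out_channels, logscale_factor=3.0):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        # Zero init: the layer outputs exactly 0 at the start of training.
        self.conv.weight.data.zero_()
        self.conv.bias.data.zero_()
        # Learned per-channel log-scale, also initialized to zero.
        self.logs = nn.Parameter(torch.zeros(out_channels, 1, 1))
        self.logscale_factor = logscale_factor

    def forward(self, x):
        # At init, exp(0 * factor) = 1, so the zero output is preserved.
        # Multiplying logs by logscale_factor amplifies its gradient, so the
        # effective scale exp(logs * factor) can move away from 1 quickly,
        # which is the "stable training" trick inherited from the TF code.
        return self.conv(x) * torch.exp(self.logs * self.logscale_factor)
```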
Hi,
The code is nice. I have question about the Conv2dZeros implementation as well.
Initialization:
self.weight.data.zero_()
self.bias.data.zero_()
make the output of Conv2dZeros always zero. Why is a zero output wanted, if training would not seem to change the zero initialization?
Thanks.
@FirstHandScientist The weight and bias are updated normally during training. Conv2dZeros is just a regular Conv2d layer with zero initialization, as described in Section 3.3 of the paper:
We initialize the last convolution of each NN() with zeros, such that each affine coupling layer initially performs an identity function; we found that this helps training very deep networks.
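A small toy check (assumed shapes, not the repo's code) shows both points: a zero-initialized conv outputs zero, so the affine coupling layer starts as the identity, yet its weight gradients are nonzero because the gradient depends on the input, not on the current weight values.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(4, 8, kernel_size=3, padding=1)
conv.weight.data.zero_()
conv.bias.data.zero_()

x = torch.randn(2, 4, 16, 16)
out = conv(x)
print(out.abs().max())                   # tensor(0.)   -> identity coupling at init

out.sum().backward()
print(conv.weight.grad.abs().max() > 0)  # tensor(True) -> weights still get updated
```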
Thanks... I got it.
Hi, thanks for the nice PyTorch implementation. In the LinearZeros and Conv2dZeros modules, an additional scale term
torch.exp(self.logs*self.logscale_factor)
is multiplied in. I found the same thing in the OpenAI TensorFlow implementation, but I am not sure why. Thank you.