naturomics / DLF

Code for reproducing results in "Generative Model with Dynamic Linear Flow"
https://arxiv.org/abs/1905.03239

about the prior distribution #6

Open yuffon opened 5 years ago

yuffon commented 5 years ago

In the top layer, the prior distribution uses h = conv(0) + embedding as the mean and std in the case of 'ycond=True'. It seems that the conv layer is unnecessary.

naturomics commented 5 years ago

In the top prior layer, the mean and logs are shared across the spatial dimensions in the non-conditioning case (ycond=False), i.e. (mean, logs) is a tensor of shape (1, 1, 1, 2n). In the implementation, we store (mean, logs) in the bias of that conv layer and let conv(0) broadcast the bias to shape (batch_size, height, width, 2n). Nothing more than that. So you could replace it with (mean, logs) = tf.get_variable([1, 1, 1, 2*n]) plus tf.tile() to get the right shape.

So, it's just a programming trick.
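A minimal TF1-style sketch of the two equivalent forms described above (function and variable names here are illustrative, not the repo's actual code):

```python
import tensorflow as tf  # TF 1.x

def top_prior(batch_size, height, width, n):
    """Spatially shared (mean, logs) for the top prior, two equivalent ways."""
    # Trick used in the repo: convolve an all-zeros input, so the kernel
    # contributes nothing and only the bias survives, already broadcast
    # to shape (batch_size, height, width, 2n).
    # (In the ycond=True case, a class embedding would be added to h.)
    zeros = tf.zeros([batch_size, height, width, 1])
    h = tf.layers.conv2d(zeros, filters=2 * n, kernel_size=1,
                         use_bias=True, name="conv_trick")
    mean, logs = tf.split(h, 2, axis=-1)

    # Explicit equivalent: a (1, 1, 1, 2n) variable tiled over the batch
    # and spatial dimensions.
    params = tf.get_variable("prior_params", [1, 1, 1, 2 * n],
                             initializer=tf.zeros_initializer())
    params = tf.tile(params, [batch_size, height, width, 1])
    mean2, logs2 = tf.split(params, 2, axis=-1)
    return (mean, logs), (mean2, logs2)
```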

yuffon commented 5 years ago

I see that the code uses

```python
rescale = tf.get_variable("rescale", [], initializer=tf.constant_initializer(1.))
scale_shift = tf.get_variable("scale_shift", [], initializer=tf.constant_initializer(0.))
logsd = tf.tanh(logsd) * rescale + scale_shift
```

when computing the log std. Is that necessary?

naturomics commented 5 years ago

It's for training stability; see the experiments section in our paper.
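For intuition (my reading of the trick, not a quote from the paper): tanh bounds the raw logsd to (-1, 1), and the learnable scalars rescale and scale_shift then map it into (scale_shift - rescale, scale_shift + rescale), so the log standard deviation stays in a learned, bounded range instead of drifting to extreme values early in training. A minimal sketch with an illustrative function name:

```python
import tensorflow as tf  # TF 1.x

def bounded_logsd(logsd_raw):
    # Learnable scalars; with these initial values the transform starts
    # out as plain tanh(logsd_raw).
    rescale = tf.get_variable("rescale", [],
                              initializer=tf.constant_initializer(1.))
    scale_shift = tf.get_variable("scale_shift", [],
                                  initializer=tf.constant_initializer(0.))
    # tanh keeps the output in (-1, 1); rescale/scale_shift map that to
    # (scale_shift - rescale, scale_shift + rescale), so an extreme raw
    # value cannot produce an extreme log std.
    return tf.tanh(logsd_raw) * rescale + scale_shift
```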