IDEALLab / bezier-gan

Bézier Generative Adversarial Networks
MIT License

In gan.py, I don't get why the reconstructed latent code (q_logstd) is needed. #4

Closed Sarvagya2009 closed 2 years ago

Sarvagya2009 commented 3 years ago
            q = tf.layers.dense(x, 128)
            q = tf.layers.batch_normalization(q, momentum=0.9)#, training=training)
            q = tf.nn.leaky_relu(q, alpha=0.2)
            q_mean = tf.layers.dense(q, self.latent_dim)    # mean of the predicted latent code
            q_logstd = tf.layers.dense(q, self.latent_dim)  # log standard deviation of the predicted latent code
            q_logstd = tf.maximum(q_logstd, -16)            # clamp so that std >= exp(-16)
            # Reshape to batch_size x 1 x latent_dim
            q_mean = tf.reshape(q_mean, (-1, 1, self.latent_dim))
            q_logstd = tf.reshape(q_logstd, (-1, 1, self.latent_dim))
            q = tf.concat([q_mean, q_logstd], axis=1, name='predicted_latent')

What is the logic behind using q_logstd? Specifically, on line 175, why is q_logstd = tf.maximum(q_logstd, -16) applied?

wchen459 commented 3 years ago

q_logstd is used for computing the Gaussian log likelihood loss (see here).
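For context, here is a minimal sketch, in the same TF1 style as the snippet above, of how an InfoGAN-style Gaussian log-likelihood term can be computed from q_mean and q_logstd. The name c (the latent code that was fed to the generator) and the 1e-8 stabilizer are assumptions for illustration, not necessarily the exact code behind the link.

    import tensorflow as tf  # TF 1.x, matching the snippet above

    def gaussian_nll(c, q_mean, q_logstd):
        # Negative log-likelihood of c under N(q_mean, exp(q_logstd)^2),
        # dropping the constant 0.5*log(2*pi):
        #   NLL = log(std) + 0.5 * ((c - mean) / std)^2 + const
        epsilon = (c - q_mean) / (tf.exp(q_logstd) + 1e-8)  # standardized residual
        return tf.reduce_mean(q_logstd + 0.5 * tf.square(epsilon))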

q_logstd = tf.maximum(q_logstd, -16) avoids numerical errors by preventing the standard deviation from dropping below exp(-16). When computing the Gaussian log likelihood loss, the std appears in a denominator, so the loss explodes when the std is too small.
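To see the blow-up concretely, here is a small NumPy illustration; the residual and logstd values are hypothetical, chosen only to show the scale:

    import numpy as np

    residual = 0.01                 # a small |c - q_mean|
    std_unclamped = np.exp(-50.0)   # ~1.9e-22, if the net predicted logstd = -50
    std_clamped = np.exp(-16.0)     # ~1.1e-7, the floor enforced by the clamp
    # ~1.3e+39: overflows to inf in float32 (TF's default dtype)
    print(0.5 * (residual / std_unclamped) ** 2)
    # ~3.9e+9: large but finite
    print(0.5 * (residual / std_clamped) ** 2)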