fchollet / deep-learning-with-python-notebooks

Jupyter notebooks for the code samples of the book "Deep Learning with Python"
MIT License

2nd edition Chapter 12, VAE latent space sampler differences with 1st edition #212

Open ghylander opened 2 years ago

ghylander commented 2 years ago

In the 2nd edition notebook for chapter 12, part 4, the latent-space sampling layer is defined as the following class:

import tensorflow as tf
from tensorflow.keras import layers

class Sampler(layers.Layer):
    def call(self, z_mean, z_log_var):
        # Draw one noise vector per sample, matching the latent dimensionality
        batch_size = tf.shape(z_mean)[0]
        z_size = tf.shape(z_mean)[1]
        epsilon = tf.random.normal(shape=(batch_size, z_size))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon
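
For context, the layer is called with the encoder outputs roughly like this (my own minimal sketch with illustrative shapes, not the book's exact model code):

import tensorflow as tf

# Pretend encoder outputs for a batch of 4 samples in a 2-D latent space
z_mean = tf.zeros((4, 2))
z_log_var = tf.zeros((4, 2))

sampler = Sampler()
z = sampler(z_mean, z_log_var)
print(z.shape)  # (4, 2) -- one latent sample per input in the batch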

In the 1st edition, it's defined as the following function:

from keras import backend as K

def sampling(args):
    z_mean, z_log_var = args
    # Latent dimensionality is hard-coded to 2 in the 1st edition
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], 2),
                              mean=0., stddev=1.)
    return z_mean + K.exp(z_log_var) * epsilon

Most of the changes follow from best practices and API changes, but I have the following doubt: why is z_log_var multiplied by 0.5 in the 2nd edition class? Is it because of a difference between tf.exp() and tf.keras.backend.exp()? Is it because halving it yields better results? Is there some other reason?
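
For reference, here is a minimal numerical comparison of the two expressions (my own sketch; it assumes z_log_var holds the log of the variance and fixes epsilon so the outputs are directly comparable):

import numpy as np

# Suppose the encoder predicts a variance of 4.0, so z_log_var = log(4.0)
z_mean = 0.0
z_log_var = np.log(4.0)
epsilon = 1.0  # fixed noise value, for a deterministic comparison

# 2nd edition scaling: exp(0.5 * log(var)) = sqrt(var) = 2.0
z_2nd = z_mean + np.exp(0.5 * z_log_var) * epsilon

# 1st edition scaling: exp(log(var)) = var = 4.0
z_1st = z_mean + np.exp(z_log_var) * epsilon

print(z_2nd, z_1st)  # 2.0 4.0 -- the two editions scale epsilon differently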