chingweitseng / general-deep-image-completion

A Deep Image Completion Model for Recovering Various Corrupted Images

Confused by the code of batch normalization #2

Closed SunnerLi closed 6 years ago

SunnerLi commented 6 years ago

Recently I have been trying to re-implement the idea of this paper, but there is something I don't fully understand. The following is the batch normalization part:

def batchnorm(self, bottom, is_train, epsilon=1e-8, name=None):
    # Clip activations to [-100, 100] before normalizing.
    bottom = tf.clip_by_value(bottom, -100., 100.)
    depth = bottom.get_shape().as_list()[-1]

    with tf.variable_scope(name):
        # Learnable per-channel scale and shift.
        gamma = tf.get_variable("gamma", [depth], initializer=tf.constant_initializer(1.))
        beta  = tf.get_variable("beta",  [depth], initializer=tf.constant_initializer(0.))

        # Moments over the batch and spatial axes (N, H, W).
        batch_mean, batch_var = tf.nn.moments(bottom, [0, 1, 2], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=0.5)
        ema_apply_op = ema.apply([batch_mean, batch_var])
        ema_mean, ema_var = ema.average(batch_mean), ema.average(batch_var)

        def update():
            # Training branch: refresh the moving averages, then use batch statistics.
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        # Inference branch: fall back to the moving averages.
        mean, var = tf.cond(
                is_train,
                update,
                lambda: (ema_mean, ema_var))

        normed = tf.nn.batch_norm_with_global_normalization(
                bottom, mean, var, beta, gamma, epsilon, False)
    return normed

I think this part is modified from here. What confuses me is: why should we clip the value of the tensor first? The usual definition of batch normalization doesn't clip, and the LSGAN paper doesn't clip either. Is something wrong with my understanding?
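For reference, the usual definition of batch normalization that I have in mind contains no clipping at all. A minimal NumPy sketch (the axes and epsilon mirror the TensorFlow code above; `batchnorm_ref` is just an illustrative name):

```python
import numpy as np

def batchnorm_ref(x, gamma, beta, eps=1e-8):
    # Textbook batch normalization for NHWC tensors: normalize each channel
    # by the mean/variance over the batch and spatial axes (N, H, W).
    # Note: no tf.clip_by_value-style clipping anywhere.
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```

After this transform each channel of the output has (approximately) zero mean and unit variance before the learned `gamma`/`beta` rescaling.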

chingweitseng commented 6 years ago

Hi,

The code in the graph is borrowed from the TensorFlow version of Context Encoder. Because that model uses DCGAN in its encoder-decoder structure, they clipped the values in the batch-norm layer. They did this to avoid mode collapse and unstable training during adversarial training.

In our implementation, we merely applied the adversarial loss function proposed in LSGAN, which gives us stable training and good results on the inpainting problem.
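For readers unfamiliar with it, the LSGAN objective replaces the usual log-loss with a least-squares loss. A minimal NumPy sketch with the common 0/1 target labels (a/b/c = 0, 1, 1 in the paper's notation; function names here are illustrative, not from this repo):

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    # Discriminator: push D(x) toward 1 on real samples
    # and D(G(z)) toward 0 on fake samples.
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    # Generator: push D(G(z)) toward 1.
    return 0.5 * np.mean((d_fake - 1.0) ** 2)
```

Because the least-squares loss penalizes samples by their distance from the target value rather than saturating like the sigmoid log-loss, it tends to give smoother gradients and more stable adversarial training, which is the property the answer above relies on.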

SunnerLi commented 6 years ago

@adamstseng Thanks for your reply! I think I get the idea behind the trick now.