bilylee / SiamFC-TensorFlow

A TensorFlow implementation of the SiamFC tracker
MIT License

Shadow variables in the batch_norm layer #63

Closed shuida closed 5 years ago

shuida commented 6 years ago

Your code is very clean and well worth learning from. For my project I had to make some modifications: I build the convolutional layers with tf.nn.conv2d directly instead of tf.contrib.slim. I added a batch_norm layer to the convolutional network, and when ema.apply() runs, the shadow variable I expect to get is <tf.Variable 'my_convolutional_alexnet/conv1/BatchNorm/moments/Squeeze/ExponentialMovingAverage:0' >

But what I actually get is <tf.Variable 'my_convolutional_alexnet/conv1/BatchNorm/train/my_convolutional_alexnet/conv1/BatchNorm/moments/Squeeze/ExponentialMovingAverage:0' > In other words, the name contains an extra "train/my_convolutional_alexnet/conv1/BatchNorm/" segment.

The scopes are spread across several functions in the code; through the function calls they end up nested as follows:

with tf.name_scope('train'):
    with tf.variable_scope('my_convolutional_alexnet'):
        with tf.variable_scope('conv1'):
            with tf.variable_scope('BatchNorm'):
                pass

The batch_norm() function is defined in the convolutional_alexnet.py module as follows:

import tensorflow as tf

def batch_norm(x, is_training, name='BatchNorm', moving_decay=0.99, eps=1e-5):
    shape = x.get_shape().as_list()
    assert len(shape) in [2, 4]
    param_shape = shape[-1]
    with tf.variable_scope(name) as scope:
        # learnable scale and offset
        gamma = tf.get_variable('gamma', param_shape, initializer=tf.constant_initializer(1))
        beta = tf.get_variable('beta', param_shape, initializer=tf.constant_initializer(0))
        # batch statistics over all axes except the channel axis
        axes = list(range(len(shape) - 1))
        batch_mean, batch_var = tf.nn.moments(x, axes, name='moments')
        ema = tf.train.ExponentialMovingAverage(moving_decay)

        def mean_var_with_update():
            # ema.apply() creates the shadow variables in question
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        # use batch statistics during training, the moving averages otherwise
        mean, var = tf.cond(tf.equal(is_training, True),
                            mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, eps)
    return normed
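
The shadow-variable names can be inspected by listing the graph's variables once the network has been built; a minimal check (assuming the scopes above have already been constructed in the default graph):

# Shadow variables created by ema.apply() are regular global variables,
# so they show up in tf.global_variables() (and tf.moving_average_variables()).
for v in tf.global_variables():
    if 'ExponentialMovingAverage' in v.name:
        print(v.name)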

My question: why does the variable created by ema.apply() end up with the extra "train/my_convolutional_alexnet/conv1/BatchNorm/" part in its name, and how can I fix this?

bilylee commented 5 years ago

Hi,

Is there any particular reason to avoid using slim? In my experience, reimplementing batch_norm can be tricky and buggy.
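
For reference, the usual slim pattern lets slim.conv2d attach batch norm itself, so the moving averages and their update ops are managed for you. Roughly (a sketch, not the exact arg_scope used in this repo):

import tensorflow as tf
slim = tf.contrib.slim

def conv_bn_relu(inputs, num_outputs, kernel_size, is_training):
    # normalizer_fn inserts slim.batch_norm after the convolution
    with slim.arg_scope([slim.conv2d],
                        normalizer_fn=slim.batch_norm,
                        normalizer_params={'is_training': is_training,
                                           'decay': 0.99,
                                           'epsilon': 1e-5}):
        return slim.conv2d(inputs, num_outputs, kernel_size,
                           activation_fn=tf.nn.relu)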

shuida commented 5 years ago

Because I have to change the convolution, I need to break it down into conv, bias, relu, batchnorm, pooling, etc. I have solved the problem, so I can close the issue. Thanks a lot!
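
For later readers, such a decomposed block might look roughly like this (a sketch only; it uses tf.layers.batch_normalization in place of the custom batch_norm above, and the kernel size, strides and initializers are illustrative):

import tensorflow as tf

def conv_block(x, filters, is_training, name='conv1'):
    # conv -> bias -> batch norm -> relu -> max pool, built from low-level ops
    with tf.variable_scope(name):
        in_channels = x.get_shape().as_list()[-1]
        w = tf.get_variable('weights', [3, 3, in_channels, filters],
                            initializer=tf.truncated_normal_initializer(stddev=0.01))
        b = tf.get_variable('biases', [filters],
                            initializer=tf.constant_initializer(0.0))
        x = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')
        x = tf.nn.bias_add(x, b)
        # note: tf.layers.batch_normalization puts its moving-average updates
        # in tf.GraphKeys.UPDATE_OPS, which must be run alongside the train op
        x = tf.layers.batch_normalization(x, momentum=0.99, epsilon=1e-5,
                                          training=is_training)
        x = tf.nn.relu(x)
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                              padding='VALID')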