bilylee / SiamFC-TensorFlow

A TensorFlow implementation of the SiamFC tracker
MIT License

Shadow variables in the batch_norm layer #63

Closed shuida closed 5 years ago

shuida commented 6 years ago

Your code is very clean and well worth learning from. For my project I had to make some modifications: I build the convolutional layers with tf.nn.conv2d directly instead of tf.contrib.slim. I added a batch_norm layer to the convolutional network, and when ema.apply() runs, the shadow variable I expect to get is <tf.Variable 'my_convolutional_alexnet/conv1/BatchNorm/moments/Squeeze/ExponentialMovingAverage:0' >

But what I actually get is <tf.Variable 'my_convolutional_alexnet/conv1/BatchNorm/train/my_convolutional_alexnet/conv1/BatchNorm/moments/Squeeze/ExponentialMovingAverage:0' > In other words, the name contains an extra "train/my_convolutional_alexnet/conv1/BatchNorm/" segment.

The scopes are spread across several functions in the code; through the function calls they end up nested as follows:

with tf.name_scope('train'):
    with tf.variable_scope('my_convolutional_alexnet'):
        with tf.variable_scope('conv1'):
            with tf.variable_scope('BatchNorm'):
                pass

The batch_norm() function is defined in the convolutional_alexnet.py module as follows:

import tensorflow as tf

def batch_norm(x, is_training, name='BatchNorm', moving_decay=0.99, eps=1e-5):
    shape = x.get_shape().as_list()
    assert len(shape) in [2, 4]
    param_shape = shape[-1]
    with tf.variable_scope(name) as scope:
        # learnable scale and offset
        gamma = tf.get_variable('gamma', param_shape, initializer=tf.constant_initializer(1))
        beta = tf.get_variable('beta', param_shape, initializer=tf.constant_initializer(0))
        # batch statistics over all axes except the channel axis
        axes = list(range(len(shape) - 1))
        batch_mean, batch_var = tf.nn.moments(x, axes, name='moments')
        ema = tf.train.ExponentialMovingAverage(moving_decay)

        def mean_var_with_update():
            # ema.apply() creates the shadow variables in question
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        # use batch statistics during training, the moving averages otherwise
        mean, var = tf.cond(tf.equal(is_training, True),
                            mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, eps)
    return normed
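
The shadow-variable names can be inspected by listing the graph's variables once the network has been built; a minimal check (assuming the scopes above have already been constructed in the default graph):

# Shadow variables created by ema.apply() are regular global variables,
# so they show up in tf.global_variables() (and tf.moving_average_variables()).
for v in tf.global_variables():
    if 'ExponentialMovingAverage' in v.name:
        print(v.name)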

My question: why does the variable created by ema.apply() end up with the extra "train/my_convolutional_alexnet/conv1/BatchNorm/" part in its name, and how can I fix this?

bilylee commented 5 years ago

Hi,

Is there any particular reason to avoid using slim? In my experience, reimplementing batch_norm can be tricky and buggy.
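
For reference, the usual slim pattern lets slim.conv2d attach batch norm itself, so the moving averages and their update ops are managed for you. Roughly (a sketch, not the exact arg_scope used in this repo):

import tensorflow as tf
slim = tf.contrib.slim

def conv_bn_relu(inputs, num_outputs, kernel_size, is_training):
    # normalizer_fn inserts slim.batch_norm after the convolution
    with slim.arg_scope([slim.conv2d],
                        normalizer_fn=slim.batch_norm,
                        normalizer_params={'is_training': is_training,
                                           'decay': 0.99,
                                           'epsilon': 1e-5}):
        return slim.conv2d(inputs, num_outputs, kernel_size,
                           activation_fn=tf.nn.relu)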

shuida commented 5 years ago

Because I have to change the convolution, I need to break it down into conv, bias, relu, batchnorm, pooling, etc. I have solved the problem, so I can close the issue. Thanks a lot!
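
For later readers, such a decomposed block might look roughly like this (a sketch only; it uses tf.layers.batch_normalization in place of the custom batch_norm above, and the kernel size, strides and initializers are illustrative):

import tensorflow as tf

def conv_block(x, filters, is_training, name='conv1'):
    # conv -> bias -> batch norm -> relu -> max pool, built from low-level ops
    with tf.variable_scope(name):
        in_channels = x.get_shape().as_list()[-1]
        w = tf.get_variable('weights', [3, 3, in_channels, filters],
                            initializer=tf.truncated_normal_initializer(stddev=0.01))
        b = tf.get_variable('biases', [filters],
                            initializer=tf.constant_initializer(0.0))
        x = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')
        x = tf.nn.bias_add(x, b)
        # note: tf.layers.batch_normalization puts its moving-average updates
        # in tf.GraphKeys.UPDATE_OPS, which must be run alongside the train op
        x = tf.layers.batch_normalization(x, momentum=0.99, epsilon=1e-5,
                                          training=is_training)
        x = tf.nn.relu(x)
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                              padding='VALID')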