da03 / Attention-OCR

Visual Attention based OCR

Why are there no bias terms in the CNN model? #67

Open MBleeker opened 6 years ago

da03 commented 6 years ago

Sorry I didn't quite get it. What do you mean by bias terms?

MBleeker commented 6 years ago

Most convolutions in the literature are implemented as x^T W + b.

For example, in VGG16 a conv layer is defined as (in TF code):

```python
conv = tf.nn.conv2d(bottom, filt, [1, 1, 1, 1], padding='SAME')
conv_biases = self.get_bias(name)
bias = tf.nn.bias_add(conv, conv_biases)
relu = tf.nn.relu(bias)
```

Is there a reason you are not using bias terms in your conv model? I assume bias terms are not needed when batch norm is applied after the convolution.
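
For reference, a minimal sketch of the conv → batch norm → ReLU pattern under discussion, written with Keras layers for illustration rather than this repo's actual code:

```python
import tensorflow as tf

def conv_bn_relu(x, filters, kernel_size=3):
    # use_bias=False: any constant bias would be removed by the
    # normalization step anyway, so the parameter is redundant here.
    x = tf.keras.layers.Conv2D(filters, kernel_size, padding='same', use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    return tf.keras.layers.ReLU()(x)
```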

da03 commented 6 years ago

I see. Yes, you're right: since we apply batch norm before the ReLUs, the features are re-centered anyway, so bias terms are not needed.
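
To illustrate the point (a small NumPy sketch, not from the repo): batch norm subtracts the per-channel mean, so adding a constant bias before it has no effect on the output.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch dimension (no learned gamma/beta here).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.randn(32, 8)   # pre-activation features
b = np.random.randn(8)       # a hypothetical bias term
print(np.allclose(batch_norm(x), batch_norm(x + b)))  # True: the bias is cancelled
```

(The learnable beta offset in batch norm already plays the role of a bias after normalization.)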