DAG Batch Normalization Parameter Initialization Issue

Nicholas-Schaub commented 7 years ago

I'm a little new to MatConvNet, so this may not be an issue but I thought I'd submit it anyway and see what shakes out.

I'm using beta23, and I insert a batch normalization layer like this:

nn.addLayer('u4_bn',dagnn.BatchNorm(),{'u4_c2_x'},{'u4_bn_x'},{'filters' 'bias' 'moments'})

I run training using the dag network training included in the examples folder, and it gives an error that the multipliers don't have the same depth as the data. I looked at nn, and the parameters for filters, bias, and moments are empty. So, I changed the code to this:

nn.addLayer('u4_bn',dagnn.BatchNorm(),{'u4_c2_x'},{'u4_bn_x'},{'filters' 'bias' 'moments'})
f = nn.getParamIndex('filters');
b = nn.getParamIndex('bias');
m = nn.getParamIndex('moments');
nn.params(f).value = ones(64, 1, 'single') ;
nn.params(b).value = ones(64, 1, 'single') ;
nn.params(m).value = zeros(64, 2, 'single') ;
nn.params(f).weightDecay = 0 ;
nn.params(b).weightDecay = 0 ;
nn.params(m).weightDecay = 0 ;

I initialize the network and the params for filters, bias, and moments are still empty. I narrowed it down to 'nn.initParams()', which seemed to clear the values I was setting. If I set the values of filters, bias, and moments after 'nn.initParams()', then things seem to run smoothly. I haven't dug into the code for initParams, but there seems to be a bug here.

albanie commented 7 years ago

At the moment the constructor for BatchNorm takes an argument for the number of channels, so in the code above, you would change the initialisation to:

numChannels = 64 ;
layer = dagnn.BatchNorm('numChannels', numChannels) ;
nn.addLayer('u4_bn', layer, {'u4_c2_x'}, {'u4_bn_x'}, {'filters' 'bias' 'moments'}) ;

If numChannels is not supplied, it defaults to zero, which leads to the empty parameters that you observed.
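
As a quick sanity check, something like the following sketch could confirm that the parameters are no longer empty after initialisation. It assumes MatConvNet is on the path, that the layer is added to a fresh dagnn.DagNN object, and that the parameter names 'filters', 'bias' and 'moments' are used by this layer only. The assertion and its message are illustrative and not part of the library.

```matlab
% Sketch only: assumes MatConvNet is set up (e.g. vl_setupnn has been run).
nn = dagnn.DagNN() ;

numChannels = 64 ;  % should match the depth of the input variable 'u4_c2_x'
layer = dagnn.BatchNorm('numChannels', numChannels) ;
nn.addLayer('u4_bn', layer, {'u4_c2_x'}, {'u4_bn_x'}, {'filters' 'bias' 'moments'}) ;

% Let the network initialise its parameters from the layer definition.
nn.initParams() ;

% Each parameter should now have numChannels rows instead of being empty.
names = {'filters', 'bias', 'moments'} ;
for i = 1:numel(names)
  p = nn.getParamIndex(names{i}) ;
  assert(size(nn.params(p).value, 1) == numChannels, ...
         'Parameter %s does not have %d channels.', names{i}, numChannels) ;
end
```

If the assertion fails for a layer, the numChannels value passed to the constructor is the first thing to check.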

Thanks for pointing this out. I think we can modify the code to act more intuitively.

Nicholas-Schaub commented 7 years ago

Thanks for the response. This was really bugging me. I've changed my code accordingly.

I've noticed some other quirky things, but all in all I think your library is pretty intuitive. Thanks for the hard work you all put into this.

vlfeat / matconvnet

DAG Batch Normalization Parameter Initialization Issue #801