sdemyanov / ConvNet

Convolutional Neural Networks for Matlab for classification and segmentation, including Invariant Backpropagation (IBP) and Adversarial Training (AT) algorithms. Trained on GPU, requires cuDNN v5.

question about parameter 'biascoef' #20

Open kklots opened 9 years ago

kklots commented 9 years ago

Hi Sergey,

There is a new problem in my work. I found that the parameter 'biascoef' may not work as described in the README file.

These are my params and network structure:

params.batchsize = 128;
params.epochs = 1;
params.alpha = 0.1;
params.momentum = 0.9;
params.lossfun = 'logreg';
params.shuffle = 1;
params.seed = 0;
dropout = 0.5;

layers = {
  struct('type', 'i', 'mapsize', kXSize(1:2), 'outputmaps', kXSize(3))                                                % 32
  struct('type', 'c', 'filtersize', [3 3], 'outputmaps', 32, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0)   % 30
  struct('type', 's', 'scale', [2 2], 'function', 'max', 'stride', [2 2], 'dropout', dropout)                         % 15
  struct('type', 'c', 'filtersize', [2 2], 'outputmaps', 64, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0)   % 14
  struct('type', 's', 'scale', [2 2], 'function', 'max', 'stride', [2 2], 'dropout', dropout)                         % 7
  struct('type', 'c', 'filtersize', [2 2], 'outputmaps', 64, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0)   % 6
  struct('type', 's', 'scale', [2 2], 'function', 'max', 'stride', [2 2], 'dropout', dropout)                         % 3
  struct('type', 'c', 'filtersize', [2 2], 'outputmaps', 128, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0)  % 2
  struct('type', 'f', 'length', 256, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0)
  struct('type', 'f', 'length', kOutputs, 'function', 'soft', 'biascoef', 0)
};

I have set 'biascoef' to 0 in all the convolutional layers and fully-connected layers, but the loss and the test prediction accuracy still change after each training epoch.

As far as I understand, the network weights should not change when all 'biascoef' params are set to 0, but that seems inconsistent with what I observe here.

Yours, Xuan Li.

sdemyanov commented 9 years ago

Hi, Xuan Li,

biascoef is just a coefficient that scales the learning rate for the biases. If you set it to 0, it only means that the biases remain 0 the whole time; the other weights are still trained as usual. That is all it does.
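For illustration, a minimal sketch of what this means (this is not the library's actual update code; the gradients here are made up):

% Illustrative sketch only: biascoef scales the learning rate applied to
% the bias update, while the filter/weight update uses the global rate.
alpha    = 0.1;                 % params.alpha, the global learning rate
biascoef = 0;                   % per-layer coefficient for the biases

w = randn(3, 3, 32);            % example filter weights
b = zeros(32, 1);               % biases start at 0
grad_w = randn(size(w));        % pretend gradients from backprop
grad_b = randn(size(b));

w = w - alpha * grad_w;             % weights still change every update
b = b - biascoef * alpha * grad_b;  % with biascoef = 0, biases stay at 0

So with biascoef = 0 the loss and accuracy keep changing, because only the biases are frozen.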

Regards, Sergey.

kklots commented 9 years ago

Thank you for your explanation. Is there any way to set a separate learning rate for each layer?

sdemyanov commented 9 years ago

No, but it should be very easy to add. Just take a look at how biascoef works, and introduce another parameter for the other weights; a sketch follows below.
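For example, a hypothetical sketch of such a change (the field name 'lrcoef' does not exist in the library; it simply mirrors how 'biascoef' is applied to the biases):

% Hypothetical sketch: read a per-layer coefficient from the layer struct
% and multiply it into the weight update for that layer.
layer = struct('type', 'c', 'filtersize', [3 3], 'outputmaps', 32, ...
               'biascoef', 0, 'lrcoef', 0.5);   % 'lrcoef' is a made-up field

alpha  = 0.1;                    % global learning rate (params.alpha)
w      = randn(3, 3, 32);        % example filter weights
grad_w = randn(size(w));         % pretend gradients from backprop

w = w - layer.lrcoef * alpha * grad_w;   % layer-specific rate for the weights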
