sdemyanov / ConvNet

Convolutional Neural Networks for Matlab for classification and segmentation, including Invariant Backpropagation (IBP) and Adversarial Training (AT) algorithms. Trained on GPU, requires cuDNN v5.

question about parameter 'biascoef' #20

Open kklots opened 9 years ago

kklots commented 9 years ago

Hi Sergey,

There is a new problem in my work. I found that the parameter 'biascoef' may not work as described in the README file.

These are my params and network structure:

params.batchsize = 128;
params.epochs = 1;
params.alpha = 0.1;
params.momentum = 0.9;
params.lossfun = 'logreg';
params.shuffle = 1;
params.seed = 0;
dropout = 0.5;

layers = {
  struct('type', 'i', 'mapsize', kXSize(1:2), 'outputmaps', kXSize(3))                                                % 32
  struct('type', 'c', 'filtersize', [3 3], 'outputmaps', 32, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0)   % 30
  struct('type', 's', 'scale', [2 2], 'function', 'max', 'stride', [2 2], 'dropout', dropout)                         % 15
  struct('type', 'c', 'filtersize', [2 2], 'outputmaps', 64, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0)   % 14
  struct('type', 's', 'scale', [2 2], 'function', 'max', 'stride', [2 2], 'dropout', dropout)                         % 7
  struct('type', 'c', 'filtersize', [2 2], 'outputmaps', 64, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0)   % 6
  struct('type', 's', 'scale', [2 2], 'function', 'max', 'stride', [2 2], 'dropout', dropout)                         % 3
  struct('type', 'c', 'filtersize', [2 2], 'outputmaps', 128, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0)  % 2
  struct('type', 'f', 'length', 256, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0)
  struct('type', 'f', 'length', kOutputs, 'function', 'soft', 'biascoef', 0)
};

I have set 'biascoef' to 0 in all the convolutional layers and fully-connected layers, but the loss and the test prediction accuracy still change after each training epoch.

As far as I understand, the network weights should not change when all 'biascoef' params are set to 0, but that seems inconsistent with what I observe here.

Yours, Xuan Li.

sdemyanov commented 9 years ago

Hi, Xuan Li,

biascoef is just a coefficient that scales the learning rate for the biases. If you set it to 0, it only means that the biases remain 0 the whole time; the other weights are still trained as usual. That is all it does.
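For illustration, a minimal sketch of what this means (this is not the library's actual update code; the gradients here are made up):

% Illustrative sketch only: biascoef scales the learning rate applied to
% the bias update, while the filter/weight update uses the global rate.
alpha    = 0.1;                 % params.alpha, the global learning rate
biascoef = 0;                   % per-layer coefficient for the biases

w = randn(3, 3, 32);            % example filter weights
b = zeros(32, 1);               % biases start at 0
grad_w = randn(size(w));        % pretend gradients from backprop
grad_b = randn(size(b));

w = w - alpha * grad_w;             % weights still change every update
b = b - biascoef * alpha * grad_b;  % with biascoef = 0, biases stay at 0

So with biascoef = 0 the loss and accuracy keep changing, because only the biases are frozen.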

Regards, Sergey.

kklots commented 9 years ago

Thank you for your explanation. Is there any way to set a separate learning rate for each layer?

sdemyanov commented 9 years ago

No, but it should be very easy to add. Just take a look at how biascoef works, and introduce another parameter for the other weights; a sketch follows below.
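For example, a hypothetical sketch of such a change (the field name 'lrcoef' does not exist in the library; it simply mirrors how 'biascoef' is applied to the biases):

% Hypothetical sketch: read a per-layer coefficient from the layer struct
% and multiply it into the weight update for that layer.
layer = struct('type', 'c', 'filtersize', [3 3], 'outputmaps', 32, ...
               'biascoef', 0, 'lrcoef', 0.5);   % 'lrcoef' is a made-up field

alpha  = 0.1;                    % global learning rate (params.alpha)
w      = randn(3, 3, 32);        % example filter weights
grad_w = randn(size(w));         % pretend gradients from backprop

w = w - layer.lrcoef * alpha * grad_w;   % layer-specific rate for the weights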
