vlfeat / matconvnet

MatConvNet: CNNs for MATLAB

How to make the output regression value in the range of [0,1]? #475

Open luzhi opened 8 years ago

luzhi commented 8 years ago

I tried to implement the regression loss function as follows:

function Y = vl_l2normloss(X,c,dzdy)
assert(numel(X) == numel(c));
% n = size(X,1) * size(X,2);
n = size(X,4);  % batch size
if nargin <= 2
  % forward: L2 loss averaged over the batch
  Y = sum((X(:) - c(:)).^2) ./ (2*n);
else
  % backward: gradient of the loss w.r.t. X, scaled by dzdy
  assert(numel(dzdy) == 1);
  Y = reshape((dzdy / n) * (X(:) - c(:)), size(X));
end
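To sanity-check the backward branch, a quick numerical gradient check can be run against the forward branch, along these lines (a minimal sketch; the test sizes and tolerance are arbitrary choices):

% Numerical gradient check for vl_l2normloss (sketch)
X = randn(1,1,1,8, 'single') ;
c = rand(1,1,1,8, 'single') ;
dX = vl_l2normloss(X, c, single(1)) ;   % analytic gradient
delta = 1e-3 ;
dX_num = zeros(size(X), 'single') ;
for i = 1:numel(X)
  Xp = X ; Xp(i) = Xp(i) + delta ;
  Xm = X ; Xm(i) = Xm(i) - delta ;
  dX_num(i) = (vl_l2normloss(Xp, c) - vl_l2normloss(Xm, c)) / (2*delta) ;
end
assert(max(abs(dX(:) - dX_num(:))) < 1e-2) ;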

And I also modified second last layer of the Cifar-10 network structure as below:

lr = [.1 2] ;

% Define network CIFAR10-quick
net.layers = {} ;

% Block 1
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.01*randn(5,5,3,32, 'single'), zeros(1, 32, 'single')}}, ...
                           'learningRate', lr, ...
                           'stride', 1, ...
                           'pad', 2) ;
net.layers{end+1} = struct('type', 'pool', ...
                           'method', 'max', ...
                           'pool', [3 3], ...
                           'stride', 2, ...
                           'pad', [0 1 0 1]) ;
net.layers{end+1} = struct('type', 'relu') ;

% Block 2
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.05*randn(5,5,32,32, 'single'), zeros(1,32,'single')}}, ...
                           'learningRate', lr, ...
                           'stride', 1, ...
                           'pad', 2) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'pool', ...
                           'method', 'avg', ...
                           'pool', [3 3], ...
                           'stride', 2, ...
                           'pad', [0 1 0 1]) ; % Emulate caffe

% Block 3
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.05*randn(5,5,32,64, 'single'), zeros(1,64,'single')}}, ...
                           'learningRate', lr, ...
                           'stride', 1, ...
                           'pad', 2) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'pool', ...
                           'method', 'avg', ...
                           'pool', [3 3], ...
                           'stride', 2, ...
                           'pad', [0 1 0 1]) ; % Emulate caffe

% Block 4
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.05*randn(4,4,64,64, 'single'), zeros(1,64,'single')}}, ...
                           'learningRate', lr, ...
                           'stride', 1, ...
                           'pad', 0) ;
net.layers{end+1} = struct('type', 'relu') ;

% Block 5
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.05*randn(1,1,64,1, 'single'), zeros(1,1,'single')}}, ...
                           'learningRate', .1*lr, ...
                           'stride', 1, ...
                           'pad', 0) ;
% Loss layer
net.layers{end+1} = struct('type', 'l2normloss') ;
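One practical note: vl_simplenn only dispatches on its built-in layer types, so a new 'l2normloss' type either needs a corresponding case added to vl_simplenn, or the loss can be wrapped as a 'custom' layer with forward/backward handles. A minimal sketch of the wrapper (the helper names are made up, and it assumes cnn_train stores the targets in the loss layer's 'class' field, as it does for 'softmaxloss'; the two helpers would live in their own files):

net.layers{end+1} = struct('type', 'custom', ...
                           'forward', @l2norm_forward, ...
                           'backward', @l2norm_backward) ;

function resNext = l2norm_forward(layer, res, resNext)
% forward pass: compute the scalar loss from the previous layer's output
resNext.x = vl_l2normloss(res.x, layer.class) ;

function res = l2norm_backward(layer, res, resNext)
% backward pass: propagate the loss derivative to the previous layer
res.dzdx = vl_l2normloss(res.x, layer.class, resNext.dzdx) ;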

But the output regression value (when testing) may fall outside the range [0,1]. I have no idea why this happens. Is there anything wrong in my settings? Thanks!

xuranzhao711 commented 8 years ago

I can't see why the regression output should be inside [0,1]; it can be any number. If you do want the output bounded by 0 and 1, just add a sigmoid layer after the last conv layer.
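In simplenn that is a one-line addition right before the loss, e.g.:

% Bound the network output to (0,1)
net.layers{end+1} = struct('type', 'sigmoid') ;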

bazilas commented 8 years ago

Having the regressed values in the range 0 to 1 facilitates the optimization.

This is a process that could be done inside the get_batch function or offline if the training data is precomputed.
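For example, if the raw targets are object areas in pixels, the scaling could be done when the batch is assembled (a sketch; the getBatch signature follows the usual MatConvNet examples, and 'objectArea' is an assumed imdb field):

function [im, labels] = getBatch(imdb, batch)
% Fetch a batch and scale the regression targets to [0,1] (sketch)
im = imdb.images.data(:,:,:,batch) ;
areas = imdb.images.objectArea(batch) ;    % assumed field: areas in pixels
imageArea = size(im,1) * size(im,2) ;
labels = single(areas(:)') / imageArea ;   % area as a fraction of the image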

luzhi commented 8 years ago

@xuranzhao711 I need the regression output normalised to [0,1] because, in my application, this value represents the area percentage of the target object. The initial idea of this project is to estimate the object's area ratio (as a percentage of the image) given only an RGB image as input. Thanks for your comments!

@bazilas Thanks for your suggestions! I modified the last two blocks (blocks 4 and 5) with a sigmoid function; the network structure is defined as follows:

opts.networkType = 'simplenn' ;
opts = vl_argparse(opts, varargin) ;

lr = [.1 2] ;

% Define network CIFAR10-quick
net.layers = {} ;

% Block 1
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.01*randn(5,5,3,32, 'single'), zeros(1, 32, 'single')}}, ...
                           'learningRate', lr, ...
                           'stride', 1, ...
                           'pad', 2) ;
net.layers{end+1} = struct('type', 'pool', ...
                           'method', 'max', ...
                           'pool', [3 3], ...
                           'stride', 2, ...
                           'pad', [0 1 0 1]) ;
net.layers{end+1} = struct('type', 'relu') ;

% Block 2
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.05*randn(5,5,32,32, 'single'), zeros(1,32,'single')}}, ...
                           'learningRate', lr, ...
                           'stride', 1, ...
                           'pad', 2) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'pool', ...
                           'method', 'avg', ...
                           'pool', [3 3], ...
                           'stride', 2, ...
                           'pad', [0 1 0 1]) ; % Emulate caffe

% Block 3
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.05*randn(5,5,32,64, 'single'), zeros(1,64,'single')}}, ...
                           'learningRate', lr, ...
                           'stride', 1, ...
                           'pad', 2) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'pool', ...
                           'method', 'avg', ...
                           'pool', [3 3], ...
                           'stride', 2, ...
                           'pad', [0 1 0 1]) ; % Emulate caffe

% Block 4
% -------------------------------------------------------------------------
% net.layers{end+1} = struct('type', 'conv', ...
%                            'weights', {{0.05*randn(4,4,64,64, 'single'), zeros(1,64,'single')}}, ...
%                            'learningRate', lr, ...
%                            'stride', 1, ...
%                            'pad', 0) ;

net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.05*randn(4,4,64,512, 'single'), zeros(1,512,'single')}}, ...
                           'learningRate', lr, ...
                           'stride', 1, ...
                           'pad', 0) ;
% -------------------------------------------------------------------------
% net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'sigmoid') ;
% -------------------------------------------------------------------------

% Block 5
% -------------------------------------------------------------------------
% net.layers{end+1} = struct('type', 'conv', ...
%                            'weights', {{0.05*randn(1,1,64,10, 'single'), zeros(1,10,'single')}}, ...
%                            'learningRate', .1*lr, ...
%                            'stride', 1, ...
%                            'pad', 0) ;

net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.05*randn(1,1,512,1, 'single'), zeros(1,1,'single')}}, ...
                           'learningRate', .1*lr, ...
                           'stride', 1, ...
                           'pad', 0) ;
% -------------------------------------------------------------------------
net.layers{end+1} = struct('type', 'sigmoid') ;

% Loss layer
% -------------------------------------------------------------------------
% net.layers{end+1} = struct('type', 'softmaxloss') ;
net.layers{end+1} = struct('type', 'sigmoidcrossentropyloss');
% -------------------------------------------------------------------------
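(For reference, a sigmoid cross-entropy loss is usually fused with the sigmoid, so that for logits X and targets c in [0,1] the gradient is simply sigmoid(X) - c. A numerically stable sketch follows; the name and details are assumptions and not necessarily the same as Umuguc's version. Note that a fused loss of this kind expects raw logits, in which case the extra 'sigmoid' layer right before it would be redundant.)

function Y = vl_nnsigmoidxentloss(X, c, dzdy)
% Fused sigmoid + binary cross-entropy (sketch): X are raw logits,
% c are targets in [0,1]
assert(numel(X) == numel(c)) ;
n = size(X,4) ;  % batch size
if nargin <= 2
  x = X(:) ; t = c(:) ;
  % max(x,0) - x.*t + log(1+exp(-|x|)) is the overflow-safe form
  Y = sum(max(x,0) - x.*t + log(1 + exp(-abs(x)))) / n ;
else
  assert(numel(dzdy) == 1) ;
  p = 1 ./ (1 + exp(-X)) ;  % sigmoid(X)
  Y = (dzdy / n) * (p - reshape(c, size(X))) ;
end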

In this network structure, I use the loss function from Umuguc's auto-encoder code. What I want to do is measure the mean squared error between the predicted regression values and the labels. Besides, I also added a new "error" function in cnn_train for this regression:

function err = error_mse_sigmoid(opts, labels, res)
% Sum-of-squared-errors between the predictions (taken just before the
% loss layer) and the regression targets
predictions = gather(res(end-1).x) ;
err = sum((predictions(:) - labels(:)).^2) * 0.5 ;

Could you please have a look and check whether this error function is OK for the regression problem? Thanks!

xuranzhao711 commented 8 years ago

I am also doing a neural network regression problem with MatConvNet. Your network structure looks fine. From personal experience, it is sometimes better to place the sigmoid layer before the last conv layer and feed the output of the last conv layer directly into the Euclidean loss layer (i.e. just remove the last sigmoid layer in your network). Although the final output is then not hard-limited to [0,1], this structure often gives better performance.
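Concretely, that ordering would look something like this (a sketch reusing the block sizes from the network above):

% Block 4: conv -> sigmoid (nonlinearity moved before the last conv)
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.05*randn(4,4,64,512, 'single'), zeros(1,512,'single')}}, ...
                           'learningRate', lr, 'stride', 1, 'pad', 0) ;
net.layers{end+1} = struct('type', 'sigmoid') ;

% Block 5: the last conv feeds the Euclidean loss directly (no final sigmoid)
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.05*randn(1,1,512,1, 'single'), zeros(1,1,'single')}}, ...
                           'learningRate', .1*lr, 'stride', 1, 'pad', 0) ;
net.layers{end+1} = struct('type', 'l2normloss') ; % e.g. the vl_l2normloss above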