vlfeat / autonn

A fast and expressive Matlab/MatConvNet deep learning API, with native automatic differentiation

Selective Weight Vector for Loss Func #39

Open ncalik opened 6 years ago

ncalik commented 6 years ago

Hi @jotaf98, how can we get the indices of the max scores in the logits? I want to calculate a verification loss for a classified person. For example, let scr be the logits of the network; for a given person, fea_vect is multiplied by the corresponding weight vector W(maxInd, :):

```matlab
[~, maxInd] = max(scr, [], 3);
loss_2 = (tanh(W(maxInd, :) * fea_vect) - lbl).^2;
```

where `W = Param('value', randn(numPers, feaVectLength))`.

But I get an error when using max: `Error using Layer/max: Too many output arguments.`

Also, is W trainable over the corresponding vectors?

jotaf98 commented 6 years ago

Hi, you're right that the 2nd output is missing from the overloaded @max (that output is not differentiable, so I didn't think of it). Anyway, you can work around it by creating a non-differentiable layer based on Matlab's @max with 2 outputs:

```matlab
[~, maxInd] = Layer.create(@max, {scr, [], 3}, 'numInputDer', 0);
```

(numInputDer is the number of input derivatives, which is 0 for non-differentiable functions.) And yes, gradients will be back-propagated to the selected elements of W :)
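
Putting it together with your loss, the whole thing would look something like this (a minimal sketch using the variable names from your post, assuming scr, fea_vect and lbl are already Layer objects):

```matlab
% trainable weight matrix, one row per person
W = Param('value', randn(numPers, feaVectLength));

% non-differentiable layer: indices of the max scores along the 3rd dimension
[~, maxInd] = Layer.create(@max, {scr, [], 3}, 'numInputDer', 0);

% verification loss; gradients reach only the selected rows of W
loss_2 = (tanh(W(maxInd, :) * fea_vect) - lbl).^2;
```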

ncalik commented 6 years ago

That's a very cool solution! First, I wrote a function in the style of mcnExtraLayers:

```matlab
function [y, dzdw] = z_maxx(fea, w, scr, lbl, varargin)
  sz_fea = size(fea);
  sz_wei = size(w);
  fea_mat = squeeze(fea);
  scr_mat = squeeze(scr);

  [~, dzdy] = vl_argparsepos(struct(), varargin);
  if isempty(dzdy)
    % forward pass
    [~, maxInd] = max(scr(:));
    val = 1 ./ (1 + exp(w(maxInd, :) * fea(:)));
    y = (val - lbl).^2;
  else
    % backward pass (placeholder derivatives, just to test the wiring)
    [~, maxInd] = max(scr(:));
    dzdw = w;
    y = fea;
    % size(dzdw)
    % y = {dzdf, dzdw};
  end
end
```

Then I called it with `customLoss = Layer.fromFunction(@z_maxx, 'numInputDer', 2);`. I didn't write the true derivatives; instead, I returned the same parameters just to test the function, and it worked. So, what is the difference between Layer.fromFunction and Layer.create? Can I use one instead of the other?
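
To make the question concrete, here is how I understand the two would be used for the same layer (assuming fea, w, scr and lbl are Layer objects):

```matlab
% direct construction from a function handle and an input list
loss_a = Layer.create(@z_maxx, {fea, w, scr, lbl}, 'numInputDer', 2);

% generator style: build the constructor once, then call it like a function
customLoss = Layer.fromFunction(@z_maxx, 'numInputDer', 2);
loss_b = customLoss(fea, w, scr, lbl);
```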

Also, I want to let you know that defining the derivatives as y = {dzdf, dzdw} doesn't work in eval mode. vl_nnaxpy is in the same format, so I couldn't compile it either. When I define the derivatives as separate output arguments, like [y, dzdw], it works. I also mentioned this problem in #31.
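
For example, with a single differentiable input, the convention that works for me looks like this (a hypothetical minimal layer, z_square, following the same vl_argparsepos pattern as above):

```matlab
function y = z_square(x, varargin)
  [~, dzdy] = vl_argparsepos(struct(), varargin);
  if isempty(dzdy)
    y = x.^2;           % forward pass
  else
    y = 2 * x .* dzdy;  % backward: derivative returned as a plain output argument
    % y = {2 * x .* dzdy};  % packing it into a cell did not work in eval mode
  end
end
```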

I have one more question (I suppose I am the autonn user who bothers you the most :)) ). In ResNet we sum y = F(x) + x. Now I want to try a maxout operation on this path instead: given F(x)'s channels C1, C2, C3, ..., CN and x's channels G1, G2, G3, ..., GN, I want to build a new tensor y with channels C1, G1, C2, G2, C3, G3, ..., CN, GN, and then apply maxout with shape [2 N]. How can I concatenate these two tensors in this interleaved format? A sketch of what I mean is below.
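
In plain MATLAB, on H-by-W-by-N numeric arrays F and G, the interleaving I'm after could be written like this (a sketch; I assume the same cat/permute/reshape ops carry over to Layer objects):

```matlab
y = cat(4, F, G);                          % H x W x N x 2: pairs (Ci, Gi) along dim 4
y = permute(y, [1 2 4 3]);                 % H x W x 2 x N: each pair now adjacent
y = reshape(y, size(F,1), size(F,2), []);  % H x W x 2N: C1, G1, C2, G2, ..., CN, GN
```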

Many thanks!!