torch / torch7

http://torch.ch

regularization #1083

Open linhanxiao opened 7 years ago

linhanxiao commented 7 years ago

@michaelauli @jgehring The following is my loss-regularization code. I directly added the penalty term to `crit.output`; is that right for model regularization?

```lua
net:forward(sample.input)
crit:forward(net.output, sample.target)

local A
for _, b in ipairs(_G.model.selfattentivesoftmax) do
   A = b.output
end

local B = A:clone()
for i = 1, B:size(1) do
   A = B[i]:clone()
   local AAT = torch.mm(A, A:t())
   local I = torch.eye(A:size(1))
   local P = torch.norm(AAT - I, 2)
   local penal = P * P
   penal = penal / A:size(2)
   crit.output = crit.output + _G.model.selfattentivelamda * penal
end

crit:backward(net.output, sample.target)
net:backward(sample.input, crit.gradInput)
```
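For reference (my own restatement, not from the Torch docs): with `A` an `r x d` layer output and `d = A:size(2)`, the snippet computes the penalty

```latex
P(A) = \frac{1}{d}\,\lVert A A^{\top} - I \rVert_F^2,
\qquad
\frac{\partial P}{\partial A} = \frac{4}{d}\,\bigl(A A^{\top} - I\bigr)\,A ,
```

where the gradient formula uses the symmetry of `A*A:t() - I`. As far as I can tell, `crit:backward(net.output, sample.target)` recomputes `crit.gradInput` from its two arguments alone, so a scalar added to `crit.output` changes the reported loss but is never reflected in the gradients.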

In the document:

```lua
-- Loss:
f = f + opt.coefL1 * norm(parameters, 1)
f = f + opt.coefL2 * norm(parameters, 2)^2 / 2

-- Gradients:
gradParameters:add( sign(parameters):mul(opt.coefL1) + parameters:clone():mul(opt.coefL2) )
```
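For context, those lines come from a `feval`-style training closure as in the torch demos. A minimal sketch of that pattern (`net`, `crit`, `input`, `target`, and `opt.coefL1`/`opt.coefL2` are assumed to be defined elsewhere):

```lua
local params, gradParams = net:getParameters()

local function feval(x)
   if x ~= params then params:copy(x) end
   gradParams:zero()

   -- data loss and its gradient
   local output = net:forward(input)
   local f = crit:forward(output, target)
   net:backward(input, crit:backward(output, target))

   -- L1/L2 terms on the *whole* flattened parameter vector
   f = f + opt.coefL1 * torch.norm(params, 1)
   f = f + opt.coefL2 * torch.norm(params, 2)^2 / 2
   gradParams:add(torch.sign(params):mul(opt.coefL1) + params:clone():mul(opt.coefL2))

   return f, gradParams
end
```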

But my regularization is not L1/L2 regularization. In my code above, `A` is the output of one network layer, not the whole parameter vector, so what should I do to write the regularization code correctly?
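Would something along these lines be the right direction? A sketch of the penalty written as a pass-through module, modeled on `nn.L1Penalty` (the name `nn.FrobeniusPenalty` is my own, and it assumes a 2D `A`; where to insert it is also an assumption):

```lua
require 'nn'

-- Pass-through module that adds the gradient of
-- lambda * ||A*A^T - I||_F^2 / A:size(2) to the gradient flowing back.
local FrobeniusPenalty, parent = torch.class('nn.FrobeniusPenalty', 'nn.Module')

function FrobeniusPenalty:__init(lambda)
   parent.__init(self)
   self.lambda = lambda or 1
   self.loss = 0
end

function FrobeniusPenalty:updateOutput(input)
   -- input: an r x d attention matrix A; forward passes it through unchanged
   local diff = torch.mm(input, input:t()) - torch.eye(input:size(1)):typeAs(input)
   local P = torch.norm(diff, 2)                 -- Frobenius norm of A*A^T - I
   self.loss = self.lambda * P * P / input:size(2)
   self.output = input
   return self.output
end

function FrobeniusPenalty:updateGradInput(input, gradOutput)
   local diff = torch.mm(input, input:t()) - torch.eye(input:size(1)):typeAs(input)
   -- d/dA ||A*A^T - I||_F^2 = 4 * (A*A^T - I) * A, since A*A^T - I is symmetric
   local gradPenalty = torch.mm(diff, input):mul(4 * self.lambda / input:size(2))
   self.gradInput = torch.add(gradOutput, gradPenalty)
   return self.gradInput
end
```

Assuming the model is an `nn.Sequential`, inserting this right after the self-attentive softmax would let `net:backward(sample.input, crit.gradInput)` carry the penalty gradient automatically, and the total loss would be `crit.output` plus the module's `self.loss`.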