jotaf98 / curveball

Second-order optimiser for deep networks

Code bug in line 292 of CurveBall.m #2

Closed. Yolanda-Gao closed this issue 6 years ago

Yolanda-Gao commented 6 years ago

I was running the first example, "training('basic-curveball', 'solver', CurveBall(), 'learningRate', 1)", and hit an error at line 292 of CurveBall.m. The line "Hlx = p .* x - p .* px" fails because p and px don't have the same dimensions.

The code just before it, 'if ismatrix(p) px = sum(p .* x, 1); else % 4D tensor px = sum(p .* x, 3); end', does not look like a proper matrix multiplication p' x. According to the paper, we would like to have Hlx = p .* x - p * p' * x;
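For reference, a short derivation, assuming p is the softmax output for a single sample and the loss is softmax cross-entropy, so that the Hessian with respect to the logits is diag(p) - p p' (the form quoted from the paper in the comment). Under that assumption, the matrix form and an elementwise form built from a sum agree for one sample:

$$
\begin{aligned}
H_l x &= \left(\operatorname{diag}(p) - p\,p^\top\right) x \\
      &= p \odot x - p\,(p^\top x) \\
      &= p \odot x - p \sum_{i} p_i x_i .
\end{aligned}
$$

Here i runs over the classes of that one sample only. In MATLAB terms, the scalar p' * x for a sample equals sum(p .* x) taken along the class dimension, which is what sum(p .* x, 3) would compute if dimension 3 of the 4D tensor holds the classes (an assumption about the data layout).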

I changed this part to: "Hlx = zeros(size(x), 'gpuArray'); ptmp = squeeze(p); xtmp = squeeze(x); Hlx(1,1,:,:) = ptmp .* xtmp - ptmp * ptmp' * xtmp;" The code runs now. But during the run I get warnings that a matrix is singular or badly scaled, and after 10 epochs the error rate is still around 88%. I will update once training has finished.
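A minimal sketch for checking the two formulations numerically, under stated assumptions: p is a softmax output stored as 1 x 1 x C x N (classes along dimension 3, samples along dimension 4), the sizes C and N are hypothetical, and everything runs on the CPU. It compares a per-sample reference (diag(p) - p p') x against the sum form and against the batched matrix form. bsxfun is used in place of implicit expansion, which MATLAB only supports from R2016b onward; whether that is the cause of the dimension error reported here is a guess, not something confirmed in this thread.

```matlab
% Hypothetical sizes: C classes, N samples, laid out as 1 x 1 x C x N.
C = 5;  N = 3;
logits = randn(1, 1, C, N);
e = exp(logits);
p = bsxfun(@rdivide, e, sum(e, 3));   % softmax over the class dimension
x = randn(1, 1, C, N);                % vector to multiply by the Hessian

% Per-sample reference: (diag(p_n) - p_n * p_n') * x_n for each sample n.
Href = zeros(size(x));
for n = 1:N
  pn = reshape(p(1, 1, :, n), C, 1);
  xn = reshape(x(1, 1, :, n), C, 1);
  Href(1, 1, :, n) = pn .* xn - pn * (pn' * xn);
end

% Sum form, with bsxfun standing in for implicit expansion.
px = sum(p .* x, 3);                  % 1 x 1 x 1 x N: one p' * x per sample
Hsum = p .* x - bsxfun(@times, p, px);

% Batched matrix form: ptmp * ptmp' is C x C and sums over all N samples.
ptmp = reshape(p, C, N);
xtmp = reshape(x, C, N);
Hmat = reshape(ptmp .* xtmp - ptmp * ptmp' * xtmp, 1, 1, C, N);

errSum = max(abs(Hsum(:) - Href(:)));   % expected to be at rounding level
errMat = max(abs(Hmat(:) - Href(:)));   % generally not small when N > 1
fprintf('sum form: %g, batched matrix form: %g\n', errSum, errMat);
```

In this sketch the sum form matches the per-sample reference, while the batched form does not once N > 1: ptmp * ptmp' adds up the outer products of all samples, so each sample's result picks up terms from the others. That mixing might be related to the stalled error rate described in the comment, but this has not been verified against the actual training run.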

So do you know what is the issue here? Am I doing it the right way?

Yolanda-Gao commented 6 years ago

Sorry for the duplicate post. GitHub had a bug yesterday, and the first post did not show up at the time.