patrikhuber / superviseddescent

C++11 implementation of the supervised descent optimisation method
http://patrikhuber.github.io/superviseddescent/
Apache License 2.0
402 stars 188 forks source link

Bug suspicion #59

Open edubois opened 5 years ago

edubois commented 5 years ago

Hi Patrik,

I see something that looks a bit strange to me: https://github.com/patrikhuber/superviseddescent/blob/0179b4b69e3bc6e2bfdae51f3169414444290d63/include/superviseddescent/regressors.hpp#L135

The lambda gets updated, but don't you want this method to be const and use a local copy of the lambda value? Otherwise we are changing the datamember at each call to solve. This might mean it's going smaller and smaller.

Just want to make sure. What do you think?

patrikhuber commented 5 years ago

Hi Eloi,

Hmm! That's a good question. Just looking at the code /control flow, think I would actually agree with you and say that you've indeed discovered a bug. Nice! Just to make sure, I would step through the code in the debugger with a trivial example and observe the value of lambda and the regularisation matrix, and see whether what we think is happening from just reading the code is really happening "in practice" too.

If it does indeed decrease the lambda at each call to solve, it has been quite a while since I've looked at the original paper, but indeed I think that that's not what we want to do.

It's an interesting one - it would basically mean we've unknowingly decreased the regularisation further in each regressor. But that makes me think, in the RCR training, we set the lambda per-regressor: https://github.com/patrikhuber/superviseddescent/blob/0179b4b69e3bc6e2bfdae51f3169414444290d63/apps/rcr/rcr-train.cpp#L439-L443 And I think solve() is only ever called once per regressor level. Which would mean that this bug actually never occurs in practice, at least not until someone changes the code and uses it in a way that calls solve() more than once per regressor. It's really hard to tell, so as mentioned above, the best way would be to run the code and see whether what I think is correct :-)

edubois commented 5 years ago

Hi Patrick,

Thanks for your comments,

Actually I think your code is right for a bit of a tricky reason: the regularizer passed at this point: https://github.com/patrikhuber/superviseddescent/blob/0179b4b69e3bc6e2bfdae51f3169414444290d63/include/superviseddescent/regressors.hpp#L264

is passed by copy, so whatever happen to this object has no effect on the next iteration. It's a bit dangerous though.

patrikhuber commented 5 years ago

Right! So the lambda = lambda * ... line is performed only on the copy of the Regulariser. That makes sense. I still think though that what I wrote is also likely correct, even if we didn't pass the regulariser by value there, the bug wouldn't happen because solve() is only ever called once per regressor.

In any case, you're completely right, the best would be to change the Regulariser to work on a local copy of lambda and then pass the Regulariser by const-ref :-)

Thank you for reporting this!