Closed Badr-MOUFAD closed 2 years ago
In homotopy we should avoid computing alpha_max when we can (it is costly) Instead we should just avoid rescaling if the point is already feasible.
If we fit with alpha > alpha_max, then the first dual pt is y, the dual norm is alpha_max < alpha and so we should not rescale it.
I do agree with your suggestion.
However, implementing it (mine also) will imply making big changes to the code as we don't always compute alpha_max
(depends on whether it was passed or not), and computing it varies according to the solved the problem.
Given these new insights, shall we perform these changes to fix this [non-harmful] bug?
In homotopy.py, L296 and L305, we compute the dual norm of X.T theta and automatically rescale by it. We should L305 rescale only if it is greater than alpha. Otherwise, the point is feasible, no need to rescale.
There is no extra cost and the code modification is minor
Merging #257 (580cc14) into main (75e3a4c) will increase coverage by
0.09%
. The diff coverage is100.00%
.:exclamation: Current head 580cc14 differs from pull request most recent head b3416a0. Consider uploading reports for the commit b3416a0 to get more accurate results
@@ Coverage Diff @@
## main #257 +/- ##
==========================================
+ Coverage 88.44% 88.53% +0.09%
==========================================
Files 15 15
Lines 1134 1143 +9
Branches 136 136
==========================================
+ Hits 1003 1012 +9
Misses 101 101
Partials 30 30
Flag | Coverage Δ | |
---|---|---|
unittests | ∅ <ø> (∅) |
Flags with carried forward coverage won't be shown. Click here to find out more.
Impacted Files | Coverage Δ | |
---|---|---|
celer/homotopy.py | 86.89% <100.00%> (ø) |
|
celer/tests/test_lasso.py | 100.00% <100.00%> (ø) |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 75e3a4c...b3416a0. Read the comment docs.
@mathurinm, I will let you check with reproduce.py
that this solves the bug.
closes #252
We get such behavior because of how we construct a dual feasible vector.
Indeed, when
alpha > alpha_max
(let sayalpha = m * alpha_max
wherem > 1
) , we start the solver attheta = m * y / n_samples
, wich is feasible yet not optimal....
This proceeds as follows: