Closed mathurinm closed 2 years ago
Merging #63 (ea11649) into master (b8ce7d3) will increase coverage by 8.90%. The diff coverage is 67.84%.
@@ Coverage Diff @@
## master #63 +/- ##
==========================================
+ Coverage 56.19% 65.09% +8.90%
==========================================
Files 12 11 -1
Lines 1098 742 -356
Branches 242 117 -125
==========================================
- Hits 617 483 -134
+ Misses 406 228 -178
+ Partials 75 31 -44
Impacted Files | Coverage Δ | |
---|---|---|
andersoncd/tests/test_docstring_parameters.py | 73.91% <ø> (-0.73%) | :arrow_down: |
andersoncd/penalties.py | 44.76% <44.76%> (ø) | |
andersoncd/datafits.py | 52.23% <52.23%> (ø) | |
andersoncd/solver.py | 66.44% <66.44%> (ø) | |
andersoncd/data/synthetic.py | 72.41% <69.23%> (-18.50%) | :arrow_down: |
andersoncd/estimators.py | 92.39% <92.39%> (ø) | |
andersoncd/__init__.py | 100.00% <100.00%> (ø) | |
andersoncd/data/__init__.py | 100.00% <100.00%> (ø) | |
andersoncd/tests/test_estimators.py | 100.00% <100.00%> (ø) | |
andersoncd/utils.py | 28.76% <0.00%> (-4.11%) | :arrow_down: |
... and 1 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d83596...ea11649.
Investigated a bit further: at the iteration where the solver fails, kkt has only 82 nonzero values (I'm surprised).
Because of the growth policy, it turns out that we select 114 features in the subproblem. We are unlucky: among the 114 - 86 features with 0 kkt violation that we pick, there is one which is a 0 column, hence the failure.
It's wild that so many features have 0 kkt violation, no? I see the easy fix of selecting at most (kkt != 0).sum() features in the ws, but I'm surprised it's 0 for such a large number of features.
Your take on this @QB3?
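To make the failure mode concrete, here is a minimal toy sketch of what happens when the requested working set is larger than the number of features with a nonzero violation. The top-k selection via `np.argpartition` and the toy values are assumptions for illustration, not necessarily how the solver in this PR picks its working set.

```python
import numpy as np

# Toy kkt violations (made-up values): only 4 of the 10 features violate.
kkt = np.array([0., 0.7, 0., 0.3, 0., 0., 0.9, 0., 0.1, 0.])

ws_size = 6  # size requested by a hypothetical growth policy

# Assumed selection rule: keep the ws_size features with largest violation.
ws = np.argpartition(kkt, -ws_size)[-ws_size:]

# The working set is padded with features whose violation is exactly 0.
# Any of them could be an all-zero column of X.
n_zero_violation = int((kkt[ws] == 0).sum())
print(n_zero_violation)  # 6 requested - 4 violating = 2
```

Which zero-violation features end up in the padding is arbitrary, so whether a 0 column gets picked is a matter of luck.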
> I'm surprised it's 0 for such a large number of features
I also already observed that uncleaned rcv1 has a large number of zero columns.
+1 for (kkt != 0).sum()
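A minimal sketch of the proposed cap, assuming `kkt` is a nonnegative 1-D array of violations. The helper name `select_working_set` and the `np.argpartition` selection are hypothetical and only illustrate the idea; they are not the code merged in this PR.

```python
import numpy as np


def select_working_set(kkt, ws_size):
    """Hypothetical helper: pick at most ws_size features, all with kkt != 0."""
    # Cap the working set size by the number of violating features.
    ws_size = min(ws_size, int((kkt != 0).sum()))
    if ws_size == 0:
        # Nothing violates; also avoids the `[-0:]` slice returning everything.
        return np.array([], dtype=np.intp)
    # Indices of the ws_size largest violations (in no particular order).
    return np.argpartition(kkt, -ws_size)[-ws_size:]


kkt = np.array([0., 0.5, 0., 0.2, 0., 0.9])
ws = select_working_set(kkt, ws_size=5)
print(sorted(ws.tolist()))  # only the 3 violating features: [1, 3, 5]
```

With this cap, a feature with 0 violation (in particular a 0 column, under the assumption that its violation is 0) can never enter the working set.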
Something weird is happening: without this, the solver fails when X has a 0 column (which is expected, since there is a division by 0).
But such a column should not be selected in the WS, right?
Reproduce with