mathurinm closed this 3 years ago
I don't know how much more costly it is to do Xj @ (y - Xw) than Xj @ R
I would bet on an additional O(n) per update of coordinate descent. IMO the 2 questions become:
Merging #29 (ef00529) into master (b8ce7d3) will increase coverage by 7.10%. The diff coverage is 65.72%.
@@ Coverage Diff @@
## master #29 +/- ##
==========================================
+ Coverage 56.19% 63.29% +7.10%
==========================================
Files 12 11 -1
Lines 1098 613 -485
Branches 242 101 -141
==========================================
- Hits 617 388 -229
+ Misses 406 191 -215
+ Partials 75 34 -41
Impacted Files | Coverage Δ | |
---|---|---|
andersoncd/tests/test_docstring_parameters.py | 73.91% <ø> (-0.73%) | :arrow_down: |
andersoncd/penalties.py | 43.11% <43.11%> (ø) | |
andersoncd/solver.py | 61.53% <61.53%> (ø) | |
andersoncd/datafits.py | 64.70% <64.70%> (ø) | |
andersoncd/data/synthetic.py | 72.41% <69.23%> (-18.50%) | :arrow_down: |
andersoncd/__init__.py | 100.00% <100.00%> (ø) | |
andersoncd/data/__init__.py | 100.00% <100.00%> (ø) | |
andersoncd/estimators.py | 100.00% <100.00%> (ø) | |
andersoncd/tests/test_estimators.py | 100.00% <100.00%> (ø) | |
andersoncd/utils.py | 28.76% <0.00%> (-4.11%) | :arrow_down: |
... and 7 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6174070...ef00529.
WDYT of doing it properly and putting n_samples in the Lipschitz constants, in the gradient, etc.? IMO this will be easier for external contributors, and may save us some headaches without being more costly.
Seems better than what we currently have indeed. I propose to do it in another PR.
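To make the n_samples proposal concrete, here is a minimal sketch, assuming a quadratic datafit scaled as ||y - Xw||^2 / (2 * n_samples). The function names are hypothetical and are not the andersoncd API. With this scaling the same 1 / n_samples factor shows up in the value, in the gradient, and in the per-coordinate Lipschitz constants, so a step of size 1 / L_j stays an exact coordinate minimization.

```python
import numpy as np


def quadratic_value(X, y, w):
    """Datafit value: ||y - Xw||^2 / (2 * n_samples)."""
    n_samples = X.shape[0]
    return np.sum((y - X @ w) ** 2) / (2 * n_samples)


def coordinate_gradient(X, y, w, j):
    """Partial derivative of the scaled datafit with respect to w[j]."""
    n_samples = X.shape[0]
    return -X[:, j] @ (y - X @ w) / n_samples


def coordinate_lipschitz(X):
    """Per-coordinate Lipschitz constants: ||X_j||^2 / n_samples."""
    n_samples = X.shape[0]
    return np.sum(X ** 2, axis=0) / n_samples


# Small random problem, for illustration only.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))
y = rng.standard_normal(30)
w = rng.standard_normal(4)
lipschitz = coordinate_lipschitz(X)

# One coordinate step on coordinate j with step size 1 / L_j.
j = 2
w_new = w.copy()
w_new[j] -= coordinate_gradient(X, y, w, j) / lipschitz[j]
```

Since the factor appears in both the gradient and L_j, it cancels in the unpenalized update itself. It only changes how a penalty strength compares to the datafit.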
I guess we should maintain Xw and not R (for logreg for example). It may require storing Xj @ y in the datafit class to avoid recomputing it at every gradient step, but I am not sure how to handle it for CV or more generally when new X and y's are passed
I don't know how much more costly it is to do Xj @ (y - Xw) than Xj @ R
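The two bookkeeping options can be compared on a toy problem. The sketch below assumes unscaled least squares, 0.5 * ||y - Xw||^2, to keep the focus on what is stored. All names are hypothetical and nothing here is taken from the andersoncd code. It relies on the identity Xj @ (y - Xw) = Xj @ y - Xj @ Xw: if X.T @ y is cached once, each update needs a single O(n) dot product, as with Xj @ R. Without the cache, forming y - Xw adds one O(n) pass per update.

```python
import numpy as np


def cd_epoch_residual(X, w, R, lipschitz):
    """One coordinate descent pass maintaining R = y - X @ w in place."""
    for j in range(X.shape[1]):
        grad_j = -X[:, j] @ R  # single O(n) dot product
        old = w[j]
        w[j] -= grad_j / lipschitz[j]
        R -= (w[j] - old) * X[:, j]  # keep R in sync with w


def cd_epoch_Xw(X, Xty, w, Xw, lipschitz):
    """One coordinate descent pass maintaining Xw, with X.T @ y cached."""
    for j in range(X.shape[1]):
        # Xj @ (y - Xw) = Xty[j] - Xj @ Xw, so y is never touched here.
        grad_j = X[:, j] @ Xw - Xty[j]
        old = w[j]
        w[j] -= grad_j / lipschitz[j]
        Xw += (w[j] - old) * X[:, j]  # keep Xw in sync with w


# Small random problem, for illustration only.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = rng.standard_normal(50)
lipschitz = np.sum(X ** 2, axis=0)

w_r = np.zeros(5)
R = y.copy()  # residual at w = 0
w_xw = np.zeros(5)
Xw = np.zeros(50)  # X @ w at w = 0
Xty = X.T @ y  # cached once, valid only for this (X, y) pair

for _ in range(20):
    cd_epoch_residual(X, w_r, R, lipschitz)
    cd_epoch_Xw(X, Xty, w_xw, Xw, lipschitz)
```

The cached Xty is tied to one (X, y) pair, so it would have to be recomputed whenever new data are passed, which is the CV concern raised above.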