mathurinm / andersoncd

This code is no longer maintained. The codebase has been moved to https://github.com/scikit-learn-contrib/skglm. This repository only serves to reproduce the results of the AISTATS 2021 paper "Anderson acceleration of coordinate descent" by Quentin Bertrand and Mathurin Massias.
BSD 3-Clause "New" or "Revised" License

ENH: use Datafit class #29

Closed mathurinm closed 3 years ago

mathurinm commented 3 years ago

I guess we should maintain Xw rather than R (for logreg, for example). This may require storing Xj @ y in the datafit class to avoid recomputing it at every gradient step, but I am not sure how to handle that for CV, or more generally when new X and y are passed.
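To make the "store Xj @ y in the datafit class" idea concrete, here is a minimal sketch for a quadratic datafit; the class and method names below are purely illustrative and not the repository's actual API:

```python
import numpy as np


class QuadraticDatafit:
    """Hypothetical datafit for f(w) = ||y - Xw||^2 / (2 n). Illustrative only."""

    def initialize(self, X, y):
        # Precompute Xj @ y for all j once, so gradient steps avoid
        # recomputing it. This must be called again whenever new X and y
        # are passed (e.g. on each CV fold), which is the caveat above.
        self.Xty = X.T @ y
        self.n = len(y)

    def gradient_j(self, X, w, Xw, j):
        # grad_j = Xj @ (Xw - y) / n, using the cached Xj @ y
        return (X[:, j] @ Xw - self.Xty[j]) / self.n


rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = rng.standard_normal(50)
w = rng.standard_normal(5)

df = QuadraticDatafit()
df.initialize(X, y)
g = df.gradient_j(X, w, X @ w, 2)
# Matches the direct computation Xj @ (Xw - y) / n
assert np.isclose(g, X[:, 2] @ (X @ w - y) / 50)
```

With this design only Xw needs to be kept up to date during coordinate descent; the residual R is never formed.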

I don't know how much more costly it is to do Xj @ (y - Xw) than Xj @ R
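As a quick sanity check of the two options, a NumPy sketch (variable names are illustrative) comparing the coordinate-wise gradient computed from a maintained residual R against one computed from a maintained Xw:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
w = rng.standard_normal(p)

j = 3
Xw = X @ w   # maintained if we store Xw
R = y - Xw   # maintained if we store the residual R

# Option 1: maintain R; the gradient is one O(n) dot product
grad_from_R = -X[:, j] @ R
# Option 2: maintain Xw; one extra O(n) subtraction before the dot product
grad_from_Xw = -X[:, j] @ (y - Xw)

assert np.isclose(grad_from_R, grad_from_Xw)
```

Both expressions agree; the difference is one extra O(n) vector subtraction per coordinate update when Xw is maintained.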

QB3 commented 3 years ago

> I don't know how much more costly it is to do Xj @ (y - Xw) than Xj @ R

I would bet on an additional O(n) cost per coordinate descent update. IMO the two questions become:

codecov-commenter commented 3 years ago

Codecov Report

Merging #29 (ef00529) into master (b8ce7d3) will increase coverage by 7.10%. The diff coverage is 65.72%.


@@            Coverage Diff             @@
##           master      #29      +/-   ##
==========================================
+ Coverage   56.19%   63.29%   +7.10%     
==========================================
  Files          12       11       -1     
  Lines        1098      613     -485     
  Branches      242      101     -141     
==========================================
- Hits          617      388     -229     
+ Misses        406      191     -215     
+ Partials       75       34      -41     
Impacted Files | Coverage Δ
-------------- | ----------
andersoncd/tests/test_docstring_parameters.py | 73.91% <ø> (-0.73%) ↓
andersoncd/penalties.py | 43.11% <43.11%> (ø)
andersoncd/solver.py | 61.53% <61.53%> (ø)
andersoncd/datafits.py | 64.70% <64.70%> (ø)
andersoncd/data/synthetic.py | 72.41% <69.23%> (-18.50%) ↓
andersoncd/__init__.py | 100.00% <100.00%> (ø)
andersoncd/data/__init__.py | 100.00% <100.00%> (ø)
andersoncd/estimators.py | 100.00% <100.00%> (ø)
andersoncd/tests/test_estimators.py | 100.00% <100.00%> (ø)
andersoncd/utils.py | 28.76% <0.00%> (-4.11%) ↓
... and 7 more

Last update 6174070...ef00529.

QB3 commented 3 years ago

WDYT of doing it properly and putting n_samples in the Lipschitz constant, the gradient, etc.? IMO this will be easier for external contributors, and may save us some headaches without being more costly.

Seems better than what we currently have, indeed. I propose to do it in another PR.
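For a quadratic datafit, folding n_samples into both the gradient and the Lipschitz constant leaves the (unpenalized) coordinate update unchanged, which supports the "without being more costly" claim. A minimal check (names are illustrative; with a penalty, the regularization strength would have to be rescaled consistently too):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
w = rng.standard_normal(p)
j = 1
R = y - X @ w

# Unscaled convention: f(w) = ||y - Xw||^2 / 2
grad = -X[:, j] @ R
lipschitz = X[:, j] @ X[:, j]

# Scaled convention: f(w) = ||y - Xw||^2 / (2 * n_samples)
grad_n = grad / n
lipschitz_n = lipschitz / n

# The coordinate step grad / L is identical in both conventions,
# since the 1/n factors cancel.
assert np.isclose(grad / lipschitz, grad_n / lipschitz_n)
```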