QB3 / sparse-ho

Fast hyperparameter settings for non-smooth estimators:
http://qb3.github.io/sparse-ho
BSD 3-Clause "New" or "Revised" License
40 stars 15 forks

[MRG] Addition of Simplex SVR model and little fix on SVR #125

Closed Klopfe closed 2 years ago

Klopfe commented 2 years ago

Tests for SVR are still pretty slow, and this is probably due to the choice of stopping criterion based on tol * pobj0, where pobj0 is the initial primal objective. However, we are solving in the dual, and in the tests this leads to tol * pobj0 being smaller than 1e-18; the duality gap never reaches this tolerance, so the solver runs up to max_iter.
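To make the failure mode concrete, here is a minimal sketch of the pattern (names are illustrative, not the actual sparse-ho code):

# Sketch: a tolerance scaled by the initial primal objective can stall a dual solver.
def solve_dual(compute_dual_gap, pobj0, tol=1e-3, max_iter=10_000):
    # The stopping threshold is scaled by the initial primal objective.
    # If pobj0 is tiny, tol * pobj0 can fall below machine precision,
    # the duality gap never reaches it, and the loop runs until max_iter.
    threshold = tol * pobj0
    for it in range(max_iter):
        gap = compute_dual_gap(it)  # duality gap at iteration `it`
        if gap <= threshold:
            return it
    return max_iter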

I tried to lower the tol but then enet tests break.

All tests pass, though.

mathurinm commented 2 years ago

Thanks a lot @Klopfe ! I didn't read the diff for ssvr.py since it's quite large, do you think there is a way to factorize a bit with svr.py ?

When you say "I tried to lower the tol but then enet tests break", do you mean to increase the tol? Indeed 1e-15 is super low. Maybe we could increase it, and increase atol and rtol in the assertions?
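For instance, something along these lines (a hypothetical snippet with made-up values, not the existing tests): raise the solver tol and loosen the assertion accordingly.

import numpy as np

# Placeholder gradients standing in for the sparse-ho result and the reference.
grad_ho = np.array([0.12345678])
grad_ref = np.array([0.12345681])

# With a looser solver tolerance (e.g. 1e-12 instead of 1e-15), the
# assertion tolerances need to be loosened as well.
np.testing.assert_allclose(grad_ho, grad_ref, rtol=1e-5, atol=1e-7)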

Klopfe commented 2 years ago

Thanks for looking at it. Yes, I think there would be ways to factorize. I will try to do that next.

Klopfe commented 2 years ago

It's just a proposal, but SSVR could be a child class (in the object-oriented programming sense) of SVR, so that a decent number of functions would not need to be rewritten. What do you think?

codecov-commenter commented 2 years ago

Codecov Report

Merging #125 (185aad1) into master (27828ff) will increase coverage by 2.45%. The diff coverage is 89.78%.


@@            Coverage Diff             @@
##           master     #125      +/-   ##
==========================================
+ Coverage   79.43%   81.88%   +2.45%     
==========================================
  Files          41       42       +1     
  Lines        3029     3324     +295     
  Branches      318      349      +31     
==========================================
+ Hits         2406     2722     +316     
+ Misses        533      506      -27     
- Partials       90       96       +6     
Flag        Coverage Δ
unittests   ∅ <ø> (∅)

Flags with carried forward coverage won't be shown.

Impacted Files                   Coverage Δ
sparse_ho/algo/forward.py         66.25% <ø> (-0.42%) ↓
sparse_ho/models/ssvr.py          88.85% <88.85%> (ø)
sparse_ho/models/__init__.py     100.00% <100.00%> (ø)
sparse_ho/models/svr.py           92.94% <100.00%> (+12.30%) ↑
sparse_ho/tests/common.py        100.00% <100.00%> (ø)
sparse_ho/tests/cvxpylayer.py    100.00% <100.00%> (+12.50%) ↑
sparse_ho/tests/test_models.py    82.10% <100.00%> (+0.91%) ↑


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 27828ff...185aad1.

mathurinm commented 2 years ago

Yes, that makes a lot of sense! I don't know how much the two classes differ; alternatively, you may want a BaseSVR class that both SVR and SimplexSVR inherit from. As you see fit!
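Something like the following sketch, with hypothetical class and method names (not the current sparse-ho API), just to illustrate the structure:

class BaseSVR:
    """Logic shared by the two SVR variants (hypothetical names)."""

    def _compute_jac_common(self, X, y):
        # the parts of the Jacobian computation that are identical in both models
        ...

class SVR(BaseSVR):
    def compute_beta_jac(self, X, y):
        # standard SVR: reuse the shared part, add the unconstrained-specific pieces
        return self._compute_jac_common(X, y)

class SimplexSVR(BaseSVR):
    def compute_beta_jac(self, X, y):
        # simplex-constrained SVR: shared part plus the simplex-specific terms
        return self._compute_jac_common(X, y)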

Klopfe commented 2 years ago

Yes, I will see what's best. Thanks!

mathurinm commented 2 years ago

On master, tests run in 50 s; on this PR they take 134 s. I think we cannot merge as is. Do you have the same @Klopfe?

This branch:

================== 1 failed, 73 passed, 10 xfailed, 748 warnings in 134.08s (0:02:14) ==================

========================================= slowest 10 durations =========================================
46.59s call     sparse_ho/tests/test_models.py::test_check_grad_sparse_ho[algo1-svr-MSE]
28.29s call     sparse_ho/tests/test_models.py::test_check_grad_sparse_ho[algo1-ssvr-MSE]
12.17s call     sparse_ho/tests/test_models.py::test_beta_jac[ssvr]
5.88s call     sparse_ho/tests/test_models.py::test_beta_jac[svr]
3.85s call     sparse_ho/tests/test_models.py::test_check_grad_sparse_ho[algo1-enet-MSE]
3.62s call     sparse_ho/tests/test_models.py::test_beta_jac[enet]
3.42s call     sparse_ho/tests/test_models.py::test_beta_jac[wLasso]
3.16s call     sparse_ho/tests/test_models.py::test_check_grad_sparse_ho[algo1-svm-MSE]
2.80s call     sparse_ho/tests/test_models.py::test_beta_jac[logreg]
1.95s call     sparse_ho/tests/test_criterion.py::test_cross_val_criterion[wLasso-criterion2]

master:

=========== 1 failed, 60 passed, 15 xfailed, 515 warnings in 50.92s ============

Klopfe commented 2 years ago

Yeah, it runs in 100 seconds here. So it doubles the time. I will try to find a fix. I can play with the regularization hyperparameter and maybe lower the tol.

mathurinm commented 2 years ago

But is there a fundamental reason for it to be much slower than the tests for existing estimators?

Klopfe commented 2 years ago

Well, the optimization problem is twice as big as the SVM one, but it shouldn't take that long for our tests.

mathurinm commented 2 years ago

It may be interesting to isolate the slow test and run it in max verbosity mode to see visually what takes time
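For example, one way to do that from Python, assuming the test id from the durations report above and running from the repository root (the same flags work on the pytest command line):

import pytest

# Run only the slow parametrizations of test_check_grad_sparse_ho,
# verbosely and without capturing the solver output.
pytest.main([
    "sparse_ho/tests/test_models.py::test_check_grad_sparse_ho",
    "-k", "svr",         # matches the svr and ssvr parametrizations
    "-v", "-s",          # max verbosity, show solver prints
    "--durations=10",    # report the slowest calls
])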

Klopfe commented 2 years ago

Yes, I think I caught a bigger problem than just SVR. Hard to explain in writing though... It's related to warm start in the tests and the fact that the Jacobian sometimes does not need to be recomputed. When that happens, the stopping criterion for the Jacobian computation is never met and it runs up to max_iter_jac.

I pushed a temporary commit to print the problem. To see it, just execute test_models in ipython. Here it is for the Lasso (to show that it's not only SVR related).

I commented out lines 154-155 of implicit_forward.py as a temporary fix, but we might need a bigger fix in the warm start section.
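For reference, the kind of guard I have in mind looks roughly like this (a sketch with hypothetical names, not the actual implicit_forward.py code):

def maybe_update_jac(jac_old, update_jac, jac_residual, tol_jac, max_iter_jac=100):
    # If the warm-started Jacobian already satisfies the tolerance,
    # skip the loop entirely instead of iterating until max_iter_jac.
    if jac_old is not None and jac_residual(jac_old) <= tol_jac:
        return jac_old
    jac = jac_old
    for _ in range(max_iter_jac):
        jac = update_jac(jac)
        if jac_residual(jac) <= tol_jac:
            break
    return jac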

mathurinm commented 2 years ago

Nice catch! Since it's unrelated to SVR, any idea why this did not pop up before?

Is it OK for you to tackle that in another PR first? From a design point of view, I don't understand why we enter the recomputation of the Jacobian if we can use the old one; what am I missing?

Klopfe commented 2 years ago

Yes, I can tackle that in another PR. I have to understand why we rerun the same optimization problem twice with the same data and parameters, and maybe add a check somewhere.

I will clean the files for the test and try to work on the factorization in the next few days on this PR.

mathurinm commented 2 years ago

Just to be clear, I think it would be better to address the warm start issue before finishing this PR, so that we don't merge tests that are too long. My 2 cents, your call in the end.

Klopfe commented 2 years ago

agreed

mathurinm commented 2 years ago

After merging master, I still have some slow tests for ssvr, with algo0 and algo2. Could it be related to the same bug for other algos?

================================================== slowest 20 durations ==================================================
41.60s call     sparse_ho/tests/test_models.py::test_check_grad_sparse_ho[algo0-ssvr-MSE]
23.79s call     sparse_ho/tests/test_models.py::test_check_grad_sparse_ho[algo2-ssvr-MSE]
6.67s call     sparse_ho/tests/test_models.py::test_beta_jac[ssvr]
5.15s call     sparse_ho/tests/test_models.py::test_beta_jac[svr]
3.31s call     sparse_ho/tests/test_models.py::test_beta_jac[wLasso]
2.80s call     sparse_ho/tests/test_models.py::test_beta_jac[enet]

Klopfe commented 2 years ago

I will look into it.

Klopfe commented 2 years ago

@mathurinm can you check that the tests run fine now for you ?

mathurinm commented 2 years ago

40 s on my machine :1st_place_medal:

I don't know why, but two unrelated tests are very slow on GitHub Actions.

Klopfe commented 2 years ago

Yes, I saw that too. More than 30 seconds each. I don't know what affects these two tests. It happens with the line search.