First-order with momentum

d-monnet commented 9 months ago

Add the new algorithm fomo to JSOSolvers. First-Order Momentum model-based (FOMO) algorithm can be viewed as the Heavy-Ball algorithm embedded in a R2/first-order Trust-Region framework to ensure the convergence to a first-order critical point in the non-convex case. fomo computes steps in a gradient-related direction $d$ which includes the momentum contribution. The step is computed either in a R2 fashion: $s = -\alpha d$, or in a TR fashion $s = -\alpha d /||d||$, by choosing the associated backend. $\alpha$ is the inverse of the regularization parameter (usually denoted $\sigma$) in R2 context and the trust region radius in the TR context.

Tests and doc have been updated.

codecov[bot] commented 9 months ago

Codecov Report

Attention: Patch coverage is 87.50000% with 19 lines in your changes are missing coverage. Please review.

Project coverage is 88.77%. Comparing base (5c08163) to head (d5f409d). Report is 8 commits behind head on main.

Files	Patch %	Lines
src/fomo.jl	87.50%	19 Missing :warning:

Additional details and impacted files

```diff @@ Coverage Diff @@ ## main #250 +/- ## ========================================== + Coverage 88.48% 88.77% +0.28% ========================================== Files 7 7 Lines 1025 1096 +71 ========================================== + Hits 907 973 +66 - Misses 118 123 +5 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

github-actions[bot] commented 9 months ago

Package name	latest	stable
FletcherPenaltySolver.jl
Percival.jl

dpo commented 9 months ago

The code looks nice and clean, thank you. My main question is: if R2 is a special case of Fomo, I think we should

1) remove the current R2 2) add a new R2 “solver” that simply calls Fomo under the hood to help with the transition.

dpo commented 9 months ago

Also, please check the CI failures. In particular, allocation tests fail with Julia 1 and nightly on Linux and macOS: https://github.com/JuliaSmoothOptimizers/JSOSolvers.jl/actions/runs/7745561973/job/21121848905?pr=250#step:6:307

d-monnet commented 9 months ago

The code looks nice and clean, thank you. My main question is: if R2 is a special case of Fomo, I think we should
1. remove the current R2

2. add a new R2 “solver” that simply calls Fomo under the hood to help with the transition.

My only concern is that fomo allocates two more vector than R2, so it's not quite efficient in that respect. But yes, doing so would avoid having an extra implementation.

d-monnet commented 9 months ago

The code looks nice and clean, thank you. My main question is: if R2 is a special case of Fomo, I think we should
1. remove the current R2

2. add a new R2 “solver” that simply calls Fomo under the hood to help with the transition.

This is done with the last few commits. I made sure to keep the exact same R2 and R2Solver interfaces. However solve!() is not quite pretty now that it has to handle R2 specific case. @dpo I'll let you review the last modifications and share your thoughts. In particular, I avoid m and d memory allocations in FomoSolver when using R2 and I'd like to be sure it is implemented properly.

d-monnet commented 9 months ago

Also, please check the CI failures. In particular, allocation tests fail with Julia 1 and nightly on Linux and macOS: https://github.com/JuliaSmoothOptimizers/JSOSolvers.jl/actions/runs/7745561973/job/21121848905?pr=250#step:6:307

Looks like allocation test passes with the new R2 interface to FOMO backend (see last commits). The allocation problem can therefore be narrowed down to here https://github.com/JuliaSmoothOptimizers/JSOSolvers.jl/blob/85022e16a22b83c98abead21b23cc99b2e7fb959/src/fomo.jl#L313 and there https://github.com/JuliaSmoothOptimizers/JSOSolvers.jl/blob/85022e16a22b83c98abead21b23cc99b2e7fb959/src/fomo.jl#L327-L329 , but I don't see why allocation would occur here. I don't have a unix system, so it's not convenient for me to debug that.

dpo commented 9 months ago

The code looks nice and clean, thank you. My main question is: if R2 is a special case of Fomo, I think we should
1. remove the current R2

2. add a new R2 “solver” that simply calls Fomo under the hood to help with the transition.
My only concern is that fomo allocates two more vector than R2, so it's not quite efficient in that respect. But yes, doing so would avoid having an extra implementation.

You could have a constructor for an R2Solver that constructs a FOMOSolver but allocates those two vectors with length zero.

d-monnet commented 8 months ago

header is causing allocations and break related tests, I'll fix that rapidly. Also there are allocations (not due to header) occurring with Fomo, can't find where it comes from...

dpo commented 7 months ago

header is causing allocations and break related tests, I'll fix that rapidly. Also there are allocations (not due to header) occurring with Fomo, can't find where it comes from...

Have you tried https://docs.julialang.org/en/v1/manual/profile/#Line-by-Line-Allocation-Tracking ?

dpo commented 7 months ago

Thank you, @d-monnet !!!

JuliaSmoothOptimizers / JSOSolvers.jl

First-order with momentum #250

Codecov Report