rust-cv / levenberg-marquardt

Provides abstractions to run Levenberg-Marquardt optimization
MIT License
48 stars 15 forks source link

Configuration questions #14

Open ChristopherRabotin opened 2 years ago

ChristopherRabotin commented 2 years ago

Hi there,

First of all, thanks for implementing this algorithm. I look forward to being able to use it correctly.

In my application, I'm currently using a Gauss-Newton (sometimes called Newton-Raphson) algorithm to solve for a single-shooting problem. A good description of the problem is provided at the start of this video: https://www.youtube.com/watch?v=4qFlaqCsnQA . The mathematical specifications of my implementation is available here: https://nyxspace.com/MathSpec/optimization/targeting/ .

I'm trying to switch from Gauss-Newton to Levenberg, using your library. Although in theory Gauss-Newton is not great with a bad initial guess, in my application it works decently well. Levenberg Marquardt is theoretically more resilient to bad initial guesses. However, your LM library algorithm is always taking tiny steps to guess what the correct solution (\vec{x}) could be. So, I suspect I'm incorrectly configuring it.

It converges on the correct solution in 8 evaluations if I provide it with the exact solution of the Gauss Newton algorithm. Otherwise, it'll try incredibly small changes in the solutions and assume that it has minimized the function... but the residuals are just as large as when it started the problem (and nowhere near zero).

I've provided a detailed example below, but my question straightforward: Is there a way to tell the LM structure to only converge when the residuals are zero and ignore ftol and xtol entirely?

Thanks

Example

Gauss Newton solution

The correct solution that minimizes the problem is the following Vector3<f64>:

  ┌                     ┐
  │  0.6887498158650465 │
  │ -1.9563017351074758 │
  │  -0.560405774185829 │
  └                     ┘

Full log of the solution:

 INFO  nyx_space::md::opti::raphson_finite_diff > Targeter -- CONVERGED in 12 iterations
 INFO  nyx_space::md::opti::raphson_finite_diff >       SMA: achieved = 8100.00000       desired = 8100.00000    scaled error =    0.00000
 INFO  nyx_space::md::opti::raphson_finite_diff >       Eccentricity: achieved =    0.40000      desired =    0.40000    scaled error =   -0.00000
 INFO  nyx_space::md::opti::raphson_finite_diff >       Inclination: achieved =   28.52611       desired =   28.52611    scaled error =    0.00000
FD: Targeter solution correcting ["VelocityX", "VelocityY", "VelocityZ"] (converged in 0.032 seconds, 12 iterations):
        Correction @ 2020-01-01T00:00:00 UTC:
                VelocityX = 0.689 m/s
                VelocityY = -1.956 m/s
                VelocityZ = -0.560 m/s
                |Δv| = 2148.383 m/s
        Achieved @ 2020-01-01T00:59:20.540817260 UTC:
                SMA = 8100.000 (wanted 8100.000 ± 1.0e-3)
                Eccentricity = 0.400 (wanted 0.400 ± 1.0e-5)
                Inclination = 28.526 (wanted 28.526 ± 1.0e-1)
        Corrected state:
                total mass = 100.000000kg  @  [Earth J2000] 2020-01-01T00:00:00 UTC     position = [3835.382907, -7756.921938, -4156.921938] km velocity = [5.345645, 1.118432, -2.001254] km/s
                total mass = 100.000000kg  @  [Earth J2000] 2020-01-01T00:00:00 UTC     sma = 8100.000000 km    ecc = 0.400000  inc = 28.526111 deg     raan = 54.206044 deg    aop = 108.326570 deg    ta = 136.729436 deg
        Achieved state:
                total mass = 100.000000kg  @  [Earth J2000] 2020-01-01T00:59:20.540817260 UTC   position = [7266.068236, 5249.287121, -1534.719097] km  velocity = [-4.534582, 3.021172, 2.959670] km/s
                total mass = 100.000000kg  @  [Earth J2000] 2020-01-01T00:59:20.540817260 UTC   sma = 8100.000000 km    ecc = 0.400000  inc = 28.526111 deg     raan = 54.206044 deg    aop = 108.326570 deg    ta = 230.979694 deg

Initial params at zero

If the initial params is zero, then it'll "converge" claiming that the ftol solution: MinimizationReport { termination: Converged { ftol: true, xtol: false }, number_of_evaluations: 26, objective_function: 5001.106174351325 } .

First attempted control (the "achieved, desired, error" message shows the actual targets of the problem):

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = 7903.45074       desired = 8100.00000    scaled error =  196.54926
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    0.21490      desired =    0.40000    scaled error =    0.18510
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   30.44637       desired =   28.52611    scaled error =   -1.92026
[src/md/opti/minimize_lm.rs:634] converged = false
ctrl = 
  ┌                       ┐
  │ -0.007939256315236743 │
  │  -0.01202376661064193 │
  │  0.025652932101011633 │
  └                       ┘

And 22nd attempt: this shows that we're back at the initial error in residuals.

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = 8000.00000       desired = 8100.00000    scaled error =  100.00000
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    0.20000      desired =    0.40000    scaled error =    0.20000
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   30.00000       desired =   28.52611    scaled error =   -1.47389
[src/md/opti/minimize_lm.rs:634] converged = false
ctrl = 
  ┌                                     ┐
  │ -0.00000000000000003901171501256353 │
  │ -0.00000000000000005908364492440542 │
  │   0.0000000000000001260660433615449 │
  └                                     ┘

 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityX (element 0): -0.00000000000000003901171501256353
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityY (element 1): -0.00000000000000005908364492440542
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityZ (element 2): 0.0000000000000001260660433615449

Final attempt:

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = 8000.00000       desired = 8100.00000    scaled error =  100.00000
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    0.20000      desired =    0.40000    scaled error =    0.20000
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   30.00000       desired =   28.52611    scaled error =   -1.47389
[src/md/opti/minimize_lm.rs:634] converged = false
MinimizationReport { termination: Converged { ftol: true, xtol: false }, number_of_evaluations: 26, objective_function: 5001.106174351325 }
Result correction: 
  ┌   ┐
  │ 0 │
  │ 0 │
  │ 0 │
  └   ┘

                0 km/s

Initial params set to some random but wrong solution

24 evaluations, no meaningful progress. Initial parameters set to [1.5, 1.5 ,1.5].

First attempt:

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = -14499.72096     desired = 8100.00000    scaled error = 22599.72096
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    1.57039      desired =    0.40000    scaled error =   -1.17039
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   27.34254       desired =   28.52611    scaled error =    1.18357
[src/md/opti/minimize_lm.rs:634] converged = false
ctrl = 
  ┌                    ┐
  │ 2.1167838430947192 │
  │ 1.9808284453966405 │
  │  4.069698058817354 │
  └                    ┘

 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityX (element 0): 2.1167838430947192
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityY (element 1): 1.9808284453966405
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityZ (element 2): 4.069698058817354

Some attempt in the middle:

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = 16469.24599      desired = 8100.00000    scaled error = -8369.24599
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    0.44310      desired =    0.40000    scaled error =   -0.04310
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   25.96224       desired =   28.52611    scaled error =    2.56387
[src/md/opti/minimize_lm.rs:634] converged = false
ctrl = 
  ┌                    ┐
  │ 1.5000000000005078 │
  │ 1.5000000000006835 │
  │  1.500000000052811 │
  └                    ┘

 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityX (element 0): 1.5000000000005078
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityY (element 1): 1.5000000000006835
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityZ (element 2): 1.500000000052811

Last attempt:

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = 16469.24599      desired = 8100.00000    scaled error = -8369.24599
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    0.44310      desired =    0.40000    scaled error =   -0.04310
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   25.96224       desired =   28.52611    scaled error =    2.56387
[src/md/opti/minimize_lm.rs:634] converged = false
MinimizationReport { termination: Converged { ftol: false, xtol: true }, number_of_evaluations: 24, objective_function: 35022142.535638005 }
Result correction: 
  ┌     ┐
  │ 1.5 │
  │ 1.5 │
  │ 1.5 │
  └     ┘

                2.598076211353316 km/s
vadixidav commented 2 years ago

@jannschu do you think you could give some advice on this?

My naive guess is that perhaps the trust region is too small and it is choosing to bias more towards gradient descent than Gauss-Newton. I haven't been using this library recently, but my guess would be to increase the trust region size. When the damping factor is zero, Leverberg-Marquardt behaves like Gauss-Newton, and when it is high it behaves like gradient descent. My guess is that the damping factor is not getting set low enough at some point, but I don't know how to configure the logic for converging and backing off the damping factor. I defer to @jannschu, as they know much more about how the current implementation works (they ported it from MINPACK).

jannschu commented 2 years ago

Is this problem a 3D version of the beacon finding problem shown in the video? Is the source code already somewhere?

The tiny step sizes might indicate a wrong Jacobian. How do you compute it? If your problem is reasonably fast to to port to Python or Matlab you could check if the reference implementation from MINPACK gives different results. Should that be the case I would consider this a bug.

If given a correct solution as initial parameters the algorithm should terminate immediately since the gradient should be zero.

To answer your question regarding ignoring ftol and gtol: You may set them to zero, but the algorithm will still terminate if the predicted or actual reduction is below the the machine precision.

ChristopherRabotin commented 2 years ago

I don't know what is the beacon searching problem. The problem I'm solving is orbital targeting for spacecraft maneuver design.

For the time being, the gradient is computed by finite differencing. I know it to be correct because the same gradient is computed for the Newton Raphson method, and there are at least a dozen validation scenarios of that implementation compared to a leading NASA software that solves the same problem.

On Mon, Jan 3, 2022, 09:15 jannschu @.***> wrote:

Is this problem a 3D version of the beacon finding problem shown in the video? Is the source code already somewhere?

The tiny step sizes might indicate a wrong Jacobian. How do you compute it? If your problem is reasonably fast to to port to Python or Matlab you could check if the reference implementation from MINPACK gives different results. Should that be the case I would consider this a bug.

If given a correct solution as initial parameters the algorithm should terminate immediately since the gradient should be zero.

To answer your question regarding ignoring ftol and gtol: You may set them to zero, but the algorithm will still terminate if the predicted or actual reduction is below the the machine precision.

— Reply to this email directly, view it on GitHub https://github.com/rust-cv/levenberg-marquardt/issues/14#issuecomment-1003928225, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABEZV2AYOYUHVABFYGKZCRTUUFLIHANCNFSM5KMFXILA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you authored the thread.Message ID: @.***>

jannschu commented 2 years ago

You said a good description of the problem you are solving is given at the start of the linked YouTube video. There I can only find a problem where the location of a transmitter shall be found through distances to beacons. The other page only seems to contain a high level list of things you want to control.

I also looked in the nyx_space repository, but I could not find the code you are working on.

In order to proceed I would suggest to provide a minimal working example and if possible a comparison with MINPACK. A description of the least-squares problem would also be helpful.

It converges on the correct solution in 8 evaluations if I provide it with the exact solution of the Gauss Newton algorithm.

This is, as I said, quite suspicious. Other basic things to check would be that the gradient for the correct x is computed, that is in the right form (matches residuals), and that it was not multiplied with -1. More generally, your optimization problem must be correctly mapped to the form this library expects. But it seems you already spent quite some time on that.

It also seems you want to find a root. Modeling this as a least squares problem might give you local minima.

Since there is no description of the least-squares description of the problem you want to solve (unless I spent hours digging into orbital targeting, which is new to me) , there is no minimal code for me to work with, there is no comparison with other LM implementations where it does work, and your issue only contains examples of iterations which are meaningless to me, and that I answered the question on xtol and ftol it seems that was the best we can do here. Sorry, I feel your frustrations :-(

ChristopherRabotin commented 2 years ago

Ah, yes, now I see. Sorry, I'm still on vacation and my brain isn't working, so I had forgotten that I added a link to that video. But yes, the problem is analogous in theory. I try to find a solution to my control where the targets are met: I'm formulating that as a root finding problem. The residuals will go to zero when the control is approaching the solution. Is there another way to formulate this problem without manually rewriting Lagrange multipliers and such for every new input problem?

The code currently has a whole new implementation to integrate with your library but I've been wanting to consolidate it with the Newton Raphson implementation since the residuals and Jacobian are identical. The LM version is available here: https://gitlab.com/nyx-space/nyx/-/blob/128-finite-burn-targeting/src/md/opti/minimize_lm.rs (128-finite-burn-targeting branch).

I'll work on this probably next week, and that will help me write a simple reproducible example.

I must say that your api is great: using a trait is by far the cleanest method and I wish many other libs did the same (instead of assuming the dimensions of the function to minimize).

On Tue, Jan 4, 2022, 08:47 jannschu @.***> wrote:

You said a good description of the problem you are solving is given at the start of the linked YouTube video. There I can only find a problem where the location of a transmitter shall be found through distances to beacons. The other page only seems to contain a high level list of things you want to control.

I also looked in the nyx_space repository, but I could not find the code you are working on.

In order to proceed I would suggest to provide a minimal working example and if possible a comparison with MINPACK. A description of the least-squares problem would also be helpful.

It converges on the correct solution in 8 evaluations if I provide it with the exact solution of the Gauss Newton algorithm.

This is, as I said, quite suspicious. Other basic things to check would be that the gradient for the correct x is computed, that is in the right form (matches residuals), and that it was not multiplied with -1. More generally, your optimization problem must be correctly mapped to the form this library expects. But it seems you already spent quite some time on that.

It also seems you want to find a root. Modeling this as a least squares problem might give you local minima.

Since there is no description of the least-squares description of the problem you want to solve (unless I spent hours digging into orbital targeting, which is new to me) , there is no minimal code for me to work with, there is no comparison with other LM implementations where it does work, and your issue only contains examples of iterations which are meaningless to me, and that I answered the question on xtol and ftol it seems that was the best we can do here. Sorry, I feel your frustrations :-(

— Reply to this email directly, view it on GitHub https://github.com/rust-cv/levenberg-marquardt/issues/14#issuecomment-1004589651, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABEZV2E5SCRWPM3FZRH4JPLUUKQXRANCNFSM5KMFXILA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you authored the thread.Message ID: @.***>