donboyd5 closed this issue 10 months ago.
Minimal reproducible example:
Setup -- I could not get show_trace to work, so I used a callback function instead:
import Mads
callbacksucceeded = false  # global flag; the callback flips it to true when invoked

function callback(x_best::AbstractVector, of::Number, lambda::Number)
    global callbacksucceeded
    callbacksucceeded = true
    # println("The callback function was called: $x_best, $of, $lambda")
    println(of, " ", lambda)  # print the objective value and lambda at each iteration
end
ndim = 200
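As a quick sanity check (assuming Mads.makerosenbrock returns the residual vector whose squares the solver sums), the starting objective can be computed directly and should match the 199.0 on the first trace line below:
f = Mads.makerosenbrock(ndim)       # residual function for the extended Rosenbrock problem
println(sum(abs2, f(zeros(ndim))))  # sum of squared residuals at the starting point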
Here are the results of the initial run with arguments from an online example (plus callback). It stopped after 5 iterations, finding a minimum of 194.209:
results = Mads.levenberg_marquardt(Mads.makerosenbrock(ndim), Mads.makerosenbrock_gradient(ndim), zeros(ndim),
    lambda_mu=2.0, np_lambda=10, show_trace=true, maxJacobians=1000, callbackiteration=callback)
199.0 NaN
198.23585672156162 1.616
196.02336056127456 3.232
195.53487943980062 3.232
194.87083365351668 1.616
194.20908763543298 1.616
OptimBase.MultivariateOptimizationResults{LsqFit.LevenbergMarquardt, Float64, 1}(LsqFit.LevenbergMarquardt(), [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 … 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.8354459560429608, 0.6914560478488004, 0.45997878568603756, 0.18070529280425973, 0.026133524627512415, 0.010552791398802898, 0.010215197855675854, 0.010208551538302487, 0.010208420950213442, 0.010208418384502472 … 0.010208418333082964, 0.010208418333082645, 0.01020841833306728, 0.010208418332321646, 0.010208418296146864, 0.010208416541523314, 0.01020833145115144, 0.01020420555056359, 0.010004084362008978, 0.00010008148412392112], 194.20908763543298, 5, true, false, 0.0001, 0.0, false, 0.001, 0.0, false, 1.0e-6, 0.0, false, Iter Function value Gradient norm
------ -------------- --------------
, 1051, 5, 0)
Next, rerunning with initial values set to the minimizer above brings the found optimum down to 191.6:
results = Mads.levenberg_marquardt(Mads.makerosenbrock(ndim), Mads.makerosenbrock_gradient(ndim), results.minimizer,
    lambda_mu=2.0, np_lambda=10, show_trace=false, maxJacobians=1000, callbackiteration=callback)
194.20908763543298 NaN
193.59386062313567 2.3379566915413843
193.08554483140267 2.3379566915413843
192.58941336193575 2.3379566915413843
192.09289137376737 2.3379566915413843
191.5973596471983 2.3379566915413843
OptimBase.MultivariateOptimizationResults{LsqFit.LevenbergMarquardt, Float64, 1}(LsqFit.LevenbergMarquardt(), [0.8354459560429608, 0.6914560478488004, 0.45997878568603756, 0.18070529280425973, 0.026133524627512415, 0.010552791398802898, 0.010215197855675854, 0.010208551538302487, 0.010208420950213442, 0.010208418384502472 … 0.010208418333082964, 0.010208418333082645, 0.01020841833306728, 0.010208418332321646, 0.010208418296146864, 0.010208416541523314, 0.01020833145115144, 0.01020420555056359, 0.010004084362008978, 0.00010008148412392112], [0.9672626705833888, 0.9355746234725513, 0.875032976171471, 0.7642386844976807, 0.5789582894082582, 0.32372635302649344, 0.09542746097178234, 0.015550140848797377, 0.010319742254332553, 0.01021059600578051 … 0.010208423834357183, 0.010208423834356866, 0.010208423834341458, 0.010208423833594406, 0.010208423797378192, 0.010208422041654367, 0.010208336925996644, 0.010204210572704937, 0.010004085062783958, 0.00010008171794335949], 191.5973596471983, 5, true, false, 0.0001, 0.0, false, 0.001, 0.0, false, 1.0e-6, 0.0, false, Iter Function value Gradient norm
------ -------------- --------------
, 1051, 5, 0)
Additional runs, each started from the most recent minimizer, show continued improvement; a minimal restart loop that reproduces this is sketched below.
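This sketch assumes the same arguments as the runs above; the restart count of 10 is arbitrary:
results = Mads.levenberg_marquardt(Mads.makerosenbrock(ndim), Mads.makerosenbrock_gradient(ndim), zeros(ndim),
    lambda_mu=2.0, np_lambda=10, show_trace=false, maxJacobians=1000, callbackiteration=callback)
for i in 1:10
    global results = Mads.levenberg_marquardt(Mads.makerosenbrock(ndim), Mads.makerosenbrock_gradient(ndim), results.minimizer,
        lambda_mu=2.0, np_lambda=10, show_trace=false, maxJacobians=1000, callbackiteration=callback)
    println(i, ": ", results.minimum)  # the objective decreases a little on every restart
end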
Changing tolerances does not seem to lead to success in a single run, but I am not sure I know how to set tolerances properly:
results = Mads.levenberg_marquardt(Mads.makerosenbrock(ndim), Mads.makerosenbrock_gradient(ndim), zeros(ndim),
    tolOF=1e-99, tolX=1e-99, tolG=1e-99,
    lambda_mu=2.0, np_lambda=10, show_trace=false, maxJacobians=1000, callbackiteration=callback)
199.0 NaN
198.23585672156162 1.616
196.02336056127456 3.232
195.53487943980062 3.232
194.87083365351668 1.616
194.20908763543298 1.616
OptimBase.MultivariateOptimizationResults{LsqFit.LevenbergMarquardt, Float64, 1}(LsqFit.LevenbergMarquardt(), [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 … 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.8354459560429608, 0.6914560478488004, 0.45997878568603756, 0.18070529280425973, 0.026133524627512415, 0.010552791398802898, 0.010215197855675854, 0.010208551538302487, 0.010208420950213442, 0.010208418384502472 … 0.010208418333082964, 0.010208418333082645, 0.01020841833306728, 0.010208418332321646, 0.010208418296146864, 0.010208416541523314, 0.01020833145115144, 0.01020420555056359, 0.010004084362008978, 0.00010008148412392112], 194.20908763543298, 5, true, false, 1.0e-99, 0.0, false, 1.0e-99, 0.0, false, 1.0e-99, 0.0, false, Iter Function value Gradient norm
------ -------------- --------------
, 1051, 5, 0)
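If I read the tail of the printed results correctly, the trailing ", 1051, 5, 0" would be the f_calls, g_calls, and h_calls counters from OptimBase (an assumption on my part), which suggests the run is stopping on an evaluation budget rather than on the tolerances:
println(results.f_calls)  # 1051 function evaluations
println(results.g_calls)  # 5 Jacobian evaluations, matching the 5 iterations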
I apologize if I am missing something simple. I would much appreciate any advice on settings or other adjustments that would lead Mads.levenberg_marquardt to achieve success in a single run.
Thanks!
Here is the warning I saw when I first imported Mads:
(process:45304): GLib-GIO-WARNING **: 08:14:53.323: Unexpectedly, UWP app `Clipchamp.Clipchamp_2.3.2.0_neutral__yxz26nhyzhsrt' (AUMId `Clipchamp.Clipchamp_yxz26nhyzhsrt!App') supports 46 extensions but has no verbs
Here is the error I saw when running Mads.test(). It seems to be related to plotting and not to be the source of the problem I am running into, but I am not sure.
# ERROR: LoadError: MethodError: no method matching layer(::Vector{Gadfly.Geom.LineGeometry}, ::Vector{Gadfly.Geom.LineGeometry}, ::Vector{Gadfly.Geom.LineGeometry}, ::Vector{Gadfly.Geom.LineGeometry}, ::Vector{Gadfly.Geom.LineGeometry}; x=1:4, y=[0.9724953537852915, 0.955955284443903, 0.9008697445870358, 0.6321582462815516])
# Closest candidates are:
# layer(::Any, ::Union{Function, Gadfly.Element, Gadfly.Theme, Type}...; mapping...) at C:\Users\donbo\.julia\packages\Gadfly\B5yQc\src\Gadfly.jl:169
# Stacktrace:
# [1] plotseries(X::Matrix{Float64}, filename::String; nT::Int64, nS::Int64, format::String, xtitle::String, ytitle::String, title::String, logx::Bool, logy::Bool, keytitle::String, name::String, names::Vector{String}, combined::Bool, hsize::Measures.AbsoluteLength, vsize::Measures.AbsoluteLength, linewidth::Measures.AbsoluteLength, linestyle::Symbol, pointsize::Measures.AbsoluteLength, key_position::Symbol, major_label_font_size::Measures.AbsoluteLength, minor_label_font_size::Measures.AbsoluteLength, dpi::Int64, colors::Vector{String}, opacity::Float64, xmin::Nothing, xmax::Nothing, ymin::Nothing, ymax::Nothing, xaxis::UnitRange{Int64}, plotline::Bool, plotdots::Bool, firstred::Bool, lastred::Bool, nextgray::Bool, code::Bool,
# returnplot::Bool, colorkey::Bool, background_color::Nothing, gm::Vector{Any}, gl::Vector{Any}, quiet::Bool, truth::Bool)
# @ Mads C:\Users\donbo\.julia\packages\Mads\ZVE7t\src\MadsPlot.jl:1213
# [2] top-level scope
# @ C:\Users\donbo\.julia\packages\Mads\ZVE7t\test\miscellaneous.jl:174
# in expression starting at C:\Users\donbo\.julia\packages\Mads\ZVE7t\test\miscellaneous.jl:174
# in expression starting at C:\Users\donbo\.julia\packages\Mads\ZVE7t\test\runtests.jl:35
The new push resolves the plotting error above.
@donboyd5
In the examples above, the optimization was terminated early because of an imposed limit on the number of evaluations. Try this:
results = Mads.levenberg_marquardt(Mads.makerosenbrock(ndim), Mads.makerosenbrock_gradient(ndim), results.minimizer,
    lambda_mu=0.1, np_lambda=10, show_trace=true, maxJacobians=10000, callbackiteration=callback, maxEval=1000000)
Thanks! I don't know how I missed those arguments. It ran for 290 iterations and found a minimum (within tolerances). I am eager to try this out on some challenging problems I need to solve.
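As a rough sanity check (the extended Rosenbrock minimum is at x = ones(ndim) with objective 0), something like this confirms the converged run:
println(results.minimum)                        # should be near 0 within the tolerances
println(maximum(abs.(results.minimizer .- 1)))  # distance from the known optimum at ones(ndim)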
Meanwhile, the new version I used (github master -- v1.3.2) failed the blind_source_separation tests. I've copied the relevant terminal output into the next comment, FYI.
Now these issues are fixed.
Hi. Thank you for this amazing software.
This post is background information; the next post has reproducible code and results.
I am hoping you might advise on settings or other adjustments to resolve the problem I describe below.
I am not an optimization expert. I am seeking to minimize, in Julia, the sum of squared residuals for a relatively large problem (1,000 residuals based on 1,000 parameters plus external data). Every residual depends on every parameter (I believe this means the Jacobian is completely dense). It is not practical to compute the Jacobian analytically, but automatic differentiation works well; a rough sketch of the pattern is below.
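For illustration only, here is a sketch of that setup, using the Rosenbrock residual function from the next post as a stand-in for my real residual function and ForwardDiff for the Jacobian (my actual code differs):
import Mads
import ForwardDiff
ndim = 1000                        # my problem has 1,000 parameters
f = Mads.makerosenbrock(ndim)      # stand-in for my real residual function
g(x) = ForwardDiff.jacobian(f, x)  # dense Jacobian via forward-mode automatic differentiation
results = Mads.levenberg_marquardt(f, g, zeros(ndim))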
LsqFit.lmfit works reasonably well on this problem but is slow. After an extensive search, I found Mads; surprisingly, there doesn't seem to be much mention of it in various Julia discussions. As far as I can tell, Mads installed properly (latest release, and current master), although there was one seemingly small warning when I first imported Mads and one error when running Mads.test() - I'll copy the messages into the 2nd post below.
I found Mads.levenberg_marquardt to be amazingly fast on test versions of my problem, but when I used my more challenging actual problems I noticed that it was not finding the optimum value that LsqFit.lmfit was finding. I adjusted many parameters but found that it was almost always stopping after 3 iterations no matter what I did.
Then I reran Mads.levenberg_marquardt successively with starting values from the prior run and found that the objective function improved with each successive run, though it still ran for only 3 iterations each time. After much head-scratching, I tried the same thing with a Rosenbrock example and found it to happen with that, too - see the next post.