Wrong solution return on :lm and :lmdif method (res.x returns solution multiplied by sqrt(2))

jibaneza commented 3 years ago

Hi,

I am using Windows 10, Julia 1.5.3 and MINPACK v1.1.1. If I run the following code:

using MINPACK
function f!(fvec, x)
    println(x)
    fvec[1] = x[1] + 3.0
    fvec[2] = x[2] - 1.0
    fvec[3] = x[1] + x[2] + 2.0
    fvec
end;
res = fsolve(f!, [-2.1, 1.1], 3;method=:lmdif)
res.x

f! has an exact solution at [-3.0, 1.0]. I can see that the package does find the solution, since the println(x) statement inside f! function prints the correct values at the end. For some reason, the return of res.x is [-4.242640683506698, 1.4142135547100316], which correponds to the exact solution multiplied by sqrt(2).

The trace printed: 
[-2.1, 1.1]
[-2.0999999687075617, 1.1]
[-2.1, 1.1000000163912773]
[-3.000000002554485, 0.9999999994581396]
[-2.9999999578510015, 0.9999999994581396]
[-3.000000002554485, 1.0000000143593006]
[-3.0, 1.0]

the result shows:
Results of Nonlinear Solver Algorithm
 * Algorithm: Levenberg-Marquardt (expert)
 * Starting Point: [-2.1, 1.1]
 * Zero: [-4.242640683506698, 1.4142135547100316]
 * Inf-norm of residuals: 0.000000
 * Convergence: true
 * Message: relative error between two consecutive iterates is at most xtol
 * Total time: 0.013000 seconds
 * Function Calls: 0
 * Jacobian Calls (df/dx): 0

I've tried to find the error but could not. It might be inside the compiled dll.

sglyon commented 3 years ago

wow that's very strange and not good.

I'm surprised this hasn't come up anywhere else (including the tests). Do you have a proposed fix?

jibaneza commented 3 years ago

I think it should be somethign rather trivial inside the source code of C-MINPACK. It looks like the original MINPACK code has been modified to allow taking the trace as second argument:

    return_code = ccall(
        (:lmdif1, cminpack),
        Cint,
        (Ptr{Cvoid}, Ptr{Cvoid}, Cint, Cint, Ptr{Cdouble}, Ptr{Cdouble}, Cdouble, Ptr{Cint}, Ptr{Cdouble}, Cint),
        _lmdif1_cfunc, **pointer_from_objref(trace)**, m, n, x, fvec, tol, iwa, wa, lwa
    )

I would guess that some error was introduced doing that. Do you have the source code used to compile the libraries shipped with the package?

jibaneza commented 3 years ago

Adding to it, I noticed that res.x returns a different wrong result the first time the function is called [8.532e-316, 1.414] (must be related to JIT compilation). Also, depending on the initial value given to fsolve, it sometimes returns the correct answer. There must be some part of the code that overwrites the result. If you could share the modified version of cminpack I can try and take a look (I don't know much about C, so I might not find it, but I can try) 👍

sglyon commented 3 years ago

Hey @jibaneza thanks for taking a look.

I didn't modify the cminpack source when making this package.

I used the version on github here: https://github.com/devernay/cminpack

jibaneza commented 3 years ago

hey @sglyon , I found the issue I believe. The problem is in the function wrappers (I copy here the code for lmdif1 only, but the same occurs in lmdif, hybrd...etc) inside the function wrapper x is being used and modifies the value of x in the scope of the parent function. Changing the name of the variable or specifying local fixes the issue.

original code:

function _lmdif1_func_wrapper(_p::Ptr{Cvoid}, m::Cint, n::Cint, _x::Ptr{Cdouble},
                             _fvec::Ptr{Cdouble}, iflag::Cint)
    fvec = unsafe_wrap(Array, _fvec, m)
    x = unsafe_wrap(Array, _x, n)
    if iflag < 0
        print(fvec)
        return Cint(0)
    end
    f!(fvec, x)

    trace = unsafe_pointer_to_objref(_p)::AlgoTrace
    push!(trace, x, fvec, iflag)
    trace.f_calls > trace.maxit ? Cint(-1) : Cint(0)
end

fixed code:

function _lmdif1_func_wrapper(_p::Ptr{Cvoid}, m::Cint, n::Cint, _x::Ptr{Cdouble},
                             _fvec::Ptr{Cdouble}, iflag::Cint)
    fvec = unsafe_wrap(Array, _fvec, m)
    local x = unsafe_wrap(Array, _x, n)
    if iflag < 0
        print(fvec)
        return Cint(0)
    end
    f!(fvec, x)

    trace = unsafe_pointer_to_objref(_p)::AlgoTrace
    push!(trace, x, fvec, iflag)
    trace.f_calls > trace.maxit ? Cint(-1) : Cint(0)
end

sglyon commented 3 years ago

wow! I never would have expected that.

Which scope does x come from in the original code? The objective function somehow??

jibaneza commented 3 years ago

this is the relevant function in the case of lmdif1: x is defined first as x = copy(x0) and then it is modified inside the _lmdif1_func_wrapper as x = unsafe_wrap(Array, _x, n). This latter modifies the value of x in the scope of lmdif1 rather than creating a new local variable. PS: the same happens with fvec and trace.

function lmdif1(f!::Function, x0::Vector{Float64}, m::Int, tol::Float64,
                show_trace::Bool, tracing::Bool, maxit::Int)
    x = copy(x0)
    n = length(x)
    if n > m
        msg = "Must have at least as many variables as equations"
        throw(ArgumentError(msg))
    end

    fvec = Array{Float64}(undef, m)
    lwa = m*n+5*n+m
    iwa = Array{Int}(undef, n)
    wa = Array{Float64}(undef, lwa)
    trace = AlgoTrace(x0, show_trace, tracing, maxit)

    if show_trace
        show(trace)
    end

    function _lmdif1_func_wrapper(_p::Ptr{Cvoid}, m::Cint, n::Cint, _x::Ptr{Cdouble},
                                 _fvec::Ptr{Cdouble}, iflag::Cint)
        fvec = unsafe_wrap(Array, _fvec, m)
        x = unsafe_wrap(Array, _x, n)
        if iflag < 0
            print(fvec)
            return Cint(0)
        end
        f!(fvec, x)

        trace = unsafe_pointer_to_objref(_p)::AlgoTrace
        push!(trace, x, fvec, iflag)
        trace.f_calls > trace.maxit ? Cint(-1) : Cint(0)
    end
    _lmdif1_cfunc = @cfunction(
        $_lmdif1_func_wrapper,
        Cint,
        (Ptr{Cvoid}, Cint, Cint, Ptr{Cdouble}, Ptr{Cdouble}, Cint)
    )

    return_code = ccall(
        (:lmdif1, cminpack),
        Cint,
        (Ptr{Cvoid}, Ptr{Cvoid}, Cint, Cint, Ptr{Cdouble}, Ptr{Cdouble}, Cdouble, Ptr{Cint}, Ptr{Cdouble}, Cint),
        _lmdif1_cfunc, pointer_from_objref(trace), m, n, x, fvec, tol, iwa, wa, lwa
    )

    msg = _lmdif1_messages[max(-2, return_code)]
    if return_code < -1
        msg = msg * string(return_code)
    end
    converged = return_code in [1, 2, 3] || norm(fvec, Inf) <= tol
    trace.tot_time = time() - trace.start_time

    SolverResults("Levenberg-Marquardt", x0, x, fvec, return_code, converged, msg, trace)
end

sglyon / MINPACK.jl

Wrong solution return on :lm and :lmdif method (res.x returns solution multiplied by sqrt(2)) #7