JuliaNonconvex / Nonconvex.jl

Toolbox for gradient-based and derivative-free non-convex constrained optimization with continuous and/or discrete variables.
https://julianonconvex.github.io/Nonconvex.jl/
MIT License

Equality constraint hessian of vector valued function #168

Open mranneberg opened 3 months ago

mranneberg commented 3 months ago

Not sure if this is the right spot, sorry if not! I would like to state the equality constraints (in my case for an optimization problem I would like to solve with Ipopt) as a vector-valued function. No problem doing that, nor with supplying the gradient via "CustomGradFunction(f, g)". However, "CustomHessianFunction(f, g, h)" apparently only supports scalar-valued functions, which is a bit of a hassle. Is there a way to handle this, maybe via the hvp option? I know, Hessians of vector-valued functions are annoying. For reference: https://juliadiff.org/ForwardDiff.jl/stable/user/advanced/#Hessian-of-a-vector-valued-function
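
For context, the approach in the linked ForwardDiff docs is to treat the Hessian of a vector-valued function as the Jacobian of its Jacobian. A minimal sketch of that idea (independent of Nonconvex, with a made-up f):

using ForwardDiff

f(x) = [x[1]^2 * x[2], sin(x[3])]
x = rand(3)
# Jacobian of the Jacobian; after reshaping, H[k, :, :] is the Hessian of the k-th output of f
H = reshape(ForwardDiff.jacobian(y -> ForwardDiff.jacobian(f, y), x), 2, 3, 3)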

mohamed82008 commented 3 months ago

Welcome! This is the right place. The hvp option is exactly for this kind of use case. Basically, the custom Hessian can also be a function that does a Hessian-vector product. https://github.com/JuliaNonconvex/NonconvexUtils.jl/blob/main/test/custom.jl#L23
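
For a scalar function the pattern looks roughly like this (a minimal sketch along the lines of that test; CustomHessianFunction is brought into scope the same way as in the examples further down, and the same hvp keyword applies to vector-valued constraints):

f(x) = sum(abs2, x)     # objective whose Hessian is 2I
∇f(x) = 2 .* x          # gradient
fhvp(x, v) = 2 .* v     # Hessian-vector product: (2I) * v
F = CustomHessianFunction(f, ∇f, fhvp; hvp = true)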

mranneberg commented 3 months ago

Ah, great, I will try that next week and close! Thanks.

Out of curiosity: if the hvp option is enabled, Ipopt will not be able to directly factorize the Hessian of the Lagrangian, so does that mean it switches to an iterative linear solver? I could probably look this up in the Ipopt documentation, but if that is the case, it may still be advantageous to allow direct Hessian descriptions. Also, if this works, maybe the documentation could be edited to reflect it, since it currently reads as just another way of defining Hessians of scalar-valued functions.
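
To clarify what I mean by "direct Hessian descriptions": with only per-constraint Hessian-vector products, a Lagrangian Hessian-vector product can still be assembled, roughly as below (obj_hvp and con_hvp are hypothetical placeholders here, not Nonconvex API), which is enough for an iterative linear solver but not for a direct factorization.

# Illustrative only: Lagrangian hvp assembled from the objective hvp and per-constraint hvps
lagrangian_hvp(x, λ, v) = obj_hvp(x, v) - sum(λ[k] * con_hvp(x, v, k) for k in eachindex(λ))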

Edit: I did try it and ran into an issue. Here is what I did:

function Hf(x, v)
    print(length(x))
    print(length(v))
    update(x)                  # memoization helper from the other issue
    Ne = length(Eq)            # number of equality constraints
    print(Ne)
    Ni = length(v)
    Hv = zeros(Ne, length(v))
    for k = 1:Ne
        # Hessian of the k-th constraint times v
        Hv[k, :] = reshape(ddEq[k, :, :], (Ni, Ni)) * v
    end
    return Hv
end

This uses the "memoization" setup from the other issue I opened (update, Eq and ddEq are closed-over variables from there). Regardless, the Hf function is then used like this:

E = CustomHessianFunction(eq, ∇eq, Heq; hvp = true)

I can call the Hf(x,v) function (e.g. as Hf(x,x)) and it returns an array of size (62,73), where 62 is my number of equations and 73 the dimension of x. When using it with optimize I get the error:

ERROR: DimensionMismatch: second dimension of A, 73, does not match length of x, 62
Stacktrace:
  [1] gemv!(y::Vector{Float64}, tA::Char, A::Matrix{Float64}, x::Vector{Float64}, α::Bool, β::Bool)
    @ LinearAlgebra C:\Users\maxra\scoop\apps\julia\current\share\julia\stdlib\v1.10\LinearAlgebra\src\matmul.jl:404

That is followed by a bunch of layers of matrix multiplication, then Zygote, then a map over a vector of functions, and so forth. I also tried returning Hv'. Anyway, I'll start on the minimal example now, but if you have a hunch it's appreciated.

mranneberg commented 3 months ago

Here goes the minimal example (which can also be used for the other issue I opened). It contains a simple Newton loop that solves the Lagrangian system directly, plus the same use case with Ipopt. Because I use the CustomHessianFunction for the equality constraints, I get an error similar to the one in my actual problem above. If I use only first-order information (setting first_order to true and switching to the commented-out CustomGradFunction), Ipopt gives the same result as the simple Newton loop. If I do that and also remove the "|| true", I run into the other issue: the function values are not updated between the calls made by Ipopt.

using LinearAlgebra

N = 4

# A function that calculates the target function and the equality constraints, plus their derivatives
function fun(x)
    # Target 
    t = dot(x,x)
    dt = 2*x
    ddt = 2*diagm(ones(N))

    # Equality constraints
    eq = [x[1]-pi/2;x[3]-sin(x[1])]
    deq = zeros(2,N)
    deq[1,1] = 1
    deq[2,1] = -cos(x[1])
    deq[2,3] = 1
    ddeq = zeros(2,N,N)
    ddeq[2,1,1] = sin(x[1])

    return t,dt,ddt,eq,deq,ddeq
end

# A memoization helper function
function fgen()
    eq = 0.0
    deq = 0.0
    ddeq = 0.0
    last_x = nothing
    t = 0.0
    dt = 0.0
    ddt = 0.0
    function update(x)
        if x != last_x || true  # "|| true" forces recomputation on every call (see the other issue)
            t,dt,ddt,eq,deq,ddeq = fun(x)
            last_x = x
        end
        return
    end
    function f(x)
        update(x)
        return t
    end
    function ∇f(x)
        update(x)
        return dt
    end
    function Hf(x)
        update(x)
        return ddt
    end
    function g(x)
        update(x)
        return eq
    end
    function ∇g(x)
        update(x)
        return deq
    end

    function Hg(x)
        update(x)
        return ddeq
    end

    # Hessian-vector products of the vector-valued equality constraint
    function Hgv(x,v)
        update(x)
        Hv = zeros(2,4)
        for k=1:2
            Hv[k,:] = reshape(ddeq[k,:,:],(4,4))*v
        end
        return Hv
    end
    return f, ∇f, Hf, g, ∇g, Hg, Hgv
end

# We can see the optimum is
# x = [pi/2;0;1;0]

function opt_lag()
    # Make functions
    f, ∇f, Hf, g, ∇g, Hg = fgen()

    # For reference: A Lagrangian solver
    z = zeros(6) # x, lambda
    print(z)
    print("\n")
    for iter=1:20
        x = z[1:4]
        lm = z[5:6]

        # Lagrangian derivatives
        ∇L = [∇f(x) - (lm'*∇g(x))';-g(x)]
        #print(∇L)
        HL = zeros(6,6)
        HL[1:4,1:4] = Hf(x)
        for k=1:2
            HG = Hg(x)
            HL[1:4,1:4] -= lm[k]*HG[k,:,:]
        end
        HL[1:4,5:6] = -∇g(x)'
        HL[5:6,1:4] = -∇g(x)

        # Newton Step
        z -= HL\∇L

        print(z)
        print("\n")
        if norm(∇L)<1e-9
            break
        end

    end
end

opt_lag()

# With Ipopt
using Nonconvex
Nonconvex.@load Ipopt

function opt_ipopt()
    # Make functions
    f, ∇f, Hf, g, ∇g, Hg, Hgv = fgen()

    x0 = zeros(4)

    T = CustomHessianFunction(f, ∇f, Hf)
    E = CustomHessianFunction(g, ∇g, Hgv; hvp = true)
    #E = CustomGradFunction(g, ∇g)
    model = Model(T)
    addvar!(model, -Inf*ones(4), Inf*ones(4), init = x0)
    add_eq_constraint!(model, E)

    alg = IpoptAlg()
    options = IpoptOptions(first_order = false,max_iter = 10)
    r = optimize(model, alg, x0, options = options)

end

sol = opt_ipopt()
print(sol.minimizer)