Open mranneberg opened 3 months ago
Welcome! This is the right place. The hvp
option is exactly for this kind of use case. Basically, the custom Hessian can also be a function that does a Hessian-vector product. https://github.com/JuliaNonconvex/NonconvexUtils.jl/blob/main/test/custom.jl#L23
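For reference, a minimal sketch of that pattern for a scalar function, based on the `CustomHessianFunction` constructor calls that appear later in this thread (treat the exact import and keyword as assumptions):

```julia
using Nonconvex  # CustomHessianFunction comes from the NonconvexUtils.jl layer

f(x) = sum(abs2, x)   # scalar objective, f(x) = xᵀx
∇f(x) = 2 .* x        # analytic gradient
hvp(x, v) = 2 .* v    # Hessian-vector product: here H(x) = 2I, so H*v = 2v

# hvp = true tells the wrapper the third argument is (x, v) -> H(x)*v,
# not x -> H(x)
obj = CustomHessianFunction(f, ∇f, hvp; hvp = true)
```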
Ah, great, I will try that next week and close! Thanks.
Out of curiosity: if the hvp option is enabled, Ipopt cannot directly solve/invert the Hessian of the Lagrangian, so does this mean it switches to an iterative linear solver? I could probably research this in the Ipopt documentation, but if that is the case, it may still be advantageous to allow direct Hessian descriptions. Also, if this works, maybe the documentation could be edited to reflect it, as it currently reads as just another way to define Hessians of scalar-valued functions.
Edit: I tried it and ran into an issue. Here is what I did:
```julia
function Hf(x, v)
    print(length(x))
    print(length(v))
    update(x)
    Ne = length(Eq)
    print(Ne)
    Ni = length(v)
    Hv = zeros(Ne, length(v))
    for k = 1:Ne
        Hv[k, :] = reshape(ddEq[k, :, :], (Ni, Ni)) * v
    end
    return Hv
end
```
This uses the "memoization" setup from the other issue I opened. Regardless, the Hf function is then used like this:

```julia
E = CustomHessianFunction(eq, ∇eq, Heq; hvp = true)
```
I can call the Hf(x, v) function directly (e.g. Hf(x, x)) and the result has size (62, 73), where 62 is my number of equations and 73 the dimension of x. When I use it with optimize, I get the error:
```
ERROR: DimensionMismatch: second dimension of A, 73, does not match length of x, 62
Stacktrace:
 [1] gemv!(y::Vector{Float64}, tA::Char, A::Matrix{Float64}, x::Vector{Float64}, α::Bool, β::Bool)
   @ LinearAlgebra C:\Users\maxra\scoop\apps\julia\current\share\julia\stdlib\v1.10\LinearAlgebra\src\matmul.jl:404
```
That is followed by several more layers of matrix multiplication, then Zygote, then a map over a vector of functions, and so forth. I also tried returning Hv'.
Anyway, I'll start on the minimal example now, but if you have a hunch, it's appreciated.
Here is the minimal example (which can also be used in the other issue I opened). It contains a simple Newton loop that solves the Lagrangian system directly, plus the same use case with Ipopt. Because I use the Hessian function for the equality constraints, I get an error similar to the one from my actual problem above. If I do not use second-order information (by setting first_order to true and uncommenting the CustomGradFunction line), Ipopt gives the same results as the simple Newton loop. If I do that, though, and also remove the `|| true`, I run into the issue of the function values not being updated between Ipopt's calls.
```julia
using LinearAlgebra  # for dot, diagm, norm

N = 4

# A function that calculates the equality constraints and the target function
function fun(x)
    # Target
    t = dot(x, x)
    dt = 2 * x
    ddt = 2 * diagm(ones(N))
    # Equality constraints
    eq = [x[1] - pi/2; x[3] - sin(x[1])]
    deq = zeros(2, N)
    deq[1, 1] = 1
    deq[2, 1] = -cos(x[1])
    deq[2, 3] = 1
    ddeq = zeros(2, N, N)
    ddeq[2, 1, 1] = sin(x[1])
    return t, dt, ddt, eq, deq, ddeq
end

# Helper function / memoization
function fgen()
    eq = 0.0
    deq = 0.0
    ddeq = 0.0
    last_x = nothing
    t = 0.0
    dt = 0.0
    ddt = 0.0
    function update(x)
        if x != last_x || true
            t, dt, ddt, eq, deq, ddeq = fun(x)
            last_x = x
        end
        return
    end
    function f(x)
        update(x)
        return t
    end
    function ∇f(x)
        update(x)
        return dt
    end
    function Hf(x)
        update(x)
        return ddt
    end
    function g(x)
        update(x)
        return eq
    end
    function ∇g(x)
        update(x)
        return deq
    end
    function Hg(x)
        update(x)
        return ddeq
    end
    function Hgv(x, v)
        update(x)
        Hv = zeros(2, 4)
        for k = 1:2
            Hv[k, :] = reshape(ddeq[k, :, :], (4, 4)) * v
        end
        return Hv
    end
    return f, ∇f, Hf, g, ∇g, Hg, Hgv
end

# We can see the optimum is
# x = [pi/2; 0; 1; 0]
function opt_lag()
    # Make functions
    f, ∇f, Hf, g, ∇g, Hg = fgen()
    # For reference: a Lagrangian solver
    z = zeros(6)  # x, lambda
    print(z)
    print("\n")
    for iter = 1:20
        x = z[1:4]
        lm = z[5:6]
        # Lagrangian derivatives
        ∇L = [∇f(x) - (lm' * ∇g(x))'; -g(x)]
        # print(∇L)
        HL = zeros(6, 6)
        HL[1:4, 1:4] = Hf(x)
        for k = 1:2
            HG = Hg(x)
            HL[1:4, 1:4] -= lm[k] * HG[k, :, :]
        end
        HL[1:4, 5:6] = -∇g(x)'
        HL[5:6, 1:4] = -∇g(x)
        # Newton step
        z -= HL \ ∇L
        print(z)
        print("\n")
        if norm(∇L) < 1e-9
            break
        end
    end
end
opt_lag()

# With Ipopt
using Nonconvex
Nonconvex.@load Ipopt
function opt_ipopt()
    # Make functions
    f, ∇f, Hf, g, ∇g, Hg, Hgv = fgen()
    x0 = zeros(4)
    T = CustomHessianFunction(f, ∇f, Hf)
    E = CustomHessianFunction(g, ∇g, Hgv; hvp = true)
    # E = CustomGradFunction(g, ∇g)
    model = Model(T)
    addvar!(model, -Inf * ones(4), Inf * ones(4), init = x0)
    add_eq_constraint!(model, E)
    alg = IpoptAlg()
    options = IpoptOptions(first_order = false, max_iter = 10)
    r = optimize(model, alg, x0, options = options)
end
sol = opt_ipopt()
print(sol.minimizer)
```
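As an aside on the memoization helper: `last_x = x` stores a reference, not a copy, so if the caller ever mutates `x` in place, the cached value changes with it and `x != last_x` never fires again. That may be why the `|| true` workaround is needed here. A small sketch of the usual fix, storing a copy (the simplified cache structure is my own, not from the package):

```julia
function fgen_cached()
    last_x = nothing
    t = 0.0
    function update(x)
        # Recompute only when x actually changed since the last call
        if last_x === nothing || x != last_x
            t = sum(abs2, x)   # stand-in for the expensive fun(x) evaluation
            last_x = copy(x)   # copy! `last_x = x` would alias the caller's array
        end
        return
    end
    f(x) = (update(x); t)
    return f
end
```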
Not sure if this is the right spot, sorry if not! I would like to state the equality constraints of an optimization problem (which in my case I want to solve with Ipopt) as a vector-valued function. That works fine, as does the gradient via `CustomGradFunction(f, g)`. But the fact that `CustomHessianFunction(f, g, h)` only works for scalar functions is a bit of a hassle. Is there a way around this, maybe via the hvp option? I know, Hessians of vector-valued functions are annoying. For reference: https://juliadiff.org/ForwardDiff.jl/stable/user/advanced/#Hessian-of-a-vector-valued-function
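For what it's worth, the ForwardDiff page linked above obtains the Hessian of a vector-valued function as the Jacobian of the Jacobian. A minimal sketch using the constraints from the example in this thread (the reshape layout is my reading of the docs and worth double-checking):

```julia
using ForwardDiff

g(x) = [x[1] - pi/2; x[3] - sin(x[1])]   # the equality constraints from above

x = ones(4)
# Jacobian of the Jacobian: an (m*n) x n matrix with m = 2 outputs, n = 4 inputs
H = ForwardDiff.jacobian(y -> ForwardDiff.jacobian(g, y), x)
# Reshape so Hk[k, :, :] is the n x n Hessian of g[k],
# matching the ddeq[k, :, :] layout used in the minimal example
Hk = reshape(H, 2, 4, 4)
```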