JuliaNonconvex / Nonconvex.jl

Toolbox for gradient-based and derivative-free non-convex constrained optimization with continuous and/or discrete variables.
https://julianonconvex.github.io/Nonconvex.jl/
MIT License

custom adjoints #43

Closed · ConnorMallon closed this issue 3 years ago

ConnorMallon commented 3 years ago

What is the recommended way to use a custom adjoint, defined with ChainRulesCore.jl, for an objective and/or constraints in the value_jacobian function? Let's say I have defined my adjoint at the driver level and want that to be picked up rather than relying on the default autodiff. How should I tell Nonconvex that I have the derivative information? If something like the code below worked, that would be great:

using Nonconvex, LinearAlgebra, Test

function f(x::AbstractVector)
    val = sqrt(x[2])
    jac = [0.0, 0.5 * x[2]^-0.5]  # analytic gradient of sqrt(x[2])
    val, jac
end

function g(x::AbstractVector, a, b)
    val = (a * x[1] + b)^3 - x[2]
    jac = [3 * a * (a * x[1] + b)^2, -1.0]  # analytic gradient of the constraint
    val, jac
end

options = MMAOptions(
    tol = Tolerance(kkt = 1e-6, f = 0.0),
    s_init = 0.1,
)

m = Model(f)
addvar!(m, [0.0, 0.0], [10.0, 10.0])
add_ineq_constraint!(m, x -> g(x, 2, 0))
add_ineq_constraint!(m, x -> g(x, -1, 1))
alg = MMA87()
convcriteria = KKTCriteria()
r = Nonconvex.optimize(m, alg, [1.234, 2.345], options = options, convcriteria = convcriteria)
@test abs(r.minimum - sqrt(8/27)) < 1e-6
@test norm(r.minimizer - [1/3, 8/27]) < 1e-6

Thanks for your help

mohamed82008 commented 3 years ago

Hi Connor, glad to see you are giving Nonconvex a try. Gridap-TopOpt integration is still on my mind. Let me know if you want to help with that!

Re: your question, the way to define custom adjoints is with ChainRulesCore. Here is an example from TopOpt.jl: https://github.com/mohamed82008/TopOpt.jl/blob/ad1cde7bfff891e9dd8ee03c928ba9b1fe946b3d/src/Functions/volume.jl#L92. So you would do something like this:

using ChainRulesCore

f(x::AbstractVector) = sqrt(x[2])

function ChainRulesCore.rrule(::typeof(f), x::AbstractVector)
    val = f(x)
    grad = [0.0, 0.5 * x[2]^-0.5]
    # the pullback returns one adjoint per argument: (f itself, x)
    val, Δ -> (nothing, Δ * grad)
end

nothing is the adjoint of f wrt the function f itself because f is a simple function (not a closure or a callable struct). Δ * grad is the adjoint wrt x. I need to add this to the documentation.
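The same pattern works for the constraints. Here is a sketch (untested) for your g; since only the derivative wrt x is needed during the optimization, a and b get nothing adjoints, and the closure x -> g(x, 2, 0) will still hit this rule because the AD differentiates through the closure body and finds the rrule on the inner call to g:

using ChainRulesCore

g(x::AbstractVector, a, b) = (a * x[1] + b)^3 - x[2]

function ChainRulesCore.rrule(::typeof(g), x::AbstractVector, a, b)
    val = g(x, a, b)
    grad = [3 * a * (a * x[1] + b)^2, -1.0]
    # adjoints of (g itself, x, a, b); only x is differentiated here,
    # so the function and the constants a, b get nothing
    val, Δ -> (nothing, Δ * grad, nothing, nothing)
end

Assuming the AD backend is Zygote (which respects ChainRules rrules), you can sanity-check that a rule is picked up by calling it directly, e.g. Zygote.gradient(x -> g(x, 2, 0), [1.234, 2.345]).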

ConnorMallon commented 3 years ago

Thanks Mohamed! That's good to hear, let's get it going. I didn't realise the rrule would be picked up like that. Very cool.

mohamed82008 commented 3 years ago

That's good to hear, let's get it going.

If you want to join a meeting with @yijiangh and me to talk specifics, I would be happy to arrange that. There is also the #topopt channel on the Julia Slack for quick discussions.

I will close this issue for now.