JuliaMolSim / JuLIP.jl

Julia Library for Interatomic Potentials
Other
83 stars 23 forks source link

Augmented Lagrangian #47

Open jameskermode opened 7 years ago

jameskermode commented 7 years ago

This is just a reminder to go back and try JuLIP with the augmented Lagrangian implementation of constrained optimisation from https://github.com/cortner/ConstrainedOptim.jl

Related to issue #33

cortner commented 7 years ago

Good point - do you have a test system?

jameskermode commented 7 years ago

A constrained bond length would be the first canonical example. I will set something up and let you know if/when there are problems.

cortner commented 6 years ago

there is now a new ConstrainedOptim which we can try at some point. But I'm assuming the basic Newton scheme we have in Experimental works for now?

IL027 commented 6 years ago

@lifelemons is still having robustness problems with the experimental constrained Newton scheme, even with the gradient-based linesearch, so I think we will give ConstrainedOptim a go.

IL027 commented 6 years ago

(Oops, this is @jameskermode, signed in with wrong GitHub account)

cortner commented 6 years ago

This is the crack problem, yes? Maybe @lifelemons could remind me how I can access it and I’ll have another look?

jameskermode commented 6 years ago

Here is an attempt at using ConstrainedOptim.jl to fix the length of a bond in a piece of bulk SW silicon with latest JuLIP. The optimiser converges and penalty goes to zero, but the constraint is not satisfied afterwards within the target threshold of +/- 1e-2, so I must be doing something wrong. I've tried increasing strength of initial penalty.

@cortner would you be willing to take a look?

Final output I get is

b = Any[2.35126, 2.39, 2.35126, 2.35126]
E = Any[-34.68, -34.6702, -34.68, -34.68]
using ConstrainedOptim
using JuLIP

a = bulk(:Si, cubic=true)
set_constraint!(a, FixedCell(a))
set_calculator!(a, StillingerWeber())

i = 1
j = 2
I1 = atomdofs(a, i)
I2 = atomdofs(a, j)
x0 = dofs(a)

@show I1
@show I2

I1I2 = [I1; I2]

blen(x) = norm(x[I2] - x[I1])

# energy, gradient and hessian for the interatomic potential
ener(x) = energy(a, x)
gradient!(g, x) = (g[:] = gradient(a, x); g)
hessian!(h, x) = (h[:,:] = hessian(a, x); h)

b = []
E = []

for target_bondlength = 2.3:0.1:2.6

    # constraint function
    con_c!(c, x) = (c[1] = blen(x) - target_bondlength; c)

    # Jacobian of constraint, shape [1, length(x)]
    function con_jacobian!(J, x)
        J[1,I1] = (x[I1]-x[I2])/blen(x)
        J[1,I2] = (x[I2]-x[I1])/blen(x)
        J
    end

    # Hessian contribution of constraint
    # We define auxiliary index arrays _I1 and _I2 that start at 1 to
    # allow the ForwardDiff hessian to be done only on relevent dofs, x[I1I2]
    _I1 = 1:length(I1)
    _I2 = length(I1)+1:length(I1)+length(I2)
    _blen(x) = norm(x[_I2] - x[_I1])
    _c(x) = _blen(x) - target_bondlength

    function con_h!(h, x, λ)
        h[I1I2, I1I2] += λ[1] * ForwardDiff.hessian(_c, x[I1I2])
    end

    df = TwiceDifferentiable(ener, gradient!, hessian!, x0)
    lx = Float64[]; ux = Float64[]
    lc = [-1e-2]; uc = [1e-2]
    dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                         lx, ux, lc, uc)

    # move atoms i and j so constraint is initially satisfied
    set_dofs!(a, x0)
    X = positions(a)
    u = a[j] - a[i]
    f = 1.0 - target_bondlength / norm(u)
    X[i] += 0.5 * f * u
    X[j] -= 0.5 * f * u
    set_positions!(a, X)
    x = dofs(a)
    @show target_bondlength, blen(x)

    res = optimize(df, dfc, x, IPNewton(show_linesearch=false, μ0=:auto), 
                    Optim.Options(show_trace = true, 
                                  allow_f_increases = true, 
                                  successive_f_tol = 2))
    @show res
    @show _c(res.minimizer[I1I2])
    @show blen(res.minimizer)

    push!(b, blen(res.minimizer))
    push!(E, ener(res.minimizer))
end

@show b
@show E
jameskermode commented 6 years ago

Think I've got it working, mistake was not zeroing all elements of constraint Jacobian:

function con_jacobian!(J, x)
    J[1,:] = 0.0
    J[1,I1] = (x[I1]-x[I2])/blen(x)
    J[1,I2] = (x[I2]-x[I1])/blen(x)
end
lifelemons commented 6 years ago

What about the hessian, would one need to zero that out too? Does that now show something more reasonable?

cortner commented 6 years ago

let me know if I still need to look at it?

if this works, then it might be time to discuss how to best implement a more generic set of constraints that can be conveniently accessed? I.e. #20

jameskermode commented 6 years ago

No, slightly confusingly the constraint Hessian contribution is additive, so that goes on top of the one from the objective, while the constraint Jacobian needs the complete vector.

jameskermode commented 6 years ago

Thanks Christoph- working fine now, so no need to look at, but agree it would be good to implement a generalisation of the function into JuLIP in a more convenient way.

function minimize_constrained_bond!(a, i, j, bondlength)
    I1 = atomdofs(a, i)
    I2 = atomdofs(a, j)
    I1I2 = [I1; I2]

    # bondlength between atoms i and j
    blen(x) = norm(x[I2] - x[I1])

    # energy, gradient and hessian for the interatomic potential
    ener(x) = energy(a, x)
    gradient!(g, x) = (g[:] = gradient(a, x); g)
    hessian!(h, x) = (h[:,:] = hessian(a, x); h)

    # constraint function
    con_c!(c, x) = (c[1] = blen(x) - bondlength; c)

    # Jacobian of constraint, shape [1, length(x)]
    function con_jacobian!(J, x)
        J[1,:] = 0.0
        J[1,I1] = (x[I1]-x[I2])/blen(x)
        J[1,I2] = (x[I2]-x[I1])/blen(x)
    end

    # Hessian contribution of constraint
    # We define auxiliary index arrays _I1 and _I2 that start at 1 to
    # allow the ForwardDiff hessian to be done only on relevent dofs, x[I1I2]
    _I1 = 1:length(I1)
    _I2 = length(I1)+1:length(I1)+length(I2)
    _blen(x) = norm(x[_I2] - x[_I1])
    _c(x) = _blen(x) - bondlength

    function con_h!(h, x, λ)
        h[I1I2, I1I2] += λ[1] * ForwardDiff.hessian(_c, x[I1I2])
    end

    df = TwiceDifferentiable(ener, gradient!, hessian!, x0)
    lx = Float64[]; ux = Float64[]
    lc = [-1e-2]; uc = [1e-2]
    dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                         lx, ux, lc, uc)

    # move atoms i and j so constraint is initially satisfied
    X = positions(a)
    u = X[j] - X[i]
    f = 1.0 - bondlength / norm(u)
    X[i] += 0.5 * f * u
    X[j] -= 0.5 * f * u
    set_positions!(a, X)
    x = dofs(a)

    res = optimize(df, dfc, x, IPNewton(show_linesearch=false, μ0=:auto), 
                    Optim.Options(show_trace = true,
                                  extended_trace = false,
                                  allow_f_increases = true, 
                                  successive_f_tol = 2))
    set_dofs!(a, res.minimizer)
end

Meanwhile, @lifelemons please give this a go for your use case and let me know of any problems