Open jameskermode opened 7 years ago
Good point - do you have a test system?
A constrained bond length would be the first canonical example. I will set something up and let you know if/when there are problems.
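In symbols: minimise the total energy E(x) over the dofs x, subject to the single equality constraint c(x) = ||x_j - x_i|| - b = 0 for the chosen bond (i, j) and target length b.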
There is now a new ConstrainedOptim which we can try at some point. But I'm assuming the basic Newton scheme we have in Experimental works for now?
@lifelemons is still having robustness problems with the experimental constrained Newton scheme, even with the gradient-based linesearch, so I think we will give ConstrainedOptim
a go.
(Oops, this is @jameskermode, signed in with wrong GitHub account)
This is the crack problem, yes? Maybe @lifelemons could remind me how I can access it and I’ll have another look?
Here is an attempt at using ConstrainedOptim.jl to fix the length of a bond in a piece of bulk SW silicon with the latest JuLIP. The optimiser converges and the penalty goes to zero, but afterwards the constraint is not satisfied to within the target threshold of ±1e-2, so I must be doing something wrong. I've tried increasing the strength of the initial penalty.
@cortner would you be willing to take a look?
The final output I get is:
b = Any[2.35126, 2.39, 2.35126, 2.35126]
E = Any[-34.68, -34.6702, -34.68, -34.68]
using ConstrainedOptim
using JuLIP
using ForwardDiff      # needed for the constraint Hessian below
using LinearAlgebra
a = bulk(:Si, cubic=true)
set_constraint!(a, FixedCell(a))
set_calculator!(a, StillingerWeber())
i = 1
j = 2
I1 = atomdofs(a, i)
I2 = atomdofs(a, j)
x0 = dofs(a)
@show I1
@show I2
I1I2 = [I1; I2]
blen(x) = norm(x[I2] - x[I1])
# energy, gradient and hessian for the interatomic potential
ener(x) = energy(a, x)
gradient!(g, x) = (g[:] = gradient(a, x); g)
hessian!(h, x) = (h[:,:] = hessian(a, x); h)
b = []
E = []
for target_bondlength = 2.3:0.1:2.6
    # constraint function: c(x) = ||x[I2] - x[I1]|| - target_bondlength
    con_c!(c, x) = (c[1] = blen(x) - target_bondlength; c)

    # Jacobian of constraint, shape [1, length(x)]:
    # d||x[I2]-x[I1]||/dx[I2] = (x[I2]-x[I1])/||x[I2]-x[I1]||,
    # with the opposite sign for x[I1]
    function con_jacobian!(J, x)
        J[1,I1] = (x[I1]-x[I2])/blen(x)
        J[1,I2] = (x[I2]-x[I1])/blen(x)
        J
    end

    # Hessian contribution of constraint.
    # The auxiliary index ranges _I1 and _I2 start at 1 so that the
    # ForwardDiff Hessian is taken only over the relevant dofs, x[I1I2]
    _I1 = 1:length(I1)
    _I2 = length(I1)+1:length(I1)+length(I2)
    _blen(x) = norm(x[_I2] - x[_I1])
    _c(x) = _blen(x) - target_bondlength
    function con_h!(h, x, λ)
        h[I1I2, I1I2] += λ[1] * ForwardDiff.hessian(_c, x[I1I2])
    end

    df = TwiceDifferentiable(ener, gradient!, hessian!, x0)
    lx = Float64[]; ux = Float64[]     # no box constraints on the dofs
    lc = [-1e-2]; uc = [1e-2]          # tolerance band around c(x) = 0
    dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                         lx, ux, lc, uc)

    # move atoms i and j so the constraint is initially satisfied
    set_dofs!(a, x0)
    X = positions(a)
    u = X[j] - X[i]
    f = 1.0 - target_bondlength / norm(u)
    X[i] += 0.5 * f * u
    X[j] -= 0.5 * f * u
    set_positions!(a, X)
    x = dofs(a)
    @show target_bondlength, blen(x)

    res = optimize(df, dfc, x, IPNewton(show_linesearch=false, μ0=:auto),
                   Optim.Options(show_trace = true,
                                 allow_f_increases = true,
                                 successive_f_tol = 2))
    @show res
    @show _c(res.minimizer[I1I2])
    @show blen(res.minimizer)
    push!(b, blen(res.minimizer))
    push!(E, ener(res.minimizer))
end
@show b
@show E
Think I've got it working; the mistake was not zeroing all elements of the constraint Jacobian:
function con_jacobian!(J, x)
    J[1,:] .= 0.0
    J[1,I1] = (x[I1]-x[I2])/blen(x)
    J[1,I2] = (x[I2]-x[I1])/blen(x)
    J
end
What about the Hessian: would one need to zero that out too? And does the result now look more reasonable?
Let me know if I still need to look at it.
If this works, then it might be time to discuss how best to implement a more generic set of constraints that can be conveniently accessed, i.e. #20.
No, slightly confusingly, the constraint Hessian contribution is additive, so it goes on top of the one from the objective, while the constraint Jacobian needs to be written out in full on every call.
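Concretely: con_h! accumulates the constraint part of the Lagrangian Hessian, H(x, λ) = ∇²E(x) + Σ_k λ_k ∇²c_k(x). The objective Hessian is already stored in h by the time con_h! is called, so the λ-weighted term has to be added on top, whereas con_jacobian! is the only writer of J and therefore has to fill the entire row each time.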
Thanks Christoph, working fine now, so no need to look at it. But I agree it would be good to fold a generalisation of this function into JuLIP in a more convenient way.
function minimize_constrained_bond!(a, i, j, bondlength)
    I1 = atomdofs(a, i)
    I2 = atomdofs(a, j)
    I1I2 = [I1; I2]
    x0 = dofs(a)   # starting dofs (previously a global; defined here so the function is self-contained)

    # bond length between atoms i and j
    blen(x) = norm(x[I2] - x[I1])

    # energy, gradient and hessian for the interatomic potential
    ener(x) = energy(a, x)
    gradient!(g, x) = (g[:] = gradient(a, x); g)
    hessian!(h, x) = (h[:,:] = hessian(a, x); h)

    # constraint function
    con_c!(c, x) = (c[1] = blen(x) - bondlength; c)

    # Jacobian of constraint, shape [1, length(x)]; the whole row must be
    # rewritten on every call, hence the explicit zeroing
    function con_jacobian!(J, x)
        J[1,:] .= 0.0
        J[1,I1] = (x[I1]-x[I2])/blen(x)
        J[1,I2] = (x[I2]-x[I1])/blen(x)
        J
    end

    # Hessian contribution of constraint (additive, so no zeroing here).
    # The auxiliary index ranges _I1 and _I2 start at 1 so that the
    # ForwardDiff Hessian is taken only over the relevant dofs, x[I1I2]
    _I1 = 1:length(I1)
    _I2 = length(I1)+1:length(I1)+length(I2)
    _blen(x) = norm(x[_I2] - x[_I1])
    _c(x) = _blen(x) - bondlength
    function con_h!(h, x, λ)
        h[I1I2, I1I2] += λ[1] * ForwardDiff.hessian(_c, x[I1I2])
    end

    df = TwiceDifferentiable(ener, gradient!, hessian!, x0)
    lx = Float64[]; ux = Float64[]     # no box constraints on the dofs
    lc = [-1e-2]; uc = [1e-2]          # tolerance band around c(x) = 0
    dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                         lx, ux, lc, uc)

    # move atoms i and j so the constraint is initially satisfied
    X = positions(a)
    u = X[j] - X[i]
    f = 1.0 - bondlength / norm(u)
    X[i] += 0.5 * f * u
    X[j] -= 0.5 * f * u
    set_positions!(a, X)
    x = dofs(a)

    res = optimize(df, dfc, x, IPNewton(show_linesearch=false, μ0=:auto),
                   Optim.Options(show_trace = true,
                                 extended_trace = false,
                                 allow_f_increases = true,
                                 successive_f_tol = 2))
    set_dofs!(a, res.minimizer)
    return res
end
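A minimal usage sketch (the bond indices and target length are just illustrative, and it assumes the same packages and setup as the script above):
using ConstrainedOptim, JuLIP, ForwardDiff, LinearAlgebra
at = bulk(:Si, cubic=true)
set_constraint!(at, FixedCell(at))
set_calculator!(at, StillingerWeber())
minimize_constrained_bond!(at, 1, 2, 2.4)   # constrain bond 1-2 to 2.4 Å
@show norm(at[2] - at[1])   # should lie within the +/- 1e-2 band around 2.4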
Meanwhile, @lifelemons, please give this a go for your use case and let me know of any problems.
This is just a reminder to go back and try JuLIP with the augmented Lagrangian implementation of constrained optimisation from https://github.com/cortner/ConstrainedOptim.jl
Related to issue #33