JuliaNLSolvers / Optim.jl

Optimization functions for Julia

Interior point Newton violates the non-linear constraint in high-dimensional problems #1050

Open dmsorokin opened 10 months ago

dmsorokin commented 10 months ago

Hi! I ran into this while working on a rather complicated optimization problem in 200+ dimensions: the result of a 'successful' optimization violated the non-linear constraint. I use IPNewton following this tutorial, but I rely on forward differentiation instead of supplying the gradient and the Hessian by hand. The issue seems to be related to #813.

Here is a simple example I was able to cook up that shows how the optimization fails as the number of dimensions grows. The constraint is not violated in this example, but the solution is definitely wrong: the code below solves the problem correctly when n_dims = 2 but fails when n_dims = 200.

I hope this is helpful! Thanks.

using Optim

# Maximize the product x_1 * x_2 * ... * x_n over the circle of radius 1
# The solution is x_1 = x_2 = ... = x_n = 1/sqrt(n)
function objective_func(x)
    -prod(x)
end

constraint!(c, x) = (c[1] = sum(x.^2); c)

n_dims = 200
initial_x = ones(n_dims)./sqrt(2*n_dims) # interior point of the feasible region

lx = zeros(n_dims) # lower bound on x
ux = ones(n_dims) .* 5 # upper bound on x (arbitrary)
lc = [0.0] # lower bound on constraint
uc = [1.0] # upper bound on constraint

dfc = TwiceDifferentiableConstraints(constraint!, lx, ux, lc, uc)
df = TwiceDifferentiable(objective_func, initial_x)

# solve the optimization problem
result = optimize(df, dfc, initial_x, IPNewton(), autodiff = :forward)

result.minimizer # each entry should be approximately 0.07071 but it is not
200-element Vector{Float64}:
 0.05
 0.05
 0.05
...

# The true solution lies on the boundary of the feasible set, not in its interior
constraint!([1.0], result.minimizer)
1-element Vector{Float64}:
 0.5000000000000002
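
For concreteness, a quick sanity check against the analytical optimum, reusing n_dims, objective_func, and result from the snippet above (the magnitudes are approximate):

x_star = fill(1/sqrt(n_dims), n_dims)  # the true maximizer spreads mass equally over the sphere
sum(x_star.^2)                         # 1.0 -- the optimum sits on the constraint boundary
objective_func(x_star)                 # roughly -7.9e-231
objective_func(result.minimizer)       # roughly -6.2e-261, about 30 orders of magnitude worse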
pkofod commented 9 months ago

Thanks for reporting the issue, I will try to find time to look at this.

lalondeo commented 4 months ago

It is true that the solution returned by the optimizer is wrong, but I don't think there is a real way around this problem. The issue is that the gradient of the objective function is numerically zero at the initial point: each partial derivative of -prod(x) is (minus) the product of the other 199 coordinates, roughly -(0.05)^199 ≈ -1e-259, which is far below any gradient tolerance, so the solver declares the initial point optimal. There is not much that can be done about that except reformulating the objective to make it better behaved numerically, which is the user's responsibility.
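
A quick way to confirm this, assuming ForwardDiff.jl is available (the package behind the :forward autodiff option), is to evaluate the gradient at the starting point directly; the figure below is approximate:

using ForwardDiff

n_dims = 200
initial_x = ones(n_dims) ./ sqrt(2 * n_dims)   # every entry equals 0.05

# Each partial derivative of -prod(x) is minus the product of the other
# 199 entries, i.e. roughly -(0.05)^199.
g = ForwardDiff.gradient(x -> -prod(x), initial_x)
maximum(abs, g)   # on the order of 1e-259, far below any realistic g_tol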

Replacing the function objective_func in the above code snippet with

function objective_func(x)
    -sum(log.(x))
end

results in an equivalent optimization problem whose objective is numerically well behaved. Strangely enough, the solver now returns a point that violates the norm constraint: for n_dims = 2, for example, the result is [0.75, 0.75], while the actual minimizer is [0.707..., 0.707...].
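
For reference, a minimal sketch of that reproduction, reusing the setup from the original snippet with only the objective changed and n_dims = 2 (the exact numbers may vary with the Optim version):

using Optim

objective_func(x) = -sum(log.(x))          # equivalent to minimizing -prod(x) for x > 0
constraint!(c, x) = (c[1] = sum(x.^2); c)  # squared-norm constraint

n_dims = 2
initial_x = ones(n_dims) ./ sqrt(2 * n_dims)  # interior starting point

lx = zeros(n_dims)
ux = fill(5.0, n_dims)
lc = [0.0]
uc = [1.0]

dfc = TwiceDifferentiableConstraints(constraint!, lx, ux, lc, uc)
df = TwiceDifferentiable(objective_func, initial_x)
result = optimize(df, dfc, initial_x, IPNewton(), autodiff = :forward)

result.minimizer          # reported as approximately [0.75, 0.75] instead of [0.707, 0.707]
sum(result.minimizer.^2)  # greater than 1.0, i.e. the norm constraint is violated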