algorithmsbooks / optimization

Errata for Algorithms for Optimization book
67 stars 16 forks source link

Page 174 #93

Open xulongwu4 opened 1 year ago

xulongwu4 commented 1 year ago

When generalizing the Lagrange multipliers to the case with multiple equality constraints, I find the derivation in the book incorrect (please correct me if I am wrong). The book constructed a new constraint h_comb = h1^2 + h2^2, and apply the Lagrange multipliers to this single constraint.

However, the function h_comb has zero gradient at the critical point. This is because h1=0 and h2=0 at the crital point. Since \nabla(h_comb) is zero, the Lagrange multiplier method does not apply any more, since we can't claim \nabla(f) = \lambda * \nabla(h_comb) in this case.

I can construct a simple example to demonstrate this point. Consider the problem of minimizing f(x, y) = x^2 + 2y subject to the condition x - y = 0.

If we apply the Lagrange multipliers to the constraint, one can easily get the critical point x = -1, y = -1.

However, if we choose to write the constraint as (x - y) ^ 2 = 0, and then apply the lagrange multipliers method, we will get the following system of equations:

2x + 2\lambda (x - y) = 0
2 - 2\lambda (x - y) = 0
(x - y)^2 = 0

This system of equations has no viable solutions.

tawheeler commented 1 year ago

Hello xulongwu4. The issue that you point out would seem to apply just as well to the normal Lagrange method of multipliers, i.e. simply applying the method of Lagrange multipliers to your problem with the squared constraint.

We can still apply the method of Lagrange multipliers to minimize x^2 + 2y subject to (x-y)^2 = 0 by using \lambda \nabla f - \nabla h = 0 instead. (This is roughly the same as dividing by nabla, but really its just a different way of constraining nabla f and nabla h to be parallel).

This results in: 2 \lambda x - 2x + 2y = 0 2 \lambda + 2x - 2y = 0 (x-y)^2 = 0

which has the solution x=y=-1, lambda = 0.

I like this issue though, and I think I'll work that into an exercise for this chapter. Thank you!

xulongwu4 commented 1 year ago

@tawheeler, thanks for the quick reply.

I think it is interesting that we can attempt to solve the problem by \lambda \nabla f - \nabla h = 0. I was able to arrive at the system of equations that you provided, and that is great.

2 \lambda x - 2x + 2y = 0       (1)
2 \lambda + 2x - 2y = 0        (2)
(x-y)^2 = 0                       (3)

However, I don't see see how we can get the result x = y = -1. Equation (3) gives x = y, and if I pull that result into either equation (1) or (2), I can get lambda = 0, but can't infer the value of either x or y. It seems that the solution to the given equations is lamba = 0, x = y = anything. Thus it still appears to me that the Lagrange multipliers method can't apply when we have the squared constraint?

tawheeler commented 1 year ago

It looks like you are right! When the critical point is at nabla h = 0, then lambda is zero and multiple solutions exist. At that point we might substitute x = y into our objective and solve the 1d problem.

Mykel and I might be able to come up with a different way to simply derive the method of Lagrange multipliers for multiple constraints, without using the sum-of-squares combination.

tawheeler commented 1 year ago

Hello xulongwu4. Would you like to share your name so that we can credit you in the acknowledgements?

xulongwu4 commented 1 year ago

@tawheeler, you can credit Longwu Ou for this issue. Thanks!