coin-or / Ipopt

COIN-OR Interior Point Optimizer IPOPT
https://coin-or.github.io/Ipopt

Partial derivative of output vector with respect to input scalar parameters at optimum #749

Open a-jp opened 4 months ago

a-jp commented 4 months ago

I would like to ask for help. I don't know the name of the method I'm trying to find, but I think it's called sensitivity analysis.
I have two scalar input parameters to my optimisation problem: A and B (or, in the second case below, B and C). I provide an initial guess for the vector X, and at the minimum found by IPOPT the optimal values of X are returned as the solution to my problem.

IPOPT is not aware of the input pairs (A and B, or B and C) as such, because I don't know how to tell it about them, but my functions use them to solve the problem and they are constant scalars. I have a function, G(A, B), which I solve for the minimum of using IPOPT (with some equality constraints). At convergence I obtain the solution to my problem, the vector X. X is X(A, B). This works well and my code behaves as expected.

For downstream analysis in another code, I now need partial derivatives at the optimum point, specifically dX/dA at constant B and dX/dB at constant A. I had wondered/hoped that, as part of the solution mechanism, I might be able to obtain these derivative vectors at convergence in IPOPT. Note, I do not want to do a full sensitivity analysis; it's these raw derivative vectors that I need. Is it possible to obtain these partial derivative vectors from IPOPT in this scenario?

Secondly, I solve another type of optimisation problem in IPOPT; in this case my input parameters are B and C. I have a function H(B, C), which I solve for the minimum of using IPOPT (with some equality constraints). At convergence I obtain the solution to my problem: the vector X and the scalar A. Here A is the same parameter A as in the first example, but it is now an output rather than an input. Here X is X(B, C) and A is A(B, C).

As before, I now need partial derivatives at the optimum point, specifically dX/dA at constant B and dX/dB at constant A. In this second scenario, A(B, C) has been obtained at the optimum point as an output, whereas B was an input parameter. Is it possible to obtain these partial derivatives from IPOPT in this second scenario?

I hope this makes sense; I'm happy to answer any questions to clarify. I've been using IPOPT for some time, so I'm confident that my code works, but these derivative vectors with respect to the inputs (or, in the second scenario, with respect to other scalar outputs) are a new requirement.

svigerske commented 4 months ago

There is sIpopt: https://coin-or.github.io/Ipopt/SPECIALS.html#SIPOPT
However, I haven't heard from the developer for a while, and it is probably not well tested. But you could take a look at it anyway; it may just do what you need.

This is available for the C++ and AMPL interfaces only. The parametric_cpp example should be the C++ implementation of the example in the documentation, which gives some hints on how to use it. Essentially, you have to declare your parameters A, B, C as free variables and then specify in the metadata (TNLP::get_var_con_metadata()) which variables belong to parameters and what value they have. This means that your derivative callbacks will need to include derivatives of your functions (objective, constraints) with respect to these variables.
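As a rough sketch of the metadata step: the snippet below fills per-variable metadata maps the way the parametric_cpp example does, assuming the sIpopt keys `sens_state_1` / `sens_state_value_1` from the sIpopt documentation. The typedefs only mirror Ipopt's `TNLP` metadata map types (in real code they come from `IpTNLP.hpp`), and `fill_sens_metadata` is a hypothetical helper, not part of the Ipopt API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-ins mirroring Ipopt's TNLP metadata map typedefs (IpTNLP.hpp).
using IntegerMetaDataMapType = std::map<std::string, std::vector<int>>;
using NumericMetaDataMapType = std::map<std::string, std::vector<double>>;

// Hypothetical helper: suppose the NLP has n variables, where variable
// param_index holds the parameter (declared as a variable and fixed via its
// bounds). Mark it as sIpopt "state" 1 and record the perturbed parameter
// value that sensitivities should be computed for. This would be called from
// inside TNLP::get_var_con_metadata().
void fill_sens_metadata(int n, IntegerMetaDataMapType& var_integer_md,
                        NumericMetaDataMapType& var_numeric_md,
                        int param_index, double perturbed_value)
{
    std::vector<int> state(n, 0);
    std::vector<double> state_value(n, 0.0);
    state[param_index] = 1;                   // this variable is parameter #1
    state_value[param_index] = perturbed_value;
    var_integer_md["sens_state_1"] = state;
    var_numeric_md["sens_state_value_1"] = state_value;
}
```

The key names and the 0/1 marking scheme are taken from the sIpopt docs and the parametric_cpp example; double-check against your Ipopt version before relying on them.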

Class SensApplication has a method GetDirectionalDerivatives(). Maybe this gives you the derivatives you are looking for. Reading the corresponding paper may help you understand how this code works.

a-jp commented 4 months ago

Hi, thank you for this. I'll take a read. This would be quite integral to what I need to do so I guess I was probably looking for something more supported/integrated - but I'll take a look. Currently I just call my solve twice and perform a finite difference.... I'm using CppAD with IPOPT. @bradbell is the above something that you think I could obtain from CppAD?
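For reference, the "solve twice and finite difference" approach mentioned above can be sketched as below. `solve_for_X` is a hypothetical stand-in for the full CppAD/IPOPT solve; here it returns a closed-form "optimum" for a toy problem so the sketch is runnable, but in practice it would re-run the optimisation with the perturbed parameter.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the real solve: returns the optimal X for given
// scalar parameters A and B. Toy closed form: X1 = A*B, X2 = A + B^2.
std::vector<double> solve_for_X(double A, double B)
{
    return {A * B, A + B * B};
}

// dX/dA at constant B via a central finite difference of two solves.
std::vector<double> dX_dA(double A, double B, double h = 1e-6)
{
    std::vector<double> Xp = solve_for_X(A + h, B);
    std::vector<double> Xm = solve_for_X(A - h, B);
    std::vector<double> d(Xp.size());
    for (std::size_t i = 0; i < d.size(); ++i)
        d[i] = (Xp[i] - Xm[i]) / (2.0 * h);
    return d;
}
```

The central difference costs two extra solves per parameter and its accuracy depends on the solver's convergence tolerance, which is presumably why a derivative extracted directly from the converged KKT system would be preferable.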

bradbell commented 4 months ago

> IPOPT is not aware of the input pairs specifically (A and B, or B and C) because I don't know how to do that, but my functions use them to solve the problem and they are constant scalars. I have a function, G(A, B), which I solve for the minimum of using IPOPT (with some equality constraints). At convergence I obtain the solution to my problem, the vector X. X is X(A, B). This works well and my code behaves as expected.

I think that you can get what you want for a parameter A by making it a variable in the optimization problem, constraining it to the value you want, and then using the corresponding Lagrange multiplier; see "Interpretation of the Lagrange multipliers" on https://en.wikipedia.org/wiki/Lagrange_multiplier#Interpretation_of_the_Lagrange_multipliers
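A minimal numeric check of this interpretation, on a made-up toy problem (not the poster's actual problem): minimize f(x, a) = (x - 2a)^2 + x^2 over x, with the parameter a fixed by the constraint a = a0. With the sign convention L = f + lambda*(a - a0), the envelope theorem gives df*/da0 = -lambda, which the sketch verifies against a finite difference of the optimal value.

```cpp
#include <cassert>
#include <cmath>

// Toy problem: minimize f(x, a) = (x - 2a)^2 + x^2 over x, with a fixed at
// a0. Setting df/dx = 0 gives x*(a0) = a0, so the optimal value is
// f*(a0) = (a0 - 2*a0)^2 + a0^2 = 2*a0^2.
double f_star(double a0) { return 2.0 * a0 * a0; }

// Stationarity of L = f + lambda*(a - a0) in a:
//   dL/da = -4*(x - 2a) + lambda = 0  =>  lambda = 4*(x* - 2*a0) = -4*a0.
double multiplier(double a0) { return 4.0 * (a0 - 2.0 * a0); }

// Envelope theorem with this sign convention: df*/da0 = -lambda.
double dfstar_da0_from_multiplier(double a0) { return -multiplier(a0); }
```

Note this recovers the derivative of the optimal objective value with respect to the parameter, not the derivative of the solution vector, which is the distinction raised in the next comment.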

a-jp commented 4 months ago

@bradbell thank you for this. To allow for a concrete example to help my understanding, could either @bradbell or @svigerske provide an example of how to obtain the Lagrange multipliers from a successful IPOPT optimisation? I know how to add a variable and use make_parameter with the same upper and lower bound to make it fixed for the optimisation, but how do I extract these derivatives as a function of that parameter?

@bradbell is there a way to do this in CppAD such that I can compute a reference solution to compare against the derivatives coming from IPOPT or debugging etc? Thanks

svigerske commented 4 months ago

The Lagrange multipliers are the dual solution values that you get in the finalize_solution call; in particular, the ones for the variable bounds (z_L, z_U) are relevant here. But I think they only give you a derivative of the objective function value (G/H) with respect to your parameter (A, B, C), not of the solution point (X) with respect to the parameters.
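To make the extraction step concrete: in a `TNLP` subclass, `finalize_solution` is the Ipopt callback (declared in `IpTNLP.hpp`) that receives the dual arrays. The sketch below only mimics its dual arguments with stub typedefs to show what to copy out; the `DualStore` type and `capture` helper are hypothetical, not Ipopt API.

```cpp
#include <cassert>
#include <vector>

// Stand-ins for Ipopt's typedefs (Number is double, Index is int).
using Number = double;
using Index = int;

// Hypothetical holder for the duals that Ipopt passes to
// TNLP::finalize_solution: z_L/z_U are the multipliers for the lower/upper
// variable bounds, lambda are the multipliers for the constraints g(x).
// In a real TNLP subclass you would call capture() from inside
// finalize_solution with the arrays Ipopt hands you.
struct DualStore {
    std::vector<Number> z_L, z_U, lambda;

    void capture(Index n, const Number* zl, const Number* zu,
                 Index m, const Number* lam)
    {
        z_L.assign(zl, zl + n);
        z_U.assign(zu, zu + n);
        lambda.assign(lam, lam + m);
    }
};
```

For a parameter variable fixed by equal lower and upper bounds, its entries of z_L and z_U are the multipliers bradbell's suggestion refers to.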

bradbell commented 4 months ago

Is your problem smooth, and can you think of the optimum as the solution of an implicit equation where the gradient of the objective is equal to zero? If so, perhaps you can apply the implicit function theorem to get what you want: https://en.wikipedia.org/wiki/Implicit_function_theorem
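A hedged sketch of this idea on a smooth unconstrained toy problem (chosen for illustration, not the poster's problem): minimize G(x, A) = (x1 - A)^2 + (x1 - x2)^2. The optimality condition grad_x G(x*, A) = 0 implicitly defines x*(A); differentiating it in A gives H * dx*/dA = -d(grad_x G)/dA, where H is the Hessian of G in x. Here x*(A) = (A, A), so dx*/dA should come out as (1, 1).

```cpp
#include <cassert>
#include <cmath>

// Solve the 2x2 linear system H * v = rhs by Cramer's rule.
void solve2x2(const double H[2][2], const double rhs[2], double v[2])
{
    double det = H[0][0] * H[1][1] - H[0][1] * H[1][0];
    v[0] = (rhs[0] * H[1][1] - rhs[1] * H[0][1]) / det;
    v[1] = (H[0][0] * rhs[1] - H[1][0] * rhs[0]) / det;
}

// dx*/dA for G(x, A) = (x1 - A)^2 + (x1 - x2)^2 via the implicit
// function theorem applied to grad_x G = 0.
void dxstar_dA(double v[2])
{
    // Hessian of G in x: [[4, -2], [-2, 2]].
    double H[2][2] = {{4.0, -2.0}, {-2.0, 2.0}};
    // d(grad_x G)/dA = (-2, 0), so the right-hand side is (2, 0).
    double rhs[2] = {2.0, 0.0};
    solve2x2(H, rhs, v);
}
```

For an equality-constrained problem the same construction applies to the full KKT system, with the KKT matrix (which IPOPT already factorizes at the solution) in place of H; that is essentially what sIpopt implements.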

a-jp commented 4 months ago

@bradbell thanks. I guess I could consider that. Being pragmatic though, since I've gone through the effort to solve a CppAD/IPOPT problem it would be great to use information already computed to get me what I need.

> The Lagrangian multipliers are the dual solution values that you get in the finalize_solution call, in particular, the one for the variable bounds (z_L, z_U) are relevant here. But I think they only give you a derivative of the objective function value (G/H) with respect to your parameter (A,B,C), not of the solution point (X) w.r.t. parameters.

@bradbell did you have a different take on this, since you originally suggested the Lagrange multipliers?

@svigerske thanks. I very much need dX/dA and dX/dB, so that will not work, I guess....

a-jp commented 1 week ago

Hi, hoping to kick-start this again. It felt like there was going to be a solution here, particularly if it could be determined that IPOPT does indeed have the partial derivatives at convergence as part of its solution process. Can anyone help reinvigorate this to find an answer? Thanks