Closed a-jp closed 1 year ago
The problem here is that the minimizer w.r.t x will depend on scalar, so if you change the value of scalar, you will have to recompute the minimizer in order to evaluate your objective ?
Thanks. That's ok. I appreciate it's very expensive but allows me to solve something without the worry on whether I've chosen an eps for finite difference that is overly affecting the answer. Would you be able to show me how to do it?
I am still not sure I understand our problem. I think that you want to solve the problem
minimize f(x) w.r.t. x
subject to L6 <= x6 <= U6
and then compute the partial of f(x) w.r.t. x6. This will just be the positive (negative) of the Lagrange multiplier for x6 at the optimal solution if L6 (U6) is active at the solution
I think what I'm trying to do, is the following:
x = guess, x is a vector
minimise Func(x, R) w.r.t to x, here R is a constant scalar for any given minimisation
Once I've solved the minimisation of Func
, which is a scalar value, and obtained x
, then I need to solve the following using the optimised x
:
Given: $$B = \sum{i} x{i}(R) b_{i}(R) $$
Compute:
$$ \frac{\partial}{\partial R} B = \sum{i} x{i}(R) \frac{\partial b{i}(R) }{\partial R} + \sum{i} b{i}(R) \frac{\partial x{i}(R) }{\partial R} $$
I've computed the first partial derivative using cppAD and compared this to a known result and it's exact, code to do so as follows:
const std::size_t n = 1;
std::vector<ADNumber> ax(n);
ax[0] = R; // constant known scalar as noted above
CppAD::Independent(ax);
const std::size_t m = 1;
std::vector<ADNumber> B(m);
B[0] = 0.0;
for (std::size_t s = 0; s < size; s++)
{
ADNumber b = 0.0;
Compute_b_term(ax[0], b); // this takes ax[0] as a const&, and b is taken as a non-const& and is assigned inside the function
// Note! x comes from solution.x (after ipopt call completes) and is simply a std::vector<double>, not an AD type
B[0] += x[s] * b; // here x is the solution vector coming from ipopt after minimization
}
CppAD::ADFun<double> f(ax, B);
// compute derivative using operation sequence stored in f
// std::vector<double> jac(m * n); // Jacobian of f (m by n matrix)
std::vector<double> val(n); // domain space vector
val[0] = R; // argument value for computing derivative
const auto &first_partial_derivative = f.Jacobian(val); // Jacobian for operation sequence
I've no idea how to compute the second partial derivative above...could you provide any advice? I assume it requires me to use similar code to the above, but to use Func
somehow? I don't know if I can just compute the value of Func
one more time with the optimised values of x
, outside of the optimiser and record this in AD types, and then use similar code to the above but now with B
as the size of x
? Bit stuck...
Suppose we are given the unconstrained poblem
$$ {\rm minimize} ~ f(x, R) \~ {\rm w.r.t} ~ x $$
Define $x(R)$$ as the solution corresponding to $R$, then this function can be defined implicitly by
$$ 0 = \partial_x f[ x(R), R] $$
because the derivative is zero at an unconstrained optimum.
The problem here is computing the derivatives of x(R) w.r.t. R. I suggest you see https://www.seanet.com/~bradbell/newton_step.htm
Looks like you're missing double-$ at the end of the line?
Many thanks for this, that's going to take some time to digest. I should have added, and omitted because I was unaware of where this was going, that my problem is a constrained optimisation problem (equality and inequality).
Can I ask another question, which is related as it came up setting up the finite difference version of this, which I should add seems to work and produces answers that are reported elsewhere for my problem. Using my nomenclature above, I am doing:
$$ \frac{\partial}{\partial R} B \approx \frac{ Func(x, R+eps) - Func(x, R) }{eps} $$
To get this to work using as a starting point this example and since I need to call the optimiser twice per derivative that I need, and I need those derivatives at a lot of R
s, I moved a lot of the code as possible to a one-off set up function, and then call the minimum amount when I need the derivative. Ideally no construction and destruction of the big objects, was my aim. I found two thing out:
retape = true
because within the functions that get called in the optimiser I do things like if R < R_1
. I believe the way to fix this is to use CondExpLt, which I'm working on to fix. Am I correct?x_i
to a guess, as required. I then as per the above reasoning, do a lot of one-off set up. Outside of computing derivatives and just normal optimisation calls, I've found previously with my python implementation using ipopt, that I do sometimes need to make a better initial guess. There is also a use case for a better initial guess after setup in the above partial derivative, since the result of the first optimiser call can be used as the initial guess for the second call, to reduce the number of iteration in ipopt. I've found, that once the Ipopt::TNLP
object is constructed, no further changes to the initial values of x_i
can be made use of. The values used for x_i
are the values used on construction of Ipopt::TNLP
. Have I figured that out correctly? Is that integral to how it works, or something that can be changed?Many thanks,
- To vary R which I must do, not only for perturbing by eps (~1.0e-08), and for different R's, I need to ensure
retape = true
because within the functions that get called in the optimiser I do things likeif R < R_1
. I believe the way to fix this is to use CondExpLt, which I'm working on to fix. Am I correct?
Yes
- This second one confused me. I do initialise
x_i
to a guess, as required. I then as per the above reasoning, do a lot of one-off set up. Outside of computing derivatives and just normal optimisation calls, I've found previously with my python implementation using ipopt, that I do sometimes need to make a better initial guess. There is also a use case for a better initial guess after setup in the above partial derivative, since the result of the first optimiser call can be used as the initial guess for the second call, to reduce the number of iteration in ipopt. I've found, that once theIpopt::TNLP
object is constructed, no further changes to the initial values ofx_i
can be made use of. The values used forx_i
are the values used on construction ofIpopt::TNLP
. Have I figured that out correctly? Is that integral to how it works, or something that can be changed?
Are you using ipopt_solve for your optimization ? https://coin-or.github.io/CppAD/doc/ipopt_solve.htm
I am going to move this issue to a discussion.
Hi, I'm new to cppad, but not new to ipopt or the concept of AD. In trying to explain my question, I was going to use this example. My question (sorry if title is not quite right), let's assume that the objective also depended on another scalar:
I'd like to understand how to obtain the jacobian representing d(xi)/d(scalar), but here the xi is only obtained after minimising via ipopt. Is that possible, does my question make sense? scalar can be any type as needed. So I guess in finite difference terms, I'd compute xi(scalar+eps) via an ipopt call and then xi(scalar) via another ipopt call, then compute the FD Jac as (xi(scalar+eps) - xi(scalar))/(eps). I'm hoping instead to do this by AD...
Many thanks, Andy