dthierry / k_aug

KKT matrix sensitivity and Reduced Hessian computation
BSD 3-Clause "New" or "Revised" License

Inconsistency between RH and dsdp updates #53

Open Robbybp opened 3 years ago

Robbybp commented 3 years ago

https://github.com/dthierry/k_aug/blob/6f451111bfbe1369af7a438d8d6a5ac26d147e71/src/k_aug/dot_driver/dot_driver.c#L376

In dot_driver, RH mode (npdp_strategy) performs the sensitivity update with ALPHA = -1.0, whereas dsdp mode performs the update with ALPHA = 1.0. As the code is currently written, I believe ALPHA = -1.0 is correct for both cases. That way, the correct update occurs with

DeltaP = perturbed_value - original_value

in both cases. As currently written, the above is only correct for RH mode; DeltaP must have the opposite sign from the above for dsdp mode. The reason we need this minus sign (ALPHA = -1.0) is that in k_aug, we calculate and store the negative sensitivity matrix. I propose we fix this by calculating the positive sensitivity matrix and switching ALPHA to +1.0 for both modes. This way perturbed - original can be used for both updates, which I think is logical.

Let me know if I have this wrong. What I'm saying here is based on my experience getting tests to pass in https://github.com/Pyomo/pyomo/pull/1613.

dthierry commented 3 years ago

I think you are right. I vaguely remember that there was actually a great reason for having ALPHA = 1.0, but I can't remember it now.

Maybe we need to note that in a comment in dot_driver.c and print a warning saying that this has changed, because the MHE people are assuming that ALPHA = 1.0.

Robbybp commented 3 years ago

Yeah printing a warning is probably the right call. Do you know off the top of your head if the MHE people are using the opposite perturbation (original - perturbed) or the opposite constraint (param - var == 0)? I still want to take derivatives wrt sensitivity variables via ASL so we don't have to rely on the form of the constraint expression.

Robbybp commented 3 years ago

Also what do you think about storing the positive sensitivity matrix?

dthierry commented 3 years ago

They are doing original - perturbed. I think you can do that, if you declare the vector of p as variables and then pass their positions to k_aug with the suffixes. So assuming p appears linearly: You need to partition grad_c into [gradx_c^T gradp_c^T]^T, then permute either gradx_c or gradp_c so they match the positions of c (with suffixes). Then assemble the new K:

K = [hess_XX L, gradx_c^T;
     gradx_c,  0],
R = [0^T, gradp_c^T]^T

If p appears nonlinearly in c or f, though, you also need to do this with hess_L := [hess_XX, hess_XP; hess_PX, hess_PP] so you can extract the blocks that you want (hess_XX and hess_PX). I suppose you have to do this anyway for the linear-in-p case.

So to summarise, you could use suffix magic to partition and re-assemble.
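Putting the partitioning above together, the first-order sensitivity system would look like the standard NLP sensitivity setup (a sketch of that textbook form, not necessarily k_aug's exact internal layout):

```latex
% Partitioned KKT sensitivity system. For p appearing linearly in c and f,
% the hess_XP block vanishes and the right-hand side reduces to gradp_c alone.
\begin{bmatrix}
  \nabla^2_{xx} L & \nabla_x c^{T} \\
  \nabla_x c      & 0
\end{bmatrix}
\begin{bmatrix}
  \partial x / \partial p \\
  \partial \lambda / \partial p
\end{bmatrix}
= -
\begin{bmatrix}
  \nabla^2_{xp} L \\
  \nabla_p c
\end{bmatrix}
```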

You could even partition K into small [K_i, ....] and do some decomposition stuff....

dthierry commented 3 years ago

Also what do you think about storing the positive sensitivity matrix?

What do you mean?

Robbybp commented 3 years ago

Yeah that's what I was thinking about for the variables - you'll just need to know which variables are sens vars, instead of which constraints.

Right now I think the matrix that is stored in dsdp_in_.in, assuming the constraint is written var - param == 0, is K^-1 R. The sensitivity matrix according to the implicit function theorem is -K^-1 R. So when I say "positive sensitivity matrix," I mean this latter quantity. If we stored this matrix, we would perform a "perturbed minus original" update with ALPHA=1.0, which I think makes more sense.
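For reference, the minus sign in "-K^-1 R" comes directly from the implicit function theorem applied to the KKT conditions; a sketch, writing s for the primal-dual point:

```latex
% KKT conditions as F(s, p) = 0. Differentiating with respect to p:
%   K (ds/dp) + R = 0, where K = \partial F / \partial s, R = \partial F / \partial p,
% so the "positive" sensitivity matrix is
\frac{\mathrm{d}s}{\mathrm{d}p} = -K^{-1} R
```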

dthierry commented 3 years ago

I see. I agree, we should save that matrix instead.