OFR-IIASA opened 1 year ago
Here are my promised 2ct to the discussion on duality at today's message-ix mtg. I am afraid that applying bounds (assuming at least some of them are active at the solution) may hide the problem rather than solve it. Such bounds shrink the feasible polyhedron; therefore, the solution will differ from the solution of the problem without bounds. The latter might not be a problem, because bounding annual emissions may have advantages. However, the approach will not improve the "stability" of the dual solution; for the latter there is, unfortunately, no robust remedy.
The actual problem comes from features of optimization that are generally known but often forgotten. Namely, solvers provide an optimal solution that minimizes the objective; they don't care about the stability of solutions unless dedicated approaches, such as regularization, are implemented. Thus, for a given feasible set, the objective values for different runs (with different solvers/options) should be "almost the same". However, the corresponding solutions (in the spaces of variables, both primal and dual) usually differ substantially; this occurs especially for dual variables. IPM (the barrier method) is primal-dual; therefore, the chances of smaller differences in dual variables are higher than for simplex methods. This advantage, however, is likely lost if crossover is applied to the optimal IPM solution to move it from the interior of the feasible region to a vertex. The values of dual solutions before and after crossover will almost certainly differ substantially.
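To make this concrete, here is a minimal sketch (an illustrative toy LP, not MESSAGEix) comparing the duals reported by a simplex and a barrier method via SciPy's HiGHS interface. Whether the two reports actually differ depends on the solver version and crossover settings, but any values returned are equally valid duals:

```python
# Toy LP with all three constraints active at the unique optimum (1, 1),
# so the dual solution is non-unique:  min -(x+y)  s.t.  x<=1, y<=1, x+y<=2.
from scipy.optimize import linprog  # requires a recent SciPy with HiGHS

c = [-1.0, -1.0]
A_ub = [[1, 0], [0, 1], [1, 1]]
b_ub = [1, 1, 2]

for method in ("highs-ds", "highs-ipm"):   # dual simplex vs. barrier
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, method=method)
    print(method, "x =", res.x, "obj =", res.fun,
          "duals =", res.ineqlin.marginals)
```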
Summing up: if we use dual solutions (which, obviously, is common practice despite the problems summarized above), we should be prepared to face the consequences of their features. These are likely to be even more serious for badly conditioned problems.
I am sorry for stressing the issue without being able to propose a constructive solution. However, maybe my comments will be somehow useful anyway.
Thank you, Marek, for your important comments. I understand that the optimal primal/dual variables are not necessarily unique when the corresponding coefficients (c_i of c_i x_i, or b_j of b_j y_j) are zero. My question is whether the optimal variables with non-zero coefficients can also be unstable. Could you give me any comments on that?
@th-hara: The non-uniqueness of both primal and dual solutions is caused mainly by the fact that solutions differ (often substantially) for "close" values of the objective function. The "mainly" stands for "an obvious explanation"; other reasons include numerical properties of the MMP (e.g., the condition number) or faults in the MMP formulation that "obviously" cause non-uniqueness.
I don't know why/how zero/non-zero values of any coefficient could explain the non-uniqueness of solutions. Actually, optimization stops when the optimality and feasibility conditions are met; this occurs at different points (in terms of values of model variables) of the solution space.
@marek-iiasa, thank you for your reply. I understand your explanation of the cause of the substantial variety of solutions with almost the same optimal objective value. Intuitively, this could happen if the constraint that contains the optimal extreme point is nearly "parallel" to the objective function; is that correct? In the context of energy systems analysis, this could happen if some technologies have almost the same cost. I have experience analyzing "near-optimal solutions" to explore a wide variety of technical options with almost the same system cost.
I meant that degeneracy causes the non-uniqueness of solutions, and one of the causes of degeneracy is zero-valued parameters. For example, in a power system model with hourly time resolution, the variables that represent the amount of charge/discharge of storage at each time step must have non-unique values in the optimal solution, because the cost coefficients of such variables are zero in the objective function (see the sketch below).
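A tiny sketch of that effect (an illustrative dispatch LP, not a message_ix model): storage charge/discharge carries zero cost, so two different schedules are both optimal:

```python
# 2-period dispatch LP with free, lossless storage.
# Variables: g1, g2 (generation, unit cost 1), ch1 (charge in t1), dis2 (discharge in t2).
import numpy as np
from scipy.optimize import linprog

c = [1.0, 1.0, 0.0, 0.0]          # storage actions have zero cost
A_eq = [[1, 0, -1, 0],            # g1 - ch1        = demand in t1
        [0, 1,  0, 1]]            # g2        + dis2 = demand in t2
b_eq = [1.0, 1.0]
A_ub = [[0, 0, -1, 1]]            # dis2 <= ch1: can only discharge what was stored
b_ub = [0.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, method="highs")
print("solver optimum:", res.x, "cost =", res.fun)

# A different point with identical cost: generate 2 in t1, store 1, discharge it in t2.
# Both points are optimal, so the solution is degenerate/non-unique.
alt = np.array([2.0, 0.0, 1.0, 1.0])
print("alternative   :", alt, "cost =", float(np.dot(c, alt)))
```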
@th-hara: we are just touching a rather complex problem; therefore I would be careful with general explanations. On the one hand, we have the typical LP or NLP examples shown in books or lectures on math programming, which are "well-behaved" and thus serve for easy explanations of, e.g., the KKT optimality conditions; such examples have, of course, unique solutions (both primal and dual) and a "clear" optimal point. For illustrating degeneracy, one typically shows a small two-variable problem with three constraints (all having clearly different gradients, also different from the gradient of the objective), where all these constraints are active at the optimal point. In such cases the primal solution is unique but the dual is not.
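For the record, a quick numeric check of exactly that textbook case (with my own choice of numbers: min -(x+y) s.t. x<=1, y<=1, x+y<=2): the primal optimum (1, 1) is unique, while a whole segment of dual vectors satisfies the KKT conditions:

```python
import numpy as np

A = np.array([[1.0, 0.0],           # gradients of the three constraints
              [0.0, 1.0],
              [1.0, 1.0]])
c = np.array([-1.0, -1.0])          # objective gradient (minimization)

# All constraints are active at x* = (1, 1); any u = (1-t, 1-t, t) with
# t in [0, 1] is a valid dual vector: stationarity c + A^T u = 0 holds.
for t in (0.0, 0.5, 1.0):
    u = np.array([1.0 - t, 1.0 - t, t])
    print("u =", u, "||c + A^T u|| =", np.linalg.norm(c + A.T @ u))
```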
On the other hand, reality is of course much more complicated. Your arguments illustrate a subset of problems with the uniqueness of solutions of optimization problems. Generally, large problems typically have a flat boundary of the set of feasible solutions, which actually corresponds to a set of suboptimal solutions. The optimization stops somewhere on such a flat area, where the gradient of the objective is close to the gradients of the constraints defining that area. In other words, the solution declared optimal is actually a suboptimal solution. The objective values at suboptimal solutions differ by the optimality tolerance (i.e., the differences are very small). However, the distance between suboptimal solutions (including the (unknown) "true" optimal point, which is at a vertex of the flat area of suboptimal solutions) is often large, both for primal and dual variables.
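A two-line illustration of that last point (my own toy numbers, not from any model): two feasible points whose objective values differ by far less than a typical optimality tolerance can still be far apart in variable space:

```python
# min x + (1 + eps) * y  s.t.  x + y >= 1,  x, y >= 0
import numpy as np

eps = 1e-8

def obj(p):
    return p[0] + (1 + eps) * p[1]

p1 = np.array([1.0, 0.0])   # objective = 1.0
p2 = np.array([0.0, 1.0])   # objective = 1.0 + 1e-8
print("objective gap:", abs(obj(p1) - obj(p2)))   # ~1e-8, below tolerance
print("distance     :", np.linalg.norm(p1 - p2))  # ~1.41, large
```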
@marek-iiasa: Thank you for your thorough comments. I recently ran a model (MESSAGEix-Material-Transport) with the same 1.5 °C-equivalent carbon constraints imposed in each period, but with different LP algorithms (barrier and dual simplex; barrier in 5 min and dual simplex in 12 hours!) to obtain the solution. I observed that the carbon prices (PRICE_EMISSION) were the same, while some dual variables (PRICE_COMMODITY) and primal variables (ACT) differed between the two solutions. This is just for your information.
Unfortunately, this will have to be postponed until we find time for a proper cleanup.
When applying a cumulative budget (as the parameter `bound_emission`), `PRICE_EMISSION` is calculated in `model_solve.gms`. The resulting `PRICE_EMISSION`, if applied to a scenario using the parameter `tax_emission`, should yield the same results as the cumulative budget constraint. This is not the case. The issue is that the objective function, which is the sum over `COST_NODAL` (which in turn incorporates `tax_emission`), is multiplied only by the parameter `df_period`, in contrast to `PRICE_EMISSION`, which is multiplied by `df_year`. The resulting marginals of the equation `EMISSION_EQUIVALENCE` for the two approaches show a difference when switching from 5- to 10-year time steps (in the case of the MESSAGEix-GLOBIOM model), which could indicate a problem when calculating the price where the period duration changes. Specifically, in the calculation of `PRICE_EMISSION` for a cumulative budget, the marginals from the GAMS solution are rescaled to the `df_year` and `df_period` of 2015 (based on the parameter `map_first_period` for `cumulative`), and then multiplied by the `df_year` of each model year. Possibly the issue is that the calculation of `df_year` is not consistent when the period duration changes, in this case from 5 to 10 years. In the Excel file, `df_year` for each year is calculated based on the previous period. As long as the period length of two consecutive periods is the same, this works well. But when transitioning from 5-year to 10-year periods, the calculation yields an issue. A solution could be to add a correction factor that divides the existing calculated `df_year` by what it would be if the same period length were applied (see the numeric sketch below).

The attached xlsx file has examples of how a correction factor can be derived and applied to the `PRICE_EMISSION` price trajectory in order to yield the same results. Calculate_price_emission_1.xlsx
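A rough numeric sketch of the inconsistency and the proposed correction factor (assumed 5% discount rate; `df_naive`/`df_true` are illustrative names, and interpreting `df_year` as a factor built from the previous period's length is my assumption, not the actual `model_solve.gms` logic):

```python
RATE = 0.05
YEARS = [2020, 2025, 2030, 2040, 2050]   # period length changes from 5 to 10 years

def df_naive(years):
    """df_year built from the *previous* period's length (the flawed rule described above)."""
    df, out, prev_len = 1.0, {years[0]: 1.0}, None
    for prev, cur in zip(years, years[1:]):
        step = prev_len if prev_len else cur - prev   # reuses the old period length
        df /= (1.0 + RATE) ** step
        out[cur] = df
        prev_len = cur - prev
    return out

def df_true(years):
    """df_year discounted by each year's actual distance from the base year."""
    return {y: (1.0 + RATE) ** -(y - years[0]) for y in years}

naive, true = df_naive(YEARS), df_true(YEARS)
for y in YEARS:
    # the two agree while period lengths match; from 2040 on, dividing the
    # naive value by the printed correction factor recovers the true one
    print(y, f"naive={naive[y]:.4f}", f"true={true[y]:.4f}",
          f"correction={naive[y] / true[y]:.4f}")
```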