atoptima / Coluna.jl

Branch-and-Price-and-Cut in Julia
https://www.atoptima.com
Other
192 stars 43 forks source link

Broken MIP support #550

Open cs-sid opened 3 years ago

cs-sid commented 3 years ago

Short Description Coluna's MIP support is broken. If you submit a model with continuous variables, Coluna in many cases gives a wrong solution, or labels the problem as infeasible (despite a valid solution existing).

Environment Coluna: Version 0.3.9 | 2021-05-15 | https://github.com/atoptima/Coluna.jl OS: Windows 10

Example and Details The following file contains an example that shows how coluna handles problems with continuous variables. issue.txt

In the example, a simple min cost flow problem is setup that has an obvious solution. The problem is setup and solved in the function solveFlowModel(f1isInteger, coluna), which takes 2 parameters. If f1isInteger=true then it sets up a MIP model where one variable (f[1]) must be integer, if f1isInteger=false then all variables are continuous. Of course in the latter case one could simply use the Simplex method, but the model of Coluna should be general enough to handle MIP and purely continuous problems. Also it helps to analyse and understand the issue. The second parameter coluna is the coluna solver that the model shall use. The file provides two methods to setup the solver, first colunaOptimizerForTreeSearch(useModifiedHeuristic) that uses the TreeSearchAlgorithm. The conquer algorithm in TreeSearch (ColCutGenConquer) uses a heuristic. If useModifiedHeuristic=false, then the default heuristic is used (SolveIpForm with enforcing integrality), if useModifiedHeuristic=true it uses the SolveIpForm as heuristic without enforcing integrality. The second method to setup the solver is colunaOptimizerForColumnGeneration, which uses the ColumnGeneration algorithm directly.

The example min cost flow problem contains 3 flow variables which we call f[1], f[2] and f[3]. In the Julia-code they actually are defined as f[1,1], f[2,1], f[3,1], where the second dimension is the axis with only one entry. The latter is done so that the BlockDecomposition creates a subproblem for pricing. The resulting setup is that the subproblem/pricing problem is the min-cost-flow problem without the capacity constraints and the master problem keeps the capacity constraints.

Purely continuous Model

Let us first consider the purely continuous model. In this situation the correct solution is f[1] = 8.5, f[2] = f[3] = 1.6 and cost = 160.

Standard tree search

Using the tree search with the default heuristic gives the following, invalid solution.

solveFlowModel(false, colunaOptimizerForTreeSearch(false))

...

termination: OPTIMAL
primal_status: FEASIBLE_POINT

cost = 1010.0   (expected: 160)
flows:
   f[1] = 0.0  (expected: 8.5)
   f[2] = 10.1  (expected: 1.6)
   f[3] = 10.1  (expected: 1.6)

Observe that coluna reports that this is an optimal solution, which it in fact is not. So what happened? I debuged into the code to analyse the issue. All following and later observations made by me in the code are based on my, quite limited, understanding of the Coluna code. It might therefore be that the analysis is incomplete or I might have understood things wrong (more comments in the code would help). When debugging into the code, it showed that the column generation algorithm generates 2 columns in 3 iterations. The first column is f[1,1] = 10.1, f[2,1] = f[3,1] = 0, the second one is f[1,1] = 0, f[2,1] = f[3,1] = 10.1. The correct solution therfore corresponds to (8.5/10.1)first_column + (1.6/10.1)second_column. This is exactly what the column generation algorithm also finds. However, this solution is non-integer. The issue seems to be that in the column generation algorithm a valid solution for the model needs to be completely integer. This oberservation might be also true throughout Coluna, but I am not sure. In the code a "valid problem solution" is called ip_primal_sol or variations of that. This indicates the assumption of a integer solution, but it is unclear, at least to me, whether this assumption applies everywhere in the code. In the column generation algorithm the assumption that a valid solution must be purely integer is explicit in line 663 in colgen.jl:

                 if isinteger(proj_sol) && isbetter(lp_bound, get_ip_primal_bound(cg_optstate))

The code explicitly checks for whether the solution is a purely integer solution. As this is not the case for our problem, the column generation will finish without a solution to return. This can be seen by simply running the column generation algorithm directly:

solveFlowModel(false, colunaOptimizerForColumnGeneration())

...

termination: OPTIMAL
primal_status: NO_SOLUTION

cost = Inf   (expected: 160)
flows:
   f[1] = NaN  (expected: 8.5)
   f[2] = NaN  (expected: 1.6)
   f[3] = NaN  (expected: 1.6)

So how did we end up with the wrong solution above if column generation does not return a solution? Well, the conquer algorithm runs a heuristic after the column generation. That heuristic uses SolveIpForm with enforcing integrality. But it runs that not just on the original model, but on the generated columns as well. It is unclear to me whether SolveIpForm works on both the generated columns and the model variables or only on the columns, but that is not important. The important thing is that the solver then looks for a single column that provides a feasible solution to the problem (single due to integrality constraint). As the first column does not provide a feasible solution to the problem (the arc for f[1] has capacity 8.5), only the second column is feasible (no capacity on the other arcs). Thus the second column provides a feasible solution, and is thus returned and falsely labeled as optimal solution.

If one introduces a capacity of 5 on the arc for f[2], then no single column is by itself feasible and Coluna would return that the problem is infeasible and the flows would be NaN.

Standard tree search with heuristic not enforcing integrality

Ok, so if the integrality constraint on the columns in the SolveIpForm heuristic leads to the wrong solution, why not just remove it? Let us see:

solveFlowModel(false, colunaOptimizerForTreeSearch(true))

...

termination: OPTIMAL
primal_status: FEASIBLE_POINT

cost = 159.99999999999991   (expected: 160)
flows:
   f[1] = 8.500000016  (expected: 8.5)
   f[2] = 1.5999999839999999  (expected: 1.6)
   f[3] = 1.5999999839999999  (expected: 1.6)

Yeah, that works (within some floating point precision). The SolveIpForm now can correctly return a linear combination of the two columns. Ok fine, but what if we now introduce integer variables (remember, so far all were continuous).

MIP Model (f[1] is integer)

Here the correct solution is f[1] = 8, f[2] = f[3] = 2.1 and cost = 210.

Standard tree search

Well, let's see:

solveFlowModel(true, colunaOptimizerForTreeSearch(false))

...

termination: OPTIMAL
primal_status: FEASIBLE_POINT

cost = 1010.0   (expected: 210)
flows:
   f[1] = 0.0  (expected: 8)
   f[2] = 10.1  (expected: 2.1)
   f[3] = 10.1  (expected: 2.1) 

Again, we get the wrong solution, in fact exactly the same solution as in the continuous case. The column generation here also returns 2 columns. The first is f[1] = 10, f[2] = f[3] = 0.1 and the second is f[1] = 0, f[2] = f[3] = 10.1. The first column is slightly different due to the integrality constraint on f[1], but the second column is exactly the same as in the purely continuous model. The problem described in the purely continuous case is also occuring here, the column generation returns no solution and the final SolveIpForm returns the only feasible single column as wrong optimal solution.

Well, that was expected after what we observed in the purely continuous case, so why not just solve it by removing the integrality constraint on the heuristic (SolveIpForm)?

Standard tree search with heuristic not enforcing integrality

solveFlowModel(true, colunaOptimizerForTreeSearch(true))

...

termination: OPTIMAL
primal_status: FEASIBLE_POINT

cost = 160.00000000000003   (expected: 210)
flows:
   f[1] = 8.5  (expected: 8)
   f[2] = 1.5999999999999999  (expected: 2.1)
   f[3] = 1.5999999999999999  (expected: 2.1)

Well, the heuristic returned exactly the same solution as in the purely continuous case, with non-integer flow in f[1]. This is due to that we removed any integer requirement in the heuristic, and the column generation returned no solution.

Consequences

The support for continuous variables needs to be rectified in Coluna. At least the column generation algorithm needs to be modified to remove the assumption of purely integer solutions. It should also be checked whether this assumption exists in other places in Coluna (the check for isinteger exists also other places, e.g. solvelpform.jl, branching and benders.jl).

Maybe one can modify the name from "ip_primal_sol" to "mip_primal_sol" or "feasible_primal_problemsol" or similar to explicitly show that it might not_ be an ip (purely integer) solution.

Moreover one needs to ensure that Coluna does not return suboptimal solutions that are flagged as optimal, labels the problem wrongly as infeasible, or returns solutions where integer variables have non-integer values.

guimarqu commented 3 years ago

Hi,

Thank you very much for this detailed description of the bug.

Of course in the latter case one could simply use the Simplex method, but the model of Coluna should be general enough to handle MIP and purely continuous problems.

I agree it should work for both cases.

(more comments in the code would help).

We'll try to do better.

This is exactly what the column generation algorithm also finds. However, this solution is non-integer. The issue seems to be that in the column generation algorithm a valid solution for the model needs to be completely integer. This oberservation might be also true throughout Coluna, but I am not sure.

                 if isinteger(proj_sol) && isbetter(lp_bound, get_ip_primal_bound(cg_optstate))

If we take a look at the code of is_integer, there is a problem. We ask integrality of continuous variables... https://github.com/atoptima/Coluna.jl/blob/3ef9f738b00584bfb0c527381f7c926c6111930d/src/MathProg/solutions.jl#L21-L28

I think your example doesn't work because of the two lines that should be uncommented.

In the code a "valid problem solution" is called ip_primal_sol or variations of that. This indicates the assumption of a integer solution, but it is unclear, at least to me, whether this assumption applies everywhere in the code.

This assumption indeed applies everywhere. However, we consider a solution as integer if all integer variables have integer values. So when you are solving a LP, all solutions should be considered as integer.

It is unclear to me whether SolveIpForm works on both the generated columns and the model variables or only on the columns.

Only on generated columns and pure master variables (original variables that belong to the master).

Standard tree search with heuristic not enforcing integrality

Yeah, that works (within some floating point precision). The SolveIpForm now can correctly return a linear combination of the two columns. Ok fine, but what if we now introduce integer variables (remember, so far all were continuous).

There is still a problem. In the restricted master heuristic, we should not enforce the integrality of columns from subproblems that contain only continuous variables.

Consequences

The support for continuous variables needs to be rectified in Coluna. At least the column generation algorithm needs to be modified to remove the assumption of purely integer solutions. It should also be checked whether this assumption exists in other places in Coluna (the check for isinteger exists also other places, e.g. solvelpform.jl, branching and benders.jl).

I identified three issues :

Maybe one can modify the name from "ip_primal_sol" to "mip_primal_sol" or "feasible_primal_problemsol" or similar to explicitly show that it might not_ be an ip (purely integer) solution.

We won't change the name but definitely improve docstrings or add comments.

Moreover one needs to ensure that Coluna does not return suboptimal solutions that are flagged as optimal, labels the problem wrongly as infeasible, or returns solutions where integer variables have non-integer values.

How would you do that ?

Can I put your example in the tests to fix the bugs ?

cs-sid commented 3 years ago

Hi,

Thank you very much for the detailed answer. Of course you can put the example into the tests. The other two aspects, integrality on columns for problems with some integer variables and ensuring validity of Coluna are more complex. Let us begin with the more mathematical aspect, the integrality on columns.

Integrality on columns

If I remember, and understood, Dantzig-Wolfe decomposition correctly, it is based on applying the Weyl-Minkowski theorems (or is it Minkowski-Weyl?) to the "easy constraints" of the original problem. In Coluna those "easy constraints" are, according to my understanding, defined by the use of the axis in the constraint name. This results in replacing the original variables y by Weyl_Minkowski where x_q are the extreme points of the polytope defined by the easy constraints, and x_r its extreme rays (let us for simplicity assume that there is only one polytope/subproblem). Solving the decomposed problem with column generation leads to basically the columns representing either extreme points or extreme rays. Thus the optimal solution to the problem can, and I guess often will, contain a convex combination of columns generated by the subproblem, namely the ones representing the extreme points. This will also be the case for IP or MIP problems. In the min-cost-flow example of the issue the optimal solution for the MIP version consists of 0.8 first_column + 0.2 second_column. The resulting solution is fulfilling the integer requirement of f[1]. If I understand your suggestion and the consequences in Coluna correctly, it would mean that as soon as there is an integer variable in the problem, all lambda_q above would be required to be integer. But due to the convex combination this leads to the requirement of exactly one such column being in the solution. But is this not exactly the same situation as in the MIP case before? Which leads to the same suboptimal solution from the SolveIpForm in the primal heuristic as described in the issue? If one wants to avoid the problem of the SolveIpForm giving non-integer values for integer variables (as happened in the last example case in the issue), I think a better way would be to add those integer variables as constraints: Add the formula above for those variables and add that this sum/variable must be an integer value. Of course if the column generation algorithm correctly solves the problem with the optimal solution, than having a suboptimal solution from the primal heuristic might not be a problem (but is probably less efficient). Which leads us to the second aspect of the discussion.

Validity of Coluna

I admit that the statement in the issue of need of ensuring validity of Coluna is a bold one. And I do not think that it is possible to add some "magic code" that always and forever will guarantee it. I think it is more a process of identifying possible sources of incorrectness, and for each decide whether it is acceptible or if there should be some actions in place to either detect or avoid it. Let us look at some examples.

Assuming I understand Coluna correctly enough, than the TreeSearchAlgorithm basically provides an exact Branch & Price & Cut algorithm for the problem. Once such algorithm finishes, it provides either the optimal solution or proves infeasibility of the problem. Wrong results here result from bugs in the code. The code is covered by tests, to reduce the number and severity of bugs. Thus I think for this the possiblity of incorrectness is considered enough and the remaining risk acceptible. However, this statement is only true as long as the algorithm is implemented such that it does not contain knowing dangers of invalid results. This is where the primal heuristic comes in. For me the purpose of the primal heuristic in the TreeSearchAlgorithm is a bit unclear, but I guess it is there to fastly find a feasible solution that can provide bounds further down the tree to enable fast and efficient cutting. Therefore, as also indicated by the name heuristic, it is not expected from the primal heuristic to provide optimal solutions or infeasiblity information. In fact, a solution that is feasible with respect to the actual problem and optimal with respect to the columns available in that node, may and probably will not be optimal for the whole problem. Therefore the "optimal solution" information returned from the heuristic does not have a consequence on the solution for the whole problem. What it tells us is that there exists a feasible solution to the whole problem, and thus provides a bound. If now, for some reason, at the same time the column generation algorithm together with the branching and cutting does not find a feasible solution (as is the case in the issue presented above), then it is clear that there must be an error somewhere. On one hand we have at least one feasible solution from the primal heuristic, but on the other hand our exact algorithm did not find a solution, thus "proving" infeasibility. We have two contradictory results, thus the output from Coluna should neither be infeasibility nor optimal solution found (as in the case of the issue), but that an error/bug in the code was detected, that Coluna can not provide any reliable answer to the problem and that a bug issue should be commited to GitHub with the problem as example.

guimarqu commented 3 years ago

Hi,

Integrality on columns

If I understand your suggestion and the consequences in Coluna correctly, it would mean that as soon as there is an integer variable in the problem, all lambda_q above would be required to be integer. But due to the convex combination this leads to the requirement of exactly one such column being in the solution.

Correct but only in the solution returned by the restricted master heuristic. It does not prevent the column generation from finding a feasible solution that uses combination of columns because we check feasibility of the aggregated solution. It means that we project the master solution to the original formulation. Therefore, if the master solution (even with fractional columns) leads to a feasible solution in the original formulation (where all integer variables are integral), then the master solution will be considered as feasible.

But is this not exactly the same situation as in the MIP case before? Which leads to the same suboptimal solution from the SolveIpForm in the primal heuristic as described in the issue?

It is.

If one wants to avoid the problem of the SolveIpForm giving non-integer values for integer variables (as happened in the last example case in the issue), I think a better way would be to add those integer variables as constraints: Add the formula above for those variables and add that this sum/variable must be an integer value.

Yes that was the suggestion of @rrsadykov.

The easy way to implement this mechanism is to create a new constraint for each original variable in the master formulation but I'm afraid that it may decrease performances. As far as I remember, the addition of a column in the master formulation is a bottleneck so we can't afford doing this operation twice.

The second solution that I though this morning is to generate theses constraints in the restricted master algorithm and adding them only in the subsolver. I think that's the way to go.

Of course if the column generation algorithm correctly solves the problem with the optimal solution, than having a suboptimal solution from the primal heuristic might not be a problem (but is probably less efficient).

Yes this is not a bug anymore more a matter of performance / effectiveness.

Validity of Coluna

For me the purpose of the primal heuristic in the TreeSearchAlgorithm is a bit unclear, but I guess it is there to fastly find a feasible solution that can provide bounds further down the tree to enable fast and efficient cutting.

Yes it is.

We have two contradictory results, thus the output from Coluna should neither be infeasibility nor optimal solution found (as in the case of the issue), but that an error/bug in the code was detected, that Coluna can not provide any reliable answer to the problem and that a bug issue should be commited to GitHub with the problem as example.

Ok, it's a good idea.

guimarqu commented 3 years ago

I reopen beause #551 does not address the whole issue.