Objective value from dual feasible solution mismatch getObjValue() in dual simplex

febattista commented 5 months ago

Hi all,

Here I have a small example (a .cpp with some .mps files) in which I'm using CLP via OsiClpSolverInterface to replicate some unexpected behaviors. Here I'm calling initialSolve() but CLP has been set to always go for the dual simplex. Shouldn't this suggest that no matter the state of CLP, the dual solution should be feasible and the objective value should match?

This example is replicating the following behavior: When iteration limit is reached or the LP is proven primal infeasible, the objective value from si.getObjValue() and the one computed from si.getRowPrices(), si.getReducedCosts() doesn't match;

To run the example I would do ./it_limit_bug lp_it_0.mps

Thanks in advance for the support,

Federico

it_limit_bug.zip

jjhforrest commented 5 months ago

si.getObjValue() gives the primal objective value i.e. using column values and objective. This is what people expect and is more intuitive. It would be possible to add a getDualObjvalue - but then you have.

The other point is more interesting. Although the code asks for the dual algorithm, it does not demand it and Clp makes a decision to swap to primal. If you add si.getModelPtr()->setMoreSpecialOptions(si.getModelPtr()->moreSpecialOptions()|256);

then your example succeeds as that stops the code swapping. However setting that option is not totally safe as you can see if you look at comment in ClpSimplex.hpp. I will change code in master so that 8192 will have same effect at the relevant place in dual code as 8192 has correct meaning.

tkralphs commented 5 months ago

Thanks for the clarification, @johnjforrest. Yes, when calling Clp simplex, we understood that asking for dual simplex is only a "hint," but when a dual-feasible advanced basis is loaded, I'm having trouble seeing why primal would be chosen. The bug came up partly because there is one variant of strong branching in SYMPHONY that calls initialSolve() after loading an advanced basis rather than the usual resolve(). This may be stupid and I cannot see why this was done, but it still seems like it should work. But when there is an iteration limit, it actually doesn't work. So does initialSolve() ignore the advanced basis, even if one has been loaded?

What's a bit confusing is that when pre-solving in branching, SYMPHONY has always called resolve() (with a hint to do dual simplex) after loading an advanced basis and then taken the result of si->getObjValue() to be a valid dual bound. This has never caused an issue. It was only the call to initialSolve() that caused this problem to surface.

If there is an iteration limit set and dual simplex is chosen, so that the final basis is dual feasible (and primal infeasible), I would assume that the objective value would correctly correspond to the dual objective value and be a valid dual bound? Is thee any way to verify whether dual was actually used after the fact?

I assume that because SYMPHONY always loads a dual feasible basis and typically calls resolve(), Clp chooses dual, as expected, and everything has "just worked" up until now. Does this seem right?

jjhforrest commented 5 months ago

Part of the problem comes from Osi. When using Clp directly you are much more likely just to call dual or primal. Osi thinks of initialSolve as being done once and it gets worried when the problem looks infeasible and so tries harder to try and get feasible. The modification in master should stop that happening. I can not add getDualObjvalue or getPrimalObjValue to Osi, but I can add them to OsiClp if you want.

tkralphs commented 5 months ago

Yes, dropping into the Clp model class and just directly calling dual is an option for forcing it, but that didn't seem to fix anything. I still don't understand what to expect when Clp terminates on iteration limit. If it is doing dual simplex, I expect to get a feasible dual solution and for si->getObjValue() to return the value of that dual solution.

I know you said that si->getObjValue() returns a primal value, but if it is doing dual simplex, then the value of the infeasible primal solution associated with the dual feasible basis is the same as the value of the dual solution itself, right? And it should be a valid dual bound. The value of the primal and dual solutions coming from a given basis should be the same.

The attached driver shows an example using Clp master, where we are start solving with initial solve, then change bounds on a variable and re-solve with an iteration limit, then change bounds again and re-solve one more time. In that third solve (after the output Solving the second child LP...), the dual solution returned seems to be infeasible and its value does not match what Clp says is the objective value at termination. It is clear we are doing dual simplex because the objective value is increasing in each iteration and all values are valid dual bounds. Here is part of the output:

Clp0102I 37 -1597 In: C59 Out: C34 dj ratio 1 distance 1
Clp0105I Pivot row 229
Clp0101I dirOut 1, dirIn 1, theta 7, out 1, dj 7, alpha -1
Clp0102I 38 -1590 In: C1 Out: C327 dj ratio 1 distance 7
Clp0105I Pivot row 204
Clp0101I dirOut 1, dirIn 1, theta 21, out 2.220446e-16, dj 21, alpha -1
Clp0102I 39 -1576 In: C135 Out: C302 dj ratio 2.220446e-16 distance 21
Clp0105I Pivot row 31
Clp0101I dirOut 1, dirIn 1, theta 1e-18, out 1.8, dj 0, alpha -0.5
Clp0102I 40 -1576 In: C309 Out: C129 dj ratio 1.8 distance 1e-18
Clp0003I Stopped - objective value -1576
CLP: isIterationLimitReached
Dual objective: -1685.4844913151
 getObjValue(): -1576.0000000000
WARNING: Values mismatches!
WARNING: Dual infeasible!

I think we must be missing something. What is it?

example.zip

jjhforrest commented 5 months ago

Will look into it. One thing that came to mind is to think more about using some values of Clp specialOptions_ . Years ago these went in 0x01000000 is Cbc (and in branch and bound) 0x02000000 is in a different branch and bound These should be used to influence behaviour. The Cbc one is used to some extent, but could be used more. The other is not used at present.

jjhforrest commented 5 months ago

By default Clp is not very interested in solution on max iterations. For strong branching it needs objective which it can get easily without bothering about scaling etc. Also using OsiClp the code does a quick reduction of problem as often some variables have been fixed - this makes the code slightly faster. This also makes it more difficult to get good values in some situations. So to be safe - switch this off int clpOptions = si.specialOptions(); si.setSpecialOptions((clpOptions|2048)); // no crunch

I have added a new option so that the user can ask for more to be done on max iterations // 0x08000000 says get good everything on maxits si.getModelPtr()->setSpecialOptions((options|32|0x08000000));

tkralphs commented 5 months ago

Ah yes, we were suspecting something like this, thanks for the fix! We are testing now.

febattista commented 4 months ago

Thanks for your precious help John. Now everything looks fine with dual solutions at iteration limit and CLP seems to go for dual simplex when he's been hinted so. However, during my tests I found out a case where CLP seems to correctly start from dual, but then at some point it switches to primal. Here's the rather long (sorry) output from the driver attached in this answer.

Clp0022I Absolute values of scaled rhs range from 5012 to 6509.125, minimum gap 1e+100
Clp0020I Absolute values of scaled objective range from 19 to 986
Clp0021I Absolute values of scaled bounds range from 1 to 1, minimum gap 1
Clp0060I Primal error 0, dual error 0
end getsolution algorithm -1 status -1 npinf 1 sum,relaxed 1904,1904 ndinf 0 sum,relaxed 0,0
Clp0060I Primal error 0, dual error 0
end getsolution algorithm -1 status -1 npinf 1 sum,relaxed 1904,1904 ndinf 0 sum,relaxed 0,0
Clp0006I 0  Obj -8321 Primal inf 1904 (1)
Clp0105I Pivot row 2
Clp0101I dirOut -1, dirIn -1, theta 0.29875986, out -0.44532131, dj -265, alpha -887
Clp0102I 1 -8139.9899 In: C1 Out: R2 dj ratio -0.44532131 distance 0.29875986
Clp0105I Pivot row 0
Clp0101I dirOut -1, dirIn 1, theta 0.3878123, out 0.97883941, dj 139.1398, alpha 358.78129
Clp0102I 2 -8003.7943 In: C12 Out: R0 dj ratio 0.97883941 distance 0.3878123
Clp0105I Pivot row 2
Clp0101I dirOut 1, dirIn -1, theta 721.57915, out -0.17652219, dj -497.85127, alpha 0.68994686
Clp0102I 3 -7915.9125 In: C3 Out: C1 dj ratio -0.17652219 distance 721.57915
Clp0105I Pivot row 0
Clp0060I Primal error 0, dual error 2.2737368e-13
end getsolution algorithm -1 status -3 npinf 1 sum,relaxed 0.000152711,0.000152711 ndinf 0 sum,relaxed 0,0
Clp0105I Pivot row 2
Clp0103I Primal nonlinear change -1e+10 (1)
Clp0060I Primal error 0, dual error 0
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
increasing weight to 5e+10
Clp0103I Primal nonlinear change 0 (1)
Clp0060I Primal error 0, dual error 3.8146973e-06
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
Clp0103I Primal nonlinear change 0 (1)
Clp0060I Primal error 0, dual error 3.8146973e-06
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
Clp0103I Primal nonlinear change -1e+30 (1)
Clp0060I Primal error 0, dual error 686
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
Clp0103I Primal nonlinear change 0 (1)
Clp0060I Primal error 0, dual error 2.2737368e-13
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
Clp0103I Primal nonlinear change 0 (1)
Clp0060I Primal error 0, dual error 2.2737368e-13
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
Clp0001I Primal infeasible - objective value -7915.9125
Clp1001I Initial range of elements is 1 to 959
Clp1002I Range of elements is 0.033407655 to 29.933259
Clp1002I Range of elements is 0.1900374 to 5.2621221
Clp1003I Final range of elements is 0.15658566 to 6.3862807
Clp0022I Absolute values of scaled rhs range from 5.59375 to 6.8301417, minimum gap 1e+100
Clp0020I Absolute values of scaled objective range from 36 to 22159.739
Clp0021I Absolute values of scaled bounds range from 0.034778672 to 1, minimum gap 0.034778672
Clp0060I Primal error 8.8817842e-16, dual error 1.8189894e-12
end getsolution algorithm -1 status -1 npinf 1 sum,relaxed 1.54668e-05,1.54668e-05 ndinf 0 sum,relaxed 0,0
Clp0060I Primal error 8.8817842e-16, dual error 1.8189894e-12
end getsolution algorithm -1 status -1 npinf 1 sum,relaxed 1.54668e-05,1.54668e-05 ndinf 0 sum,relaxed 0,0
Clp0006I 0  Obj -7915.9125 Primal inf 1.5466799e-05 (1)
Clp0105I Pivot row 2
Clp0103I Primal nonlinear change -1.0712836e+09 (1)
Clp0060I Primal error 8.8817842e-16, dual error 1.9073486e-06
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
increasing weight to 5e+10
Clp0103I Primal nonlinear change 0 (1)
Clp0060I Primal error 8.8817842e-16, dual error 7.6293945e-06
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
Clp0103I Primal nonlinear change 0 (1)
Clp0060I Primal error 8.8817842e-16, dual error 7.6293945e-06
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
Clp0103I Primal nonlinear change -1.0712836e+29 (1)
Clp0060I Primal error 8.8817842e-16, dual error 1.4073749e+14
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
Clp0103I Primal nonlinear change 0 (1)
Clp0060I Primal error 8.8817842e-16, dual error 1.8189894e-12
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
Clp0103I Primal nonlinear change 0 (1)
Clp0060I Primal error 8.8817842e-16, dual error 1.8189894e-12
end getsolution algorithm 1 status -3 npinf 0 sum,relaxed 0,0 ndinf 0 sum,relaxed 0,0
Clp0001I Primal infeasible - objective value -7915.9125

By focusing on the lines starting with end getsolution algorithm 1/-1, we can clearly see that CLP starts from dual (-1), then at some point switches to primal (1). This LP is infeasible and we are interested to get the dual ray that proves the dual unboundedness. After CLP terminates, by calling si.getDualRays(1, false) we get a nonempty ray for which the Farkas proof I usually use in my implementation fails (i.e. according to it, the LP should be feasible). By compiling CLP with -DPRINT_RAY_METHOD we get:

Infeasibility ray obtained by algorithm primal - direction out -1

In my tests, in most of the cases, the ray is obtained by the dual algorithm and it seems that the Farkas proof works whenever the ray is obtained this way. In principle, we are fine with CLP switching to primal in this specific case, however we need a ray generated by the dual algorithm as this seems what fit our purposes. Is there a way to fix this?

To run the driver, run farkas_proof farkas_proof.mps

driver.zip

jjhforrest commented 4 months ago

Seems to be bug in ClpSimplexPrimal - hopefully fixed now in master.

febattista commented 4 months ago

Thanks for the update John.

Unfortunately, it seems that it does not fix the issue. The ray returned have components with high magnitude (order of 1e+195) which looks quite strange. Is it supposed to be so?

Sorry for the inconvenience and thanks again for your time

febattista commented 4 months ago

Dear John,

The last update you made on Clp on master regarding this issue made the ray vector to have "nice" components (only ones and zeros) but the Farkas proof on that ray fails in my SYMPHONY code. Unfortunately I'm not able to replicate this behavior in a driver. Would you be down to have a look at the issue in SYMPHONY?

Here's how you can replicate the issue:

Do git clone https://github.com/febattista/SYMPHONY.git -b rvf
Do coinbrew build SYMPHONY -b build-bug -p --tests none --enable-debug --enable-sensitivity-analysis ADD_CFLAGS=-DCHECK_FARKAS
cd /build-bug/SYMPHONY/rvf/Example and run make

The command to run the example is ./test_rvf -F <full/path/to>/SYMPHONY/Datasets/KP_p-3_n-10_ins-5.mps I suggest to set a breakpoint at SYMPHONY/src/LP/lp_solver.c@3904 and look at variable ray(the ray returned by Clp) which is a ray of length lp_data->maxn == 3. You should see that ray = {1, 0, 0} and the Farkas proof fails (i.e. ray_times_b <= 0. Then you can set a breakpoint at lp_solver.c@4002 and look at ray to see the result of an auxiliary LP that is used to compute the "correct" (i.e. Farkas proof applies) ray which seems to be ray = {0, 1, 0}.

coin-or / Clp

Objective value from dual feasible solution mismatch getObjValue() in dual simplex #291