scipopt / papilo

Parallel Presolve for Integer and Linear Optimization
GNU Lesser General Public License v3.0
64 stars 17 forks source link

Correct way to handle dual solution when using from command line #50

Closed sstroemer closed 5 months ago

sstroemer commented 5 months ago

What is the correct way to account for an available dual solution when manually solving the presolved instance? I'm writing a .mps file, calling presolve, and then solve the resulting problem (using JuMP -> HiGHS). I then write a .sol file with the extracted (primal) values of all variables

x      0.3
...
z      -1.2

and call postsolve on that, which gives me

=obj=          1234.00
x              ...              obj(0.0)
...

I assume the obj(...) is "only" the objective coefficient (based on an information from SCIP: (obj: <objective coefficient of variable>)). I've provided parameters using -p lp_presolvers_with_basis.set.

sstroemer commented 5 months ago

Ok, I've figured out that I should be able to pass those using

... --dual-reduced-solution tmp/mymodel.dual.sol --costs-reduced-solution tmp/mymodel.reducedcost.sol

but I'm unsure about the file format (because one similar to the primal solution does not seem to work, resulting in "Postsolve does not contain information about dual solution. Only original primal solution reconstructed.").


Note: For anyone stumbling on this, the file format seems to be the same for "dual" and "reducedcost", the error seems to come from presolve.

sstroemer commented 5 months ago

Narrowed it down to something weird going on when calling presolve ... If I do

presolve -f test.mps -v test.postsolve -r test_presolved.mps --parameter-settings papilo.set

then the postsolve fails.

Instead doing

solve -f test.mps -v test.postsolve -r test_presolved.mps --parameter-settings papilo.set

then solving the reduced .mps, leads to postsolve working. It seems like the presolve is somehow not resulting in a proper setting of PostsolveType::kFull?


To prevent the overhead of running SoPlex, passing -x soplex.set containing just real:timelimit = 0 can be used to immediately abort after the presolve, which essentially means that one can use solve to only run the presolve step.

alexhoen commented 5 months ago

Thanks for your analysis.

It makes sense to allow storing presolve-reductions even if PaPILO is built without any solver. I fixed this, so now you should be able to get the desired behavior with the presolve command. (may require some time till the mirroring the repository is triggered.)

To make it a little bit more verbose I added a small sentence to see if dual-postsolve is activated during presolving:

starting presolve of problem ../../check/instances/MIP/flugpl.mps with dual-postsolve de-activated

If dual-postsolve is activated the "de-" is missing.

For the format of the solution file is the "standard" one used in the MIP-Community:

It is just two columns with variable name and value.

If you have any further questions don't hesitate to ask

sstroemer commented 5 months ago

Hi Alex, thanks for the (amazingly fast) fix! I'll try that as soon as it lands in the public repo (and I've managed to build it - have been using SCIP_PaPILO_jll until now, which is behind, currently df30ae49, anyways).


One note regarding "For the format of the solution file is the "standard" one used in the MIP-Community":

As written before I got it to work by guessing (based on the README and the information I found from SCIP), but I - just my personal opinion though! - would strongly disagree with .sol being "standard" in any actually "usable" way:

Note: Yes of course, I know .sol being really wide-spread, especially in benchmarks and the like (that was the only reason I managed to properly guess everything)! Didn't intend that my reply came across as some sort of critique, more as in a note that from a non-OR dev's perspective it may feel under-specified.


Maybe as an idea to make it easier for "others" in the future, one may insert a small description of the format:

## `.sol` file format

We make use of the `.sol` file format (as supported by solvers like [SCIP](https://www.scipopt.org/doc/html/reader__sol_8h.php), or [Gurobi](https://www.gurobi.com/documentation/current/refman/sol_format.html), and used for example in [MIPLIB](https://miplib.zib.de/)). For the use with `PaPILO` it:
- is a tab-separated file, (basically a TSV with the extension `.sol`), containing up to three columns
- contains information about a solution (obtained from some solver) but not the original problem
- makes use of the same variable and constraint names as an initially supplied `.mps` file
- is not bound to a specific number type (`obj(0)` is treated and understood in the same way as `obj(0.00000000000000)`)
- can contain primal or dual solution values, but not in the same file

Each row can either be an "auxiliary" information (e.g., the objective value) or give the solution for a specific variable/constraint.

### Row types
The following types of rows are allowed:
- starting with `#`: indicates a comment line and will be ignored (in-line comments are not allowed)
- containing `=obj=` in the first column, indicates that the problems objective value is given in the second column
- containing the name of a variable or constraint in the first column, indicating that a respective solution value is given in the second column, with the third column optionally - for variables - containing `obj(...)` with `...` being any integer or floating-point number (as documented by [SCIP](https://www.scipopt.org/doc/html/reader__sol_8h.php), the latter contains the  _"objective coefficient of variable"_

### Examples
The following gives a few short examples to better visualize the file format.

#### Primal solution
This may contain a primal solution that you may pass to `PaPILO`'s `postsolve` command:
```txt
x    0.0
y    0.12508
z    -8815

Note that you should provide a proper primal result for all variables contained in the reduced .mps that PaPILO generated. Giving an objective value is not mandatory, PaPILO will derive that automatically, but you may want to check out the --reference-objective (-o) flag to pass an objective value obtained from your solver as reference for validation (see --help for more information).

Output from PaPILO: After running postsolve (or if you are using solve in the first place), you will get a similar .sol file from PaPILO, with the addition that it will always contain =obj= before any variable values, and that it applies some column alignment to make the file more readable, which extends the "tab separation" into an undefined amount of spaces used to align columns, where reasonable.

Dual solution

Important: To allow PaPILO to consider a dual solution in postsolve, you need to properly configure it during the presolve step (otherwise it may apply presolve-algorithms that are - currently - not "reversable" during postsolve). Consult lp_presolvers_with_basis.set (or ...without_basis.set and pass a proper set of parameters using the --parameter-settings (-p) flag of presolve. You should see a information log containing ... with dual-postsolve activated when running PaPILO. If it states ... with dual-postsolve de-activated your configuration is not properly set.

Dual solutions can be passed in a similar fashion, but are split up into results related to variables (sometimes also called "reduced costs") and to constraints (sometimes also called "shadow prices"). For a full handling of all available dual results, you therefore need to pass two files, with the same format, using the parameters --dual-reduced-solution (constraints) and --costs-reduced-solution (variables).

For example, dual results linked to constraints of the original problem (the values of the associated dual variables) may then look as .sol file like:

constr_1    -0.0
constr_sum_xy    10.0

Note: This is based on the convention that "reduced costs" are the values of the dual variables associated with the bounds of a variable (primal variables themselves do not have a "dual solution"). If a variable has an upper and lower bound, this refers to (if existing) the bound-constraint that is actually binding. If none is binding, the reduced cost is zero.

Note that PaPILO will include the leading =obj= row in both resulting dual solution .sol files ("dual" and "cost").

sstroemer commented 5 months ago

Note regarding my previous comment:

Thanks for the support!

(PS: Small hi-jacking of this issue, what would be the best way to read up on the problems related to sparsify/substitution/componentdetection and dual-postsolve?)

alexhoen commented 5 months ago

You might be right. If you working with a certain format it becomes "standard" within time.

Awesome that you have written down the specification. Do you want to open a merge request in the README? I would then revise it and then merge it!

alexhoen commented 5 months ago

I revised the your log slightly and added also a section for the basis format, which should help also.

Thanks a lot. If you do not have further remarks I will close the tickets within the next days

sstroemer commented 5 months ago

Perfect - and sorry that I didn't have the time to do a proper PR until now (long weekend just coming up)!