Closed malb closed 13 years ago
Attachment: mip_interface_changes.patch.gz
The attached patch makes the necessary changes to the GenericBackend
and the GLPK
class. Coin
and CPLEX
are not done yet.
And here are the modifications to CPLEX and Coin. The doctests pass, plus all the methods of Graph/Digraph/GenericGraph pass with both GLPK, Coin and CPLEX.
I agreed with you patch almost everywhere ! The only modification I made is the "constraints" argument you had placed in the add_constraint method. I really didn't get why you picked this name. I made it "coefficients", as the pairs are actually the coefficient of the sparse matrix, but of course we can discuss it during the review :-)
Nathann
Attachment: trac_10431-part2.patch.gz
Changed author from Martin Albrecht to Martin Albrecht, Nathann Cohen
Description changed:
---
+++
@@ -5,3 +5,8 @@
* change :func:`add_linear_constraint` to accept any iterable of the form`` [(c,v) ...]``
* ``min`` and ``max`` should be lower bound (or ``lb``) and upper bound (or ``ub``) to conform to MIP conventions
* allow parameters in :func:`add_variable`
+
+The patches are to be applied in this order:
+
+* mip_interface_changes.patch
+* trac_10431-part2.patch
Sure, your naming makes much more sense. Are you reviewing mine first, or is your patch your review? I can review your's then?
I noticed two more interface "issues" we may decide to "fix":
p.row()
returns two lists (indices,coeffs)
right now, we could change it to return [(v1,c1),...,(vn,cn)]
.obj
parameter which contains the coefficient of the new variable in the objective function?Neither of those changes I view as mandatory, just throwing the idea out there to see what others think.
Sure, your naming makes much more sense. Are you reviewing mine first, or is your patch your review?
Well, I checked yours while working on it and fixed this "constraints->coefficients" inside of my patch ... If you review mine, both files will have be read by two sets of eyes, so I guess it would count as a valid review :-)
Nathann
Replying to @malb:
I noticed two more interface "issues" we may decide to "fix":
p.row()
returns two lists(indices,coeffs)
right now, we could change it to return[(v1,c1),...,(vn,cn)]
.- We could allow an optional
obj
parameter which contains the coefficient of the new variable in the objective function?Neither of those changes I view as mandatory, just throwing the idea out there to see what others think.
Though they sound right... :-)
Nathann
One more thing: you assume in your interface that one can change column and row names after those have been created. However, this isn't easy in SCIP (I could possibly hack around it by hacking around the API). Can we change that, e.g. by adding name as an optional parameter to add_variable() and add_linear_constraint()?
Well, if that's how SCIP's interface is written, I see no way around that .... :-/
I just hope all these optional parameters won't hurt the interface's efficiency. I am eager to have a working stable version of it in Sage, that would let me rebase other patches that have been made invalid because of the work we are doing here. I may spend some time trying to do some performance analysis on the interface afterwards :-D
I'm already sick of these names... They are totally useless for all practical purposes, their only aim is to produce different results in p.show() (which doesn't even use them for the moment) or when writing the LP to files, which is totally orthogonal to actually solving them. Besides, each solver has a different way of managing them, especially Coin. I spent nights just fighting with it, for nothing in the end (oh yes.. it compiles, and there was no regression since the first version of LP.. Who the hell cares ?)
Just to know : are you yourself using names for variables and constraints ?
Nathann
P.S. : Oh, yes.. Let's add this optional argument if there is no way around. If at some point we find a trick, it won't be as hard to remove an optional argument from an interface as any other, as most of the method calls don't include it.
So in the end
I usually set them (e.g. when converting from polynomial systems), but I guess I'm not really using them usually.
So what are they useful for in SCIP ? Can it produce outputs at .lp or .mpw files ?
Nathann
Yep, and other formats. You can also query a variable (which is a C struct) for its name in your code. For example, you can search for variables by name.
Replying to @malb:
Yep, and other formats.
Nice !!
You can also query a variable (which is a C struct) for its name in your code. For example, you can search for variables by name.
Same in GLPK. For the moment, I do not do any work on the matrix once it has been defined. Or rather, I only add more information, but never remove/read anything from it O_o
Nathann
Okay, what remains to be done then:
Anything I missed?
Oh... Hem.. Actually, if we are to actually drop these methods... -_-
I can not stand these names... Here is the problem : right now, there are no names when you add variables or constraints in MixedIntegerLinearProgram. When p.show() is called, the names are filled with something which makes them at least distinct and recognizable... But if we want to drop these methods, then p.show() has to be rewritten too.. -_-
Nathann
I would suggest an optional parameter "name=None" for add_variable()?
Replying to @malb:
I would suggest an optional parameter "name=None" for add_variable()?
If you need it for SCIP... :-/
Attachment: trac_10431-part3.patch.gz
Description changed:
---
+++
@@ -10,3 +10,4 @@
* mip_interface_changes.patch
* trac_10431-part2.patch
+* trac_10431-part3.patch
I did the changes below for the GenericBackend and GLPK. Nathann, can you do Coin and CPLEX? I'll update my SCIP interface to see whether now I can pass all doctests with it?
Hello !!!
Martin, the point is that I do not know how to get rid of setting names through col_name and row_name for the moment ! It would mean show() would have to be rewritten, and I do not know how to do it nicely ! :-/
If you have any idea...
Nathann
My best bet would be to keep the name argument in row/col_name, and have SCIP ignore it when it is given or raise an exception. It is already the case for Coin which does not accept names for everything. We could then define a name when the variable/constraint is added (which would work for SCIP), or later on (which wouldn't for SCIP)..
Then again, dropping everything name-related may be the best idea in the end. It's totally useless, it adds arguments/methods everywhere, and no two solvers handle it the same way. What do you think of it ?
Nathann
Did you check my patch? For some reason p.show() still works, or is the doctest not enough to ensure this?
Replying to @malb:
Did you check my patch? For some reason p.show() still works, or is the doctest not enough to ensure this?
Well, if your doctests pass then I guess the docstring is not enough. Anyway, for the moment the variable's names are just "cached" inside of MixedIntegerLinearProgram, because I thought stupid to call routines to updates the names unless someone wants to show() it or to export it to a file (which one never does when actually trying to solve them). There is a method named update_variable_name, which you commented in your patch, which does just that : before print show() or exporting the result, it updates the variable's names. This is the only way for the variables to have a name, and as it is usually done after the LP has been generated, it has to use the set_col_name commands. I remember having written somewhere that it was broken at some point, but I am never too eager to deal with this name mess.
Oh, of course everything works fine if you comment #update_variable_names. The .show() method will just print x_1, x_2, etc and the .lp and .mps files will have the same kind of names. I do not think it is that awful, and removing all this name-related stuff from the class will just make it easier to work with.
I mean, it would have been nice to be able to deal with names, but :
It looks like a lot of work for something we're not even sure we can use...
Nathann
So what about:
Replying to @malb:
So what about:
- If the user wants to use names then (s)he has to provide them at creation time of variables or constraints.
- Solvers which don't support them, simply ignore them.
- In show, if there is a custom name set (queried by backend.col_name() and backend.row_name()) then it is used for printing
- When writing lp we do the same?
.....
Sounds logic enough :-p
It just exposes I had a stupid idea when I wrote this set_row_name in the first place. And I guess it's only fitting that I clean my own mess. Please comment the code from mip.py you need to make your patch run and pass doctests, and I'll change what is needed there when dealing with Coin and CPLEX !
Nathann
part3 passes doctests for me. However, so far it simply ignores names.
PS: Can you give me some examples which take about 1 minute to compute with GLPK from the graph module? I want to test the performance of the SCIP solver for those.
Oh ! To compare performances, you could just try a sage -t on graph.py digraph.py and generic_graph.py.
I guess the performances would then depend on the kind of problem : all the solvers I tried until now are just awful for the minor method :
sage: %time graphs.Grid2dGraph(5,5).minor(graphs.CompleteGraph(4), solver = "GLPK")
CPU times: user 10.59 s, sys: 0.01 s, total: 10.60 s
Wall time: 10.62 s
{0: [(1, 3), (1, 2), (3, 3), (4, 4), (1, 4), (2, 3), (3, 4)], 1: [(2, 1), (2, 2), (3, 2), (4, 2), (2, 0)], 2: [(0, 0), (1, 0), (0, 1), (0, 2)], 3: [(1, 1)]}
The methods raises an exception when 4 becomes 5, which is natural as the problem has no solution, but it takes one lifetime to get this answer from GLPK. If SCIP can get it quickly, that's just a miracle :-D
Another problem of the same kind :
n = 7
graphs.Grid2dGraph(n,n).disjoint_routed_paths([((0,0), (n-1,n-1)), ((0, n-1), (n-1, 0))])
This already takes 10 seconds with CPLEX. Try n=8 if you are patient !
The answer is also an infeasibility, and the solvers are having a hard time proving it.. But I guess it just comes from my bad LP formulation ^^;
A different type of LP formulation, on random graphs :
from sage.graphs.graph_coloring import edge_coloring
%timeit edge_coloring(graphs.RandomGNP(30,.3), solver = "GLPK", verbose = 2)
5 loops, best of 3: 4.47 s per loop
Nathann
On my machine:
sage: %time graphs.Grid2dGraph(5,5).minor(graphs.CompleteGraph(4), solver = "GLPK", verbose=0)
CPU times: user 2.81 s, sys: 0.00 s, total: 2.81 s
Wall time: 2.82 s
{0: [(1, 3), (1, 2), (1, 1), (2, 3), (2, 4)], 1: [(3, 0), (1, 0), (3, 1), (2, 0)], 2: [(2, 1)], 3: [(4, 4), (2, 2), (3, 2), (4, 2), (4, 3), (3, 4)]}
It seems there's still a bug in my SCIP interface:
sage: %time graphs.Grid2dGraph(5,5).minor(graphs.CompleteGraph(4), solver = "SCIP", verbose=0)
CPU times: user 1.88 s, sys: 0.04 s, total: 1.92 s
Wall time: 1.94 s
{0: [(0, 0), (0, 1)], 1: [(1, 1)], 2: [(2, 1), (2, 2), (1, 0), (2, 0)], 3: [(1, 2), (0, 2)]}
Replying to @malb:
It seems there's still a bug in my SCIP interface:
Why so ? The answer is not unique ! :-)
Nathann
That's a critical piece of information I missed :)
Plus CPLEX, plus Coin ! :-)
Nathann
Description changed:
---
+++
@@ -11,3 +11,4 @@
* mip_interface_changes.patch
* trac_10431-part2.patch
* trac_10431-part3.patch
+* trac_10431-part4.patch
The patch looks good, I'm running long doctests on sage.math now. Can I assume you ran doctests for CPLEX and Coin (i.e. with the optional packages installed)? I think the only remaining issue is:
sage: p = MixedIntegerLinearProgram()
sage: p.new_variable(name="foo")
MIPVariable of dimension 1.
sage: x = _
sage: p.add_constraint(0 <= x[1] <= 2)
sage: p.show()
Maximization:
Constraints:
x_0 <= 2.0
Variables:
x_0 is a real variable (min=0.0, max=+oo)
For some reason MIPVariable ignores the optional name parameter and hardcodes "x". Is there a reason for this?
Is there a reason for this?
Not really.
Technically, you will find this 'x' inside of the __repr__
method in LinearFunction, which does not care at all whether the variable has a name or not (it does not even know the MIPVariable object itself, just its id in the LP, and the MIPVariable itself does not know its name (only the LP does). All of is has to be rewritten, it is ugly and inefficient code, but would like to find a way to deal with this that does not make every LP waste time on totally useless name-related operations.
I still think we should just remove anything in these classes which is name-related for the same eternal reasons. The code is ugly, hard to understand, it slows the class down, and it is totally useless for solving LP anyway :-/
Nathann
Oh, and I of course checked all the patches on CPLEX and Coin each time. All tests do not pass, because of old bugs :
In the end, the remaining bugs (and they have been there for quite a while, but I did not really worry....) are all about displaying stuff. One is about numerical noise on a very specific case (Coin + Flows. One also has to know that by default Sage uses a Cython implementation of the Ford-Fulkerson algorithm to compute a flow, and not the LP unless asked otherwise -- I made the docstring of Graph.flow mention this noise).
Short of this, all tests pass in graph.py, digraph.py and generic_graph.py. So all the methods there do not see the difference between GLPK, Coin or CPLEX (except for the time spent testing the docstrings) ;-)
Would it be possible to address them in a future ticket ? And could I please have your thoughts about removing all these names from the class ? :-)
Nathann
Would it be possible to address them in a future ticket ?
Agreed.
And could I please have your thoughts about removing all these names from the class ? :-)
I'm unsure about this. But it is certainly undesirable to allow "name" as a parameter in new_variable() and then to ignore it. My main motivation for keeping names around would be to make debugging easier: I can check whether my mixed integer program looks like I want it to look without getting confused about variable name matching.
I'm unsure about this. But it is certainly undesirable to allow "name" as a parameter in new_variable() and then to ignore it. My main motivation for keeping names around would be to make debugging easier: I can check whether my mixed integer program looks like I want it to look without getting confused about variable name matching.
It can be very useful like that for sure.... But I really can not stand that the performances of a solver could somehow suffer because of something which is not used 99% of the time : when the LP are known to work. I really hope this could be fixed soon :-/
Nathann
How does performance suffer exactly? Names aren't used except for showing and writing lp files?
names aren't used except in these cases, but when the mip.py file will have been rewritten any creation of variable will mean storing+concatenating names, any call to add_variables/constraints already creates variables that are to receive potential names. I still do not know the loss between calls of cdef methods and cpdef, and one of the reasons why we have so many optional arguments is to deal with names. And so much in the backends is about names (I can not stand to look at code when I know it is never used. I immediately want to remove it)
I mean, all the LP interface does is create constraints and variables. When you add name considerations to all of these things, you just add time to earn nothing. I will have fun profiling all this when it will be in to check what we could earn by removing them or writing it more smartly. Perhaps I'm just complaining for nothing as usual... Even though cutting the code by half seems tempting just for its sake :-D
Nathann
I don't care much about writing LP files but printing of MIPVariables (p.show()) should allow custom names. I'm not saying it should be supported in the backends but the frontend should. It's just the level of convenience users will expect.
Here is another patch to fix the .show method and the names in general. Even the .lp files exported through Sage bear the good names !
I also fixed the CPLEX problem with infinite bounds, so that all test pass with CPLEX too !
Nathann
You accidentally hardcoded GLPK it seems in generic_backend.pyx?
solver = "GLPK"
Gloops.... Thank you for spotting it, this would have been nasty ^^;
I use that line to test LP with GLPK even though I have CPLEX installed....
Patch updated !
Nathann
Do doctests sill pass with CPLEX?
Of course of course !! I always do a last check with GLPK when I am done with the patch, as it is the "most important" solver for Sage :-)
Nathann
Reviewer: Martin Albrecht, Nathann Cohen
Sage 4.6.1 will contain a new set of backend classes for mixed integer programming, which will make it easier to write interfaces for other solvers. There has been some off-list discussion about this interface and the follow changes were agreed upon:
add_linear_constraint
should allowlb
andub
instead of direction and one bound, it's more expressive.add_variable
should return the index of the newly created variable instead of the next index.add_linear_constraint
to accept any iterable of the form[(c,v) ...]
min
andmax
should be lower bound (orlb
) and upper bound (orub
) to conform to MIP conventionsadd_variable
The patches are to be applied in this order:
CC: @nathanncohen
Component: linear programming
Author: Martin Albrecht, Nathann Cohen
Reviewer: Martin Albrecht, Nathann Cohen
Merged: sage-4.6.2.alpha3
Issue created by migration from https://trac.sagemath.org/ticket/10341