Closed ajwagen closed 7 years ago
A simple and flat (PEP 20) approach is to pass the solver as a keyword argument to the function called. Setting globals is not advisable. Also, I do not know what could happen if one runs two threads with separate solver settings. With the keyword argument approach there is no danger in that case.
The only “headache” is that the solver settings have to descend throughout the call graph. However, using global variables to establish a solution environment is not an alternative. The polytope
package needs a redesign (and re-implementation at some places), so this wouldn't be so big a problem. A redesign (but not re-implementation) has occurred on branch overhaul
, but is unmerged because it was quickly applied and adversely affected performance, so it needs tracing when that happened in history.
Regarding passing around a solver environment, I recommend creating a data structure (using a dict
is the simplest solution, in absence of arguments for something more) of solver settings (which solver, tolerance, etc. in the future). This is trivially extensible for future adaptation, without the need to add keyword arguments to all functions every time some variation is needed. It would replace (and contain) the argument abs_tol
that currently propagates throughout the call graph.
A side comment is that global
should always be avoided. In the fragment above, it appears that from foo import hehe as _hehe
and settings['lp_solver'] = 'glpk'
would result in the same functionality (though I need to consider this in more detail in context within polytope.polytope
).
The difficulty in these cases arises because a choice of solver at a global scope constitutes module state. This is implicit. However, if the solver is abstracted by a function within the module, then this state is restricted to the function. Fans of object orientation might want to create a class in this case, but that would only nest things and not simplify anything.
The explicit approach is to pass solver options as described above and have them propagate through the call graph. This happens everywhere in polytope
though, it can be impractical and reduces readability of the entire module.
So, a configuration function as the OP proposes appears to be a good abstraction that offers a compromise between practicality and explicitness (not necessarily with the function body proposed in the OP). The closest thing that comes to mind is matplotlib.use
. Given that this approach is used in matplotlib
, it suggests that it has been found practical. This settles whether a function should be used.
Sidenote: there do exist objects passed around anyway: the polytopes. So, the state "solver choice" could be stored in them. However, then we can have issues with mismatch of solver choices in two different polytopes passed as arguments to the same function, so this approach would create more things for the user to worry about.
There is something that bothers me, and is orthogonal. Implicitly, we change solver from scipy
to cvxopt.glpk
if the latter is found. This does not fix a default, and I fear it will cause problems:
tulip
, there is a fixed default for the solver (omega
) that is always available, and changes should happen explicitly.dd
, you choose which module to import, based on what you know you have installed on your machine. The modules have the same API.tulip.interfaces.omega
initially I wrote it so that it tries to use dd.cudd
, and if that is absent, it uses dd.bdd
. Later, errors went unnoticed because of this implicit switch (I think they escaped testing on my machine, and were uncovered on Travis CI), so I fixed the default to dd.bdd
, and offered dd.cudd
as an option through a function argument.matplotlib
, there is a default backend (from the errors that I see I think that it usually is Qt, unless changed in the user configuration file).cvxopt
makes this choice explicit, by the module that you choose to import.Based on these, I think that it is time to fix scipy
as the default backend, and offer a mechanism as above for explicitly changing the configuration. This will affect tulip
.
The simplest abstraction would probably be to confine the option setting to be shallow. What I mean is this: if you want to change the backend of matplotlib
, you do not import package some.nice.plotting.this.is.a.module.nested.really.deep
and select the backend there. You import matplotlib
solely for the purpose of setting matplotlib.use('Agg')
or something similar. So, we shouldn't nest things, by making this a choice within tulip
. Probably the best would be to let tulip
be oblivious of this choice (for calls that involve polytope
), and have the user define it directly for polytope
.
Of course, this leaves out quadratic programming in tulip.abstract
(which would need its own explicit configuration -- though there is only a single choice currently: cvxopt
, so not much of an option), but, strictly speaking, that is a different solver, somewhere else, solving a different problem (QPs instead of LPs).
One advantage is that this will show up in the tulip
examples, thus quickly making users aware of two things:
The choice of GR(1) solver is already explicit. Note that the state of a "problem instance" is a very useful notion in this case, because it is the only place where these choices fit.
The choice of BDD solver, where there are alternatives, is currently explicit. However, it is nested, because it goes through tulip.interfaces.omega
, instead of being selected directly as the backend of dd
(ala matplotlib
). So, perhaps we should consider changing that part of the interface too.
This issue concerns two aspects:
lpsolve
)The default constitutes global state, but a reasonable default helps users. Changing the default is possible via the variable that defines it (more details below).
A way to change the solver used in the entire module is needed, because otherwise this choice would have to propagate down the entire call graph, which would clutter the source code (see also comment above). A global variable is present for this purpose anyway, in order to record what happened at import time. Reusing this variable is good enough. Simple is better than complex [PEP 20].
The module's default is decided automatically at import time. The best installed solver is selected. This is reasonable, because it frees user code of unnecessary boilerplate. I reject my suggestion above to fix the default solver to scipy
(which is slower).
Python tries to avoid setter functions whenever possible. The OP is implemented by letting the user changing the value of the module variable lp_solver
. If only scipy
is installed there is no other choice. If the user changes lp_solver
, they will just cause an exception later. If both solvers are installed, then lp_solver = 'glpk'
by default. If the user wishes to use scipy
instead, they can do:
from polytope import polytope as _pt
_pt.lp_solver = 'scipy'
This possibility has been documented in the docstring of the module polytope.polytope
, and a hint has been inserted also in the docstring of the function polytope.polytope.lpsolve
(5dc5be67f4905c55f513532f00b1b280ce3c21aa).
The choice of solver is by name, not by True
/False
, so that more than 2 solvers be selectable (in the future), and for readability of user code. So the same amount of information is needed to set the variable lp_solver
as would be needed to pass when calling a setter function. So there is no more cognitive burden placed on the user.
For calls to the solver interface polytope.polytope.lpsolve
, the keyword argument solver
provides an explicit choice that overrides the module's default (b9af8086b0a0324f22f93e3c84e5041e8ea7c831).
The QP solver choice within tulip
(mentioned above) is a separate concern than what is examined here. There is but only one choice there for now. So there is nothing more to be done about that here.
Regarding the default choice, I commented above:
A global variable is present for this purpose anyway, in order to record what happened at import time. Reusing this variable is good enough.
This comment refers to the variable lp_solver
:
Changing this variable does change the default solver. However, the following corner case raises an error:
cvxopt.glpk
polytope
, so lp_solver == 'glpk'
lp_solver = 'scipy'
polytope.lpsolve(..., solver='glpk')
The error will be raised because the module now thinks that the best available solver is scipy
, and that GLPK is unavailable, here:
So changing lp_solver
does change the package's default, but actually prevents direct calls to GLPK via lpsolve
. There is no reason for this limitation. To avoid it, two separate module variables should be used:
set
-valued variable that records all solvers found installed at import timelp_solver
)Setting the default can remain as simple as it is now, with 2 choices, even when more solvers become available:
scipy
is present but GLPK absent, choose scipy
as defaultscipy
and GLPK are present, choose GLPK as defaultAbout switching from a pure Python to an accelerated implementation (so a specific case of choosing among implementations of the same API), as of Python 3, CPython automatically uses the accelerated one when available, and hides (_*
) the accelerated one. We, too, arrived at this approach:
This discussion cites this relevant excerpt:
A common pattern in Python 2.x is to have one version of a module implemented in pure Python, with an optional accelerated version implemented as a C extension; for example,
pickle
andcPickle
. This places the burden of importing the accelerated version and falling back on the pure Python version on each user of these modules. In Python 3.0, the accelerated versions are considered implementation details of the pure Python versions.
So it may be worth considering using this approach also at other places where the API allows doing so.
This comment is relevant to discussion above.
When running the version of polytope in branch
scipysolver
(as of 524f4c59782ccc31ab776ddf605ead4cb3c6de5e) with bothscipy
andcvxopt
installed, it may be beneficial to have the ability to easily choose which solver is to be used. Below is a possible implementation of this to go inpolytope.py
:Changing a global variable (
lp_solver
) may not be the best solution to switch between solvers but, given the current implementation, it is a relatively simple one. Are there any other thoughts on a safer or better way to implement this?