Although the task list is incomplete, we plan to merge what has been done so far on this long-standing PR on 1/1/2022 in order to encourage broader review and comment from the community (See #219). Work will continue following the merge, but in a separate PR. More details will be provided in an accompanying announcement, which will be linked here once it's made.
Clp is currently undergoing a major refactoring. Importantly, this refactoring is not expected to affect the performance of the code in any way. The goals are limited to improving readability, maintainability, and usability.
This pull request is mainly for the purposes of documenting the changes being made so that other maintainers can review and follow along. It is also a task list that anyone who wants to contribute can consult to see what needs to be done.
See also: coin-or/CoinUtils#139
Here are some of the main goals of this refactoring.
Make code thread-safe and eliminate global variables.
Make structure of code easier to understand and maintain.
Simplify and unify the parameter-setting mechanism and separate it completely from Cbc.
Convert to C++ strings/streams everywhere to unify handling of input.
Unify all messaging through message handlers and eliminate back-door methods of printing and controlling print levels.
Unify error handling.
Combine separate libraries into a single library.
Make a ClpSolver class to contain what is currently ClpMain0() and ClpMain1().
Modularize the implementation of ClpMain1(), splitting the while loop into separate functions.
Here is the task list, including what has already been done.
[x] Build out the already existing parameter mechanism in CoinUtils.
[x] Build out the already existing utilities in CoinUtils for reading input from command-line, file, or environment.
[x] Create a ClpParam derived from CoinParam.
[x] Create a ClpParameters class with methods to set up parameters and contain the vector of parameters.
[x] Convert Clp to use the new parameter and input mechanisms (thus separating it completely from Cbc).
[x] Convert to using string keywords rather than integer modes to the extent possible for code readability and also eliminate other "magic numbers" where possible.
[x] Create new parameter types for file names and directories so that these can be parsed and validated correctly to ensure they represent valid paths, addressing existing bugs.
[x] Create constructors in the ClpParameters class representing different possible parameter settings (e.g., emphasize optimality versus feasibility).
[x] Ensure that default parameter values make sense.
[x] Create enums for the keyword parameters.
[x] Eliminate global variables and ensure thread safety.
[x] Use modern C++ string manipulation.
[x] Eliminate direct printing to std::cout and make sure all output (at least in ClpSolver) is controlled by the log parameter through the proper message handler.
[x] Eliminate the separate libClpSolver, which didn't seem to serve a purpose.
[ ] Change all integer mode values to their corresponding enums to make code more readable and maintainable.
[ ] Try to convert parameters set in obscure ways (special options) to a more transparent mechanism.
[ ] Build a ClpSolver class.
[ ] Modularize ClpMain1, moving local variables into the ClpSolver class.
[ ] Use existing helper functions and create new ones to avoid replication of code blocks within Clp and between Clp and Cbc.
[ ] Implement better error handling.
[ ] Eliminate irrelevant/old #ifdefs or change them to parameters instead.
[ ] Test extensively against current master branch to identify any regressions.
Things to consider
[ ] Use parameter "push" functions to do the work currently being done in much of CbcMain1 (this is up in the air).
[ ] Use readline to enable automatic command completion and further simplify parameter setting.
[ ] Create a base class in CoinUtils to unify ClpParameters and CbcParameters.
Some details about particular aspects of what has already been done.
Parameter mechanism
There are two main classes and some utilities.
The ClpParam class is derived from CoinParam and holds information for a single parameter (value, type, upper and lower limits, help strings, etc.).
The ClpParameters class contains a vector of ClpParam objects representing the current settings.
ClpParam
Each parameter has a unique code, which is also its index in the parameter vector stored in the ClpParameters class (described below). The codes are specified in the ClpParamCode enum within the ClpParam class.
Parameters also have a type, which can be queried and is one of:
Double: Parameters whose value is a double
Integer: Parameters whose value is an integer
String: Parameters whose value is a string
Directory: For storing directory names, such as default locations for file types
File: For storing the locations/names of files
Keyword: Parameters that have one of several enumerated values identified by either strings or corresponding integer modes.
Action: Not parameters in the traditional sense, but used to trigger actions, such as solving an instance.
There are different constructors for each type, as well as setup functions for populating an existing parameter object with:
Name
Short help string
Long help string
Limits on values (for error checking)
Default values
Display priority (to limit which parameters get displayed when users ask for help)
For keyword parameters, there is also a mechanism for creating a mapping between keyword value strings and the so-called "mode" value, which is an integer. The value of a keyword parameter can be set or retrieved either as a keyword string or as an integer mode. The modes may also be specified in enums.
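The keyword-to-mode mapping described above can be illustrated with a minimal sketch. This is not the actual ClpParam interface: the class name KeywordParamSketch, its methods, and the DirectionMode enum are all hypothetical and only show how a value might be set or retrieved either as a keyword string or as an integer mode.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a keyword parameter; NOT the real ClpParam API.
class KeywordParamSketch {
public:
  // Register a keyword string together with its integer mode.
  void appendKwd(const std::string &kwd, int mode) {
    kwdToMode_[kwd] = mode;
    modeToKwd_[mode] = kwd;
  }

  // Set the value by keyword; returns false if the keyword is unknown.
  bool setKwd(const std::string &kwd) {
    auto it = kwdToMode_.find(kwd);
    if (it == kwdToMode_.end())
      return false;
    mode_ = it->second;
    return true;
  }

  // Set the value by integer mode; returns false if the mode is unknown.
  bool setMode(int mode) {
    if (modeToKwd_.find(mode) == modeToKwd_.end())
      return false;
    mode_ = mode;
    return true;
  }

  // Retrieve the current value as a mode or as a keyword string.
  // kwd() throws std::out_of_range if the current mode was never registered.
  int mode() const { return mode_; }
  const std::string &kwd() const { return modeToKwd_.at(mode_); }

private:
  std::map<std::string, int> kwdToMode_;
  std::map<int, std::string> modeToKwd_;
  int mode_ = 0;
};

// Hypothetical enum naming the modes, so callers can avoid magic numbers.
enum DirectionMode { DirMin = 0, DirMax = 1 };
```

With a mapping like this, code that branches on the parameter value can compare against DirMax rather than a bare integer, which is the readability gain the enums are meant to provide.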
There are separate get/set methods for each parameter type, but for convenience, there are also overloaded setVal() and getVal() methods that can set or retrieve the value of any parameter (the input/output is interpreted based on the known parameter type), e.g.,
param.setVal(0);
Each parameter object also has optional "push" (and "pull") functions that can perform actions or populate related data structures when a parameter value is set. The push/pull functions are defined within the ClpParamUtils namespace, described below.
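As a rough illustration of the push idea, the sketch below attaches a callback to a parameter and invokes it whenever the value is set. The names IntParamSketch, SolverSettingsSketch, and setPushFunc are assumptions made for this example; the actual signatures in ClpParam and ClpParamUtils may differ.

```cpp
#include <functional>
#include <utility>

// Hypothetical data structure that a push function might keep in sync.
struct SolverSettingsSketch {
  int logLevel = 1;
};

// Hypothetical integer parameter with an optional push function.
class IntParamSketch {
public:
  using PushFunc = std::function<void(int)>;

  // Attach the function to run after each successful set.
  void setPushFunc(PushFunc f) { push_ = std::move(f); }

  // Store the value, then "push" it to whatever depends on it.
  void setVal(int v) {
    value_ = v;
    if (push_)
      push_(value_);
  }

  int getVal() const { return value_; }

private:
  int value_ = 0;
  PushFunc push_; // empty by default, so pushing is optional
};
```

A caller could then register something like `[&settings](int v) { settings.logLevel = v; }` so that setting the parameter also updates the related structure.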
None of the methods in the class produce output directly. The get/set methods can optionally populate a string object that is passed in as an optional second argument. This is to allow the calling function to control output by piping the string to a message handler as appropriate or simply ignoring it if printing is not desired. This obviates the need to control printing internal to the functions themselves, which would require a separate parameter.
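The pattern of returning a message instead of printing it might look like the following sketch. DoubleParamSketch and its status codes are hypothetical; the point is only that the set method writes to a caller-supplied string (if one is given) and leaves the decision about printing to the caller.

```cpp
#include <sstream>
#include <string>

// Hypothetical double-valued parameter; NOT the real ClpParam API.
class DoubleParamSketch {
public:
  DoubleParamSketch(double lower, double upper, double dflt)
    : lower_(lower), upper_(upper), value_(dflt) {}

  // Returns 0 on success and 1 if the value is out of range.
  // If message is non-null, it receives a description of what happened;
  // nothing is ever written to std::cout here.
  int setVal(double v, std::string *message = nullptr) {
    std::ostringstream out;
    int status = 0;
    if (v < lower_ || v > upper_) {
      out << v << " is outside the valid range [" << lower_ << ", "
          << upper_ << "]";
      status = 1;
    } else {
      out << "value changed from " << value_ << " to " << v;
      value_ = v;
    }
    if (message)
      *message = out.str();
    return status;
  }

  double getVal() const { return value_; }

private:
  double lower_;
  double upper_;
  double value_;
};
```

The caller can hand the returned string to a message handler at whatever log level is appropriate, or omit the second argument entirely when no output is wanted.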
ClpParameters
The ClpParameters class serves primarily as a container for a vector of parameters, but can (and will) be used to store other auxiliary information that needs to be accessed by the methods of the (soon-to-be) ClpSolver class. The class has methods for setting parameter values, which are pass-throughs to the methods of the ClpParam class. It also defines a [] operator, so that parameter objects contained in the parameter vector, which is a class member, can be accessed directly.
The constructor of the class calls a method that sets up all the parameters (as described above). It is envisioned that different constructors will eventually be used to obtain different sets of parameters for different purposes.
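A minimal sketch of such a container is shown below, assuming a vector indexed by an enum code, a constructor that calls a setup method, and a [] operator for direct access. The names ParametersSketch, ParamSketch, and the codes in ParamCodeSketch are hypothetical and are not taken from the actual ClpParameters class.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parameter codes; each code doubles as the vector index.
enum ParamCodeSketch { TolCode = 0, LogLevelCode, EndCode };

// Hypothetical stand-in for a single parameter.
struct ParamSketch {
  std::string name;
  double value;
};

class ParametersSketch {
public:
  // The constructor sizes the vector and calls the setup method.
  ParametersSketch() : params_(EndCode) { setup(); }

  // Direct access to a parameter object by its code (bounds-checked).
  ParamSketch &operator[](std::size_t code) { return params_.at(code); }
  const ParamSketch &operator[](std::size_t code) const {
    return params_.at(code);
  }

  // Pass-through setter that forwards to the stored parameter.
  void setVal(std::size_t code, double v) { params_.at(code).value = v; }

private:
  // Populate every parameter with a name and a default value.
  void setup() {
    params_[TolCode] = ParamSketch{"tol", 1e-7};
    params_[LogLevelCode] = ParamSketch{"logLevel", 1.0};
  }

  std::vector<ParamSketch> params_;
};
```

Because the code is also the index, an expression such as `parameters[LogLevelCode]` reaches the parameter object in constant time without a name lookup.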
ClpParamUtils
There are a number of utilities contained in the ClpParamUtils namespace. Mostly, these are the push functions that can extend the functionality of the set methods of the ClpParam class.
Input/Output mechanism
The input/output mechanism is used to pass a sequence of parameters to Clp, including action parameters. This parameter sequence can be passed equivalently in five different ways.
Command line argument
Interactive prompt
Environment variable
Parameter file
Programmatically from a driver
Regardless of the method, the parameter sequence is stored and accessed as a FIFO queue of strings (inputQueue). This can be passed as an input to what is currently ClpMain1 or constructed within the main while loop by parsing either interactive input, the contents of an environment variable, or the contents of a parameter file.
The main while loop then pops strings off the input queue, looks them up in the parameter list, and deals with the result.
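The queue-driven loop can be sketched as follows. This is a simplified illustration, not the code in ClpMain1: the helper names enqueueLine and processQueue, the map of known parameters, and the convention that a value-taking parameter consumes the next string are all assumptions made for the example.

```cpp
#include <deque>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Split a whitespace-separated line (e.g., from a prompt, an environment
// variable, or a parameter file) and append the pieces to the FIFO queue.
void enqueueLine(const std::string &line, std::deque<std::string> &inputQueue) {
  std::istringstream in(line);
  std::string field;
  while (in >> field)
    inputQueue.push_back(field);
}

// Pop strings off the queue and look each one up in a table of known
// parameters. The bool says whether the parameter takes a value (true)
// or is an action (false). Settings are stored; actions and problems
// are returned in order so the caller can decide what to do with them.
std::vector<std::string> processQueue(
  std::deque<std::string> inputQueue,
  const std::map<std::string, bool> &takesValue,
  std::map<std::string, std::string> &settings) {
  std::vector<std::string> events;
  while (!inputQueue.empty()) {
    std::string field = inputQueue.front();
    inputQueue.pop_front();
    auto it = takesValue.find(field);
    if (it == takesValue.end()) {
      events.push_back("unknown:" + field);
      continue;
    }
    if (it->second) {
      if (inputQueue.empty()) {
        events.push_back("missing value:" + field);
        break;
      }
      settings[field] = inputQueue.front(); // next string is the value
      inputQueue.pop_front();
    } else {
      events.push_back(field); // an action parameter
    }
  }
  return events;
}
```

Since every input source ends up as strings in the same queue, a driver can build the queue directly and get the same behavior as a command line or parameter file.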