google / or-tools

Google's Operations Research tools:
https://developers.google.com/optimization/
Apache License 2.0
10.97k stars 2.1k forks source link

The Routing Solver "returned a result with an error set" when changing first solution algorithm #1890

Closed snikolakis closed 4 years ago

snikolakis commented 4 years ago

Hello. I use Python 3.7 and the or-tools I have installed is v. 7.5.7466

I am executing the Routing Solver on my data and when the First Solution algorithm is set to PATH_CHEAPEST_ARC it works like a charm. However, I am not very satisfied with the result and I tried to experiment a bit with the metaheuristics (it does not perform any better) and concluded that the problem might be with the algorithm that has been selected to find the first solution.

I tried using a number of different algorithms, for example:

but it always results in the following exception:

Traceback (most recent call last):
  File "/home/ross/code/python/cvrptw/cvrptw/cvrptw_model.py", line 297, in time_callback
    from_node = self.manager.IndexToNode(from_index)
  File "/home/ross/.local/lib/python3.7/site-packages/ortools/constraint_solver/pywrapcp.py", line 3558, in IndexToNode
    return _pywrapcp.RoutingIndexManager_IndexToNode(self, index)
OverflowError: in method 'RoutingIndexManager_IndexToNode', argument 2 of type 'int64'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ross/code/python/cvrptw/cvrptw/cvrptw.py", line 444, in <module>
    manager, routing, assignment = vrp.solve()
  File "/home/ross/code/python/cvrptw/cvrptw/cvrptw_model.py", line 259, in solve
    self.search_parameters)
  File "/home/ross/.local/lib/python3.7/site-packages/ortools/constraint_solver/pywrapcp.py", line 3928, in SolveWithParameters
    return _pywrapcp.RoutingModel_SolveWithParameters(self, search_parameters, solutions)
SystemError: <built-in function RoutingModel_SolveWithParameters> returned a result with an error set

I do not think there is an error with my data as it works for a different algorithm, but I am not very sure what should I check or test. Any advise is welcome and please let me know whether you need more information. Thank you in advance.

Mizux commented 4 years ago

Can you double check that your time dimension callback is using integer type ?

snikolakis commented 4 years ago

Thank you for the response.

My function's signature is def time_callback(self, from_index, to_index): and I have checked that both from_index and to_index are of int type (hopefully, I have understood correctly your answer). However, my return statement will return a float but it does not even reach that point. It actually raises an exception when this line is run:

from_node = self.manager.IndexToNode(from_index)

which is the first line of the function.

Mizux commented 4 years ago

Are you on linux ?

v7.5 code

On stable or v7.5 tag we were using SWIG "32bits mode" so int64 (aka int64_t) was typedef to long long while on Linux Gcc/Clang define it as long int

So first looking at pywrapcp.py:3558

class RoutingIndexManager(object):
...
def IndexToNode(self, index: "int64") -> "operations_research::RoutingIndexManager::NodeIndex":
  return _pywrapcp.RoutingIndexManager_IndexToNode(self, index)

then looking at the wrapper code

SWIGINTERN PyObject *_wrap_RoutingIndexManager_IndexToNode(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
...
  ecode2 = SWIG_AsVal_long_SS_long(obj1, &val2);
  if (!SWIG_IsOK(ecode2)) {
    SWIG_exception_fail(SWIG_ArgError(ecode2), "in method '" "RoutingIndexManager_IndexToNode" "', argument " "2"" of type '" "int64""'");
  } 
  arg2 = static_cast< int64 >(val2);
...
}

...

#ifdef SWIG_LONG_LONG_AVAILABLE
SWIGINTERN int
SWIG_AsVal_long_SS_long (PyObject *obj, long long *val)
{
  int res = SWIG_TypeError;
  if (PyLong_Check(obj)) {
    long long v = PyLong_AsLongLong(obj);
    if (!PyErr_Occurred()) {
      if (val) *val = v;
      return SWIG_OK;
    } else {
      PyErr_Clear();
      res = SWIG_OverflowError;
    }
  } else {
    long v;
    res = SWIG_AsVal_long (obj,&v);
    if (SWIG_IsOK(res)) {
      if (val) *val = v;
      return res;
    }
  }
#ifdef SWIG_PYTHON_CAST_MODE
  {
    const double mant_max = 1LL << DBL_MANT_DIG;
    const double mant_min = -mant_max;
    double d;
    res = SWIG_AsVal_double (obj,&d);
    if (SWIG_IsOK(res) && !SWIG_CanCastAsInteger(&d, mant_min, mant_max))
      return SWIG_OverflowError;
    if (SWIG_IsOK(res) && SWIG_CanCastAsInteger(&d, mant_min, mant_max)) {
      if (val) *val = (long long)(d);
      return SWIG_AddCast(res);
    }
    res = SWIG_TypeError;
  }
#endif
  return res;
}
#endif

On Master

Since now we are using -DSWIGWORDSIZE64 we have in the wrapper

SWIGINTERN PyObject *_wrap_RoutingIndexManager_IndexToNode(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
...
  ecode2 = SWIG_AsVal_long(obj1, &val2);
  if (!SWIG_IsOK(ecode2)) {
    SWIG_exception_fail(SWIG_ArgError(ecode2), "in method '" "RoutingIndexManager_IndexToNode" "', argument " "2"" of type '" "int64""'");
  } 
  arg2 = static_cast< int64 >(val2);
...
}
...
SWIGINTERN int
SWIG_AsVal_long (PyObject *obj, long* val)
{
#if PY_VERSION_HEX < 0x03000000
  if (PyInt_Check(obj)) {
    if (val) *val = PyInt_AsLong(obj);
    return SWIG_OK;
  } else
#endif
  if (PyLong_Check(obj)) {
    long v = PyLong_AsLong(obj);
    if (!PyErr_Occurred()) {
      if (val) *val = v;
      return SWIG_OK;
    } else {
      PyErr_Clear();
      return SWIG_OverflowError;
    }
  }
#ifdef SWIG_PYTHON_CAST_MODE
  {
    int dispatch = 0;
    long v = PyInt_AsLong(obj);
    if (!PyErr_Occurred()) {
      if (val) *val = v;
      return SWIG_AddCast(SWIG_OK);
    } else {
      PyErr_Clear();
    }
    if (!dispatch) {
      double d;
      int res = SWIG_AddCast(SWIG_AsVal_double (obj,&d));
      if (SWIG_IsOK(res) && SWIG_CanCastAsInteger(&d, LONG_MIN, LONG_MAX)) {
    if (val) *val = (long)(d);
    return res;
      }
    }
  }
#endif
  return SWIG_TypeError;
}
snikolakis commented 4 years ago

Yes, this problem is found on Linux. So, is it an or-tools bug?

Mizux commented 4 years ago

I might be an issue with or-tools, do you have a minimal sample that you can share with us to reproduce it ?

Thx

snikolakis commented 4 years ago

I am sorry but my data is pretty complex, I cannot give you a minimal sample. I could share a sample of my data containing my distances matrix, time windows and time matrix but as aforementioned they will be certainly not minimal.

snikolakis commented 4 years ago

Hello again. I am attaching you a file containing my data and hopefully it could be of some help. dummy.txt

Mizux commented 4 years ago

Maybe I may found your issue (need to double check), it seems you are using a python method i.e. having a self parameter while we expect a "free" function i.e. the prototype should be def time_callback(from_index, to_index):

My guess, is this self must be a memory address and it is taken by the swig wrapper as the argument value for from_node thus the overflow.

@regressaur:

On my way to verify this assertion...

snikolakis commented 4 years ago

I tried using a "free function" and got the following error (please notice that with PATH_CHEAPEST_ARC it works once again fine):

Traceback (most recent call last):
  File "/home/ross/code/python/cvrptw/cvrptw/cvrptw_model.py", line 297, in distance_callback
    from_node = manager.IndexToNode(from_index)
  File "/home/ross/.local/lib/python3.7/site-packages/ortools/constraint_solver/pywrapcp.py", line 3558, in IndexToNode
    return _pywrapcp.RoutingIndexManager_IndexToNode(self, index)
OverflowError: in method 'RoutingIndexManager_IndexToNode', argument 2 of type 'int64'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ross/code/python/cvrptw/cvrptw/cvrptw.py", line 444, in <module>
    manager, routing, assignment = vrp.solve()
  File "/home/ross/code/python/cvrptw/cvrptw/cvrptw_model.py", line 267, in solve
    self.search_parameters)
  File "/home/ross/.local/lib/python3.7/site-packages/ortools/constraint_solver/pywrapcp.py", line 3928, in SolveWithParameters
    return _pywrapcp.RoutingModel_SolveWithParameters(self, search_parameters, solutions)
SystemError: <built-in function RoutingModel_SolveWithParameters> returned a result with an error set

This is how I register my time_callback (inside my class):

self.time_callback_index = self.routing.RegisterTransitCallback(time_callback)

self.routing.AddDimension(
            self.time_callback_index,
            30,  # allow waiting time
            data['time_horizon'],  # maximum time per vehicle
            False,  # Don't force start cumul to zero.
            'Time')

self.time_dimension = self.routing.GetDimensionOrDie('Time')

The time_callback is now located outside my class and it uses the global variables that have been set from inside the class constructor and it looks like this:

def time_callback(from_index, to_index):
    from_node = manager.IndexToNode(from_index)
    to_node = manager.IndexToNode(to_index)               
    return data['service_time_matrix'][from_node] + data['time_matrix'][from_node][to_node]

Thank you for your time, @Mizux !

Mizux commented 4 years ago

Hi,

Would it be possible to also store the time_callback ? note: internally the callback is passed to the C++ underlying library and may be reclaimed by the python Garbage Collector... I mean:

self.time_callback = time_callback
self.time_callback_index = self.routing.RegisterTransitCallback(self.time_callback)
snikolakis commented 4 years ago

Hi,

I tried this but I have the same exception as stated above.

Mizux commented 4 years ago

Can you give me the output of:

python3 --version
python3 -c "import platform; print(platform.architecture()[0])"

(supposing your python interpreter is python3)

snikolakis commented 4 years ago

I am using Python 3.7.3 (64bit).

ecotner commented 4 years ago

Hi, I was having a similar problem (getting the exact same error in RoutingIndexManager_IndexToNode), and was able to resolve it, maybe my experience will help you...

I found that the problem was that my callback was referencing a variable that didn't exist in the scope (it was an attribute of my wrapper class). Relevant snippet here:

class VRP:
    ...
    def _add_dist_time_callbacks(self):
        # Register distance/time matrix callbacks with solver
        def time_callback(from_index, to_index):
            from_node = self.manager.IndexToNode(from_index)
            to_node = self.manager.IndexToNode(to_index)
            return self.data["time_matrix"][from_node, to_node]

        def dist_callback(from_index, to_index):
            from_node = self.manager.IndexToNode(from_index)
            to_node = self.manager.IndexToNode(to_index)
            return self.data["distance_matrix"][from_node, to_node]

        self.DIST_CALLBACK_IDX = self.routing.RegisterTransitCallback(dist_callback)
        self.TIME_CALLBACK_IDX = self.routing.RegisterTransitCallback(time_callback)

where I made the replacement data -> self.data. I don't know why the error was pointing to the IndexToNode method, since clearly it was a problem with an undefined variable. My guess is that any error within the callback is handled similarly.

Some other things that may be contributing: 1) I looked at your dummy.txt data, it looks like your distance/time matrices are floating point; I believe the solver only accepts integers (though I think it may silently convert them?) 2) In your earlier comment, it looks like you're returning return data['service_time_matrix'][from_node] + data['time_matrix'][from_node][to_node]; is it possible that you're incorrectly indexing data['service_time_matrix'][from_node]? (If it's a matrix, it should have both from/to indexes.)

I'm also on linux, using 64bit Python 3.7.4. I can verify it works with AUTOMATIC, (PATH|LOCAL|GLOBAL)_CHEAPEST_ARC (I wasn't able to find feasible solutions for all of them within 30 seconds, but they don't error out like I was getting before).

snikolakis commented 4 years ago

Hi @ecotner , thank you for the time you took to respond to this. I removed self.data and used data instead as @Mizux proposed on a response above but neither of them works for me.

Now, for the notes:

  1. I tried converting all of my floats to int32 but I did not achieve something different result-wise.
  2. I double checked my matrices and what their shapes are and they are written as expected.

Could you please share with me how you call the def _add_dist_time_callbacks(self) function? I believe that this is where I am doing a mistake and it seems that both yours and @Mizux answers converge to that. Thank you in advance!

ecotner commented 4 years ago

@regressaur there isn't really anything special about the way that I call the _add_dist_time_callbacks function. It's just in my __init__:

class VRP:
    def __init__(initial_data, ...):
        ...
        self._add_dist_time_callbacks()
        ...
ecotner commented 4 years ago

@regressaur

The whole source code is too long to post, but this is how I'm initializing my distance/time matrices, their associated callbacks, and time window constraints. Hope it is useful to you.

class VRP:
    def __init__(
        self,
        stop_df: pd.DataFrame,
        vehicle_df: pd.DataFrame,
        org_df: pd.DataFrame,
        precision: int = 2000,
    ):
        """Creates an object for solving the vehicle routing problem.

        Args:
            stop_df (DataFrame): DataFrame containing information about the delivery
                stops and their potential constraints. Has columns [location_cd,
                latitude, longitude, volume_ft3, weight_lbs, time_min, time_max]
            vehicle_df (DataFrame): DataFrame containing info about the delivery
                vehicles and their potential constraints. Has columns [equipment_id,
                dist_max_mi, time_max_hr, vol_max_ft3]
        """
        self.stop_df = stop_df.copy()
        self.vehicle_df = vehicle_df.copy()
        self.org_df = org_df.copy()
        # Sanity/validation checks
        self.stop_df["is_dc"] = self.stop_df["location_cd"] == "DEPOT"
        assert self.stop_df["is_dc"].sum() == 1  # Make sure exactly one DC
        self.stop_df.sort_values("is_dc", ascending=False, inplace=True)
        assert (
            self.stop_df[["latitude", "longitude"]].isna().sum().sum() == 0
        )  # verify coords
        # Generate dist/time data
        self.data = dict(precision=precision)
        self._get_dist_time_matrix(self.stop_df, precision)
        # Setup solver
        self.manager = pywrapcp.RoutingIndexManager(
            len(self.stop_df), len(self.vehicle_df), self.data["depot"],
        )
        self.routing = pywrapcp.RoutingModel(self.manager)
        # Add callbacks
        self._add_dist_time_callbacks()
        # Add constraints
        self._add_volume_constraints(self.stop_df, self.vehicle_df, precision)
        self._add_time_window_constraints(self.stop_df)
        # Specify objective
        self._minimize_distance()

    def _get_dist_time_matrix(self, stop_df: pd.DataFrame, precision: int):
        # Get the distance/time matrices from OSRM
        coords = stop_df[["latitude", "longitude"]].values
        conn = OSRM()
        D, T = conn.matrix(coords, dist_or_time="both")
        # Format for the solver
        data = self.data
        data["dist_scale_factor"] = precision / D.max()
        data["distance_matrix"] = (data["dist_scale_factor"] * D).round().astype(int)
        data["time_scale_factor"] = precision / T.max()
        data["time_matrix"] = (data["time_scale_factor"] * T).round().astype(int)
        data["depot"] = 0

    def _add_dist_time_callbacks(self):
        # Register distance/time matrix callbacks with solver
        def time_callback(from_index, to_index):
            from_node = self.manager.IndexToNode(from_index)
            to_node = self.manager.IndexToNode(to_index)
            return self.data["time_matrix"][from_node, to_node]

        def dist_callback(from_index, to_index):
            from_node = self.manager.IndexToNode(from_index)
            to_node = self.manager.IndexToNode(to_index)
            return self.data["distance_matrix"][from_node, to_node]

        self.DIST_CALLBACK_IDX = self.routing.RegisterTransitCallback(dist_callback)
        self.TIME_CALLBACK_IDX = self.routing.RegisterTransitCallback(time_callback)

    ...

    def _add_time_window_constraints(self, stop_df):
        data = self.data
        time_factor = data["time_scale_factor"]
        data["time_windows"] = (
            (time_factor * stop_df[["time_min", "time_max"]].values).round().astype(int)
        )

        self.routing.AddDimension(
            self.TIME_CALLBACK_IDX,
            int(time_factor * 100),  # allow waiting time at locations (slack)
            int(time_factor * 100),  # Maximum vehicle time?
            False,  # Don't force start cumul to zero
            "Time",  # Name of the dimension
        )
        time_dimension = self.routing.GetDimensionOrDie("Time")
        # Add time window constraints for every location except depot
        for loc_idx, time_window in enumerate(data["time_windows"]):
            if loc_idx == 0:
                continue
            index = self.manager.NodeToIndex(loc_idx)
            time_dimension.CumulVar(index).SetRange(
                int(time_window[0]), int(time_window[1])
            )
        # Add time window constraints for each vehicle start node.
        for vehicle_id in range(data["num_vehicles"]):
            index = self.routing.Start(vehicle_id)
            time_dimension.CumulVar(index).SetRange(
                int(data["time_windows"][0, 0]), int(data["time_windows"][0, 1]),
            )
        for i in range(data["num_vehicles"]):
            self.routing.AddVariableMinimizedByFinalizer(
                time_dimension.CumulVar(self.routing.Start(i))
            )
            self.routing.AddVariableMinimizedByFinalizer(
                time_dimension.CumulVar(self.routing.End(i))
            )

    ...
snikolakis commented 4 years ago

Thank you very much for your response once again, @ecotner . I changed my code to reflect the "trick" of nested functions as you showed above in order to remove the self argument from the callbacks but to no avail. I have also tried changing all of my float numbers to integers as you recommended in your previous message. The exception raised is always the same and I cannot figure out why it is raised in the first place. The traceback looks like the following:

Traceback (most recent call last):
  File "/home/ross/code/python/cvrptw/cvrptw/cvrptw_model.py", line 261, in time_callback
    from_node =self.manager.IndexToNode(from_index)
  File "/home/ross/.local/lib/python3.7/site-packages/ortools/constraint_solver/pywrapcp.py", line 3558, in IndexToNode
    return _pywrapcp.RoutingIndexManager_IndexToNode(self, index)
OverflowError: in method 'RoutingIndexManager_IndexToNode', argument 2 of type 'int64'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ross/code/python/cvrptw/cvrptw/cvrptw.py", line 444, in <module>
    manager, routing, assignment = vrp.solve()
  File "/home/ross/code/python/cvrptw/cvrptw/cvrptw_model.py", line 219, in solve
    self.search_parameters)
  File "/home/ross/.local/lib/python3.7/site-packages/ortools/constraint_solver/pywrapcp.py", line 3928, in SolveWithParameters
    return _pywrapcp.RoutingModel_SolveWithParameters(self, search_parameters, solutions)
SystemError: <built-in function RoutingModel_SolveWithParameters> returned a result with an error set
orhanhenrik commented 4 years ago

I had the exact same exception, and after reading @ecotner's comment I figured it out. I had code in my callback function that threw an exception (undefined variable).
For some reason the stack traces point to the right callback, but to the IndexToNode function rather than the crashing code.

snikolakis commented 4 years ago

Hi @orhanhenrik thank you for your response. I have left the project now, but a colleague of mine, found out that there was a problem with some of our constraints. I cannot elaborate more on this as I do not know any details but the conclusion is that or-tools work as they should (at least for our case).

Thank you all!

harymitchell commented 3 years ago

Wanted to chime in here for anyone running or-tools library from inside of a job queue like https://github.com/rq/rq. See https://github.com/google/or-tools/issues/1945#issuecomment-859731315

th-yoon commented 1 year ago

I solved this issue by changing the data type of the distance matrix from float to int.