google / or-tools

Google's Operations Research tools:
https://developers.google.com/optimization/
Apache License 2.0
10.83k stars 2.09k forks source link

Routing: Exception in transit callback is silently ignored and gives cost 0 (Python) #3224

Open PeterSR opened 2 years ago

PeterSR commented 2 years ago

What version of OR-Tools and what language are you using?

Version: 9.1.9490 Language: Python

Which solver are you using (e.g. CP-SAT, Routing Solver, GLOP, BOP, Gurobi)

Routing Solver

What operating system (Linux, Windows, ...) and version?

Linux

What did you do?

Steps to reproduce the behavior:

  1. Download complete program from https://developers.google.com/optimization/routing/vrp#entire_program1. Add raise RuntimeError() somewhere in distance_callback. Example:

    # Create and register a transit callback.
    def distance_callback(from_index, to_index):
        """Returns the distance between the two nodes."""
        # Convert from routing variable Index to distance matrix NodeIndex.
        from_node = manager.IndexToNode(from_index)
        to_node = manager.IndexToNode(to_index)
        raise RuntimeError()  # <----------- Simulate exception in callback
        return data['distance_matrix'][from_node][to_node]
  2. Run program.

  3. Objective value will be 0. Similarly, evaluations to routing.GetArcCostForVehicle will return 0.

What did you expect to see

Early termination of the program with a Python stacktrace. Alternatively a C++ crash or even a segmentation fault. Or simply a warning printed from C++.

What did you see instead?

Solution is found and objective value is 0. Similarly, evaluations to routing.GetArcCostForVehicle returns 0.

Objective: 0
Route for vehicle 0:
 0 -> 0
Distance of the route: 0m

Route for vehicle 1:
 0 -> 0
Distance of the route: 0m

Route for vehicle 2:
 0 -> 0
Distance of the route: 0m

Route for vehicle 3:
 0 ->  16 ->  15 ->  14 ->  13 ->  12 ->  11 ->  10 ->  9 ->  8 ->  7 ->  6 ->  5 ->  4 ->  3 ->  2 ->  1 -> 0
Distance of the route: 0m

Maximum of the route distances: 0m

Note

This potential bug was discovered in a much more subtle setting: A dummy node had been added and instead of updating the distance matrix with values, the transit cost was simply handled in the callback as follows:

    def distance_callback(from_index, to_index):
        """Returns the distance between the two nodes."""
        from_node = manager.IndexToNode(from_index)
        to_node = manager.IndexToNode(to_index)

        if from_node == dummy_depot_node:
            return 0

        # Naively assumed it would never try to drive back to the dummy_depot_node
        # as the end node was specified to something else.

        return dist_matrix[from_node][to_node]

It worked fine when from_node was the dummy node, but when to_node was the dummy node, it would yield an IndexError on dist_matrix that would be silently ignored. My solution was of course to do something like

        if to_node == dummy_depot_node:
            return 2**30

But I believe there is still some unintended behavior here.

Mizux commented 2 years ago

dev note:

routingPYTHON_wrap.cxx:

SWIGINTERN PyObject *_wrap_RoutingModel_RegisterPositiveTransitCallback(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
  PyObject *resultobj = 0;
  operations_research::RoutingModel *arg1 = (operations_research::RoutingModel *) 0 ;
  operations_research::RoutingModel::TransitCallback2 arg2 ;
  void *argp1 = 0 ;
  int res1 = 0 ;
  PyObject * obj0 = 0 ;
  PyObject * obj1 = 0 ;
  int result;

  if (!PyArg_UnpackTuple(args, "RoutingModel_RegisterPositiveTransitCallback", 2, 2, &obj0, &obj1)) SWIG_fail;
  res1 = SWIG_ConvertPtr(obj0, &argp1,SWIGTYPE_p_operations_research__RoutingModel, 0 |  0 );
  if (!SWIG_IsOK(res1)) {
    SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "RoutingModel_RegisterPositiveTransitCallback" "', argument " "1"" of type '" "operations_research::RoutingModel *""'"); 
  }
  arg1 = reinterpret_cast< operations_research::RoutingModel * >(argp1);
  {
    SharedPyPtr input(obj1);
    arg2 = [input](int64_t i, int64_t j) {
      return InvokePythonCallableReturning<int64_t>(input.get(), "LL", i, j);
    };
  }
  {
    try {
      result = (int)(arg1)->RegisterPositiveTransitCallback(arg2); 
    }
    catch (Swig::DirectorException &e) {
      SWIG_fail; 
    }
  }
  resultobj = SWIG_From_int(static_cast< int >(result));
  return resultobj;
fail:
  return NULL;
}

return InvokePythonCallableReturning<int64_t>(input.get(), "LL", i, j);

routingPYTHON_wrap.cxx:

template <typename ReturnT>
static ReturnT HandleResult(PyObject* pyresult) {
  // This zero-initializes builtin types.
  ReturnT result = ReturnT();
  if (!pyresult) {
    if (!PyErr_Occurred()) {
      PyErr_SetString(PyExc_RuntimeError,
                      "SWIG std::function invocation failed.");
    }
    return result;
  } else {
    if (!PyObjAs<ReturnT>(pyresult, &result)) {
      if (!PyErr_Occurred()) {
        PyErr_SetString(PyExc_RuntimeError,
                        "SWIG std::function invocation failed.");
      }
    }
    Py_DECREF(pyresult);
  }
  return result;
}

template <>
void HandleResult<void>(PyObject * pyresult) {
  if (!pyresult) {
    if (!PyErr_Occurred()) {
      PyErr_SetString(PyExc_RuntimeError,
                      "SWIG std::function invocation failed.");
    }
  } else {
    Py_DECREF(pyresult);
  }
}

template <typename ReturnT, typename... Args>
static ReturnT InvokePythonCallableReturning(PyObject* pyfunc,
                                             const char* format, Args... args) {
  // The const_cast is safe (it's here only because the python API is not
  // const-correct).
  return HandleResult<ReturnT>(
      PyObject_CallFunction(pyfunc, const_cast<char*>(format), args...));
}

template <typename ReturnT>
static ReturnT InvokePythonCallableReturning(PyObject* pyfunc) {
  return HandleResult<ReturnT>(PyObject_CallFunctionObjArgs(pyfunc, nullptr));
}

ref

PeterSR commented 2 years ago

As a workaround, I created a decorator that can be used to ensure expected behavior:

import os
import functools
import traceback

def ortools_callback(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            traceback.print_exc()
            os._exit(1)

    return wrapper

    @ortools_callback
    def distance_callback(from_index, to_index):
        """Returns the distance between the two nodes."""
        from_node = manager.IndexToNode(from_index)
        to_node = manager.IndexToNode(to_index)
        raise RuntimeError()
        return dist_matrix[from_node][to_node]