google / or-tools

Google's Operations Research tools:
https://developers.google.com/optimization/
Apache License 2.0

callbacks should be given nodes, not indices #2123

Closed · jmarca closed this issue 4 years ago

jmarca commented 4 years ago

What version of OR-tools and what language are you using?
Version: v7.0 through master
Language: All (C++/Java/Python/C#)

Which solver are you using? Routing Solver

What operating system (Linux, Windows, ...) and version? Linux (Docker build using Ubuntu 20, based on or-tools/tools/Docker/ubuntu-20.04.Dockerfile)

What did you do?

NA

What did you expect to see?

The callbacks currently receive indices i and j, which must be converted to "nodes" by calling manager.IndexToNode(i) and manager.IndexToNode(j). I expect the calling program to see nodes, and never to have to call manager.IndexToNode() inside a dimension callback.

What did you see instead?

Dimension callbacks are required to call manager.IndexToNode() twice, once for i, once for j.
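For concreteness, here is the pattern every Python transit callback has to follow today. This is a minimal, self-contained sketch using the standard pywrapcp API; the toy matrix and fleet sizes are placeholders, not my actual model:

```python
from ortools.constraint_solver import pywrapcp

# Toy data; a real model would use its own matrix and fleet.
distance_matrix = [
    [0, 2, 3],
    [2, 0, 1],
    [3, 1, 0],
]
num_vehicles, depot = 2, 0

manager = pywrapcp.RoutingIndexManager(len(distance_matrix), num_vehicles, depot)
routing = pywrapcp.RoutingModel(manager)

def distance_callback(from_index, to_index):
    # The solver passes internal indices, but user data is keyed by node,
    # so every single call pays for two Python<->C++ conversions.
    from_node = manager.IndexToNode(from_index)
    to_node = manager.IndexToNode(to_index)
    return distance_matrix[from_node][to_node]

transit_index = routing.RegisterTransitCallback(distance_callback)
routing.SetArcCostEvaluatorOfAllVehicles(transit_index)
```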

It occurs to me that this is the wrong approach, for two reasons.

First, the calling program knows nothing about indices. They are an internal implementation feature of the solver. It would be more logical for the solver to internally call callback(manager.IndexToNode(i), manager.IndexToNode(j)), thus presenting the callback with node_i, node_j, which the caller should understand.

Second, this approach is inefficient. In many cases the "index" for a location is degenerate: for example, every vehicle has a location index that translates to the node of the vehicle's depot. Running callback(i, j) is redundant across the whole set of indices I_n that resolve to a single node n. Converting first would yield a speed-up in the cache loop when cache_callbacks_ is true.
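Continuing the sketch above, the degeneracy is easy to see: every vehicle's start index is distinct, yet (with a single depot) they all resolve to the same node:

```python
for v in range(num_vehicles):
    start_index = routing.Start(v)  # a distinct index per vehicle...
    print(start_index, '->', manager.IndexToNode(start_index))
# ...but every line prints the same node, `depot`, so cached results
# for these index pairs are all duplicates of one node pair.
```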

In addition, the two calls to manager.IndexToNode() from the calling language back to C++ "cross the language boundary" and so are slow. Much faster to do that all in C++, then present the calling language with already-resolved nodes.

As an example of the potential cost savings when caching is turned on: in my test program I create a unique vehicle for every shift. This approach simplifies modeling, as it avoids complicated constraints for getting vehicles back to the depot (dummy node) at the right time to switch drivers at shift changes; instead, each vehicle must simply return empty to the depot at the end of its time window. I have 5 depots, each depot has access to 30 vehicles, and each vehicle has 2 possible driver shifts per day, so that is a lot of virtual vehicles. In my test case I get over 5,000 vehicles, which in the current implementation of the solver means over 5,000 dummy indices. The solver is therefore caching 1,000+ real nodes plus 5,000+ vehicle nodes, a 6000x6000 matrix of some 36 million entries. But this huge matrix boils down to just 1005x1005 (about 1 million entries, a roughly 36x reduction) once the solver's indices are replaced with the caller's nodes.

Setting aside the question of being clever about these dummy indices, there should be a huge speed-up if the solver simply cached callback results keyed by nodes rather than indices. In my case, at the very least, 5,000 x 5,000 queries would be reduced to just 5x5 queries for the depot-to-depot links alone.
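To illustrate the idea (this is not the proposed solver change itself), the same effect can be approximated in user code today by memoizing on node pairs, reusing manager and distance_matrix from the sketch above:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def node_transit(from_node, to_node):
    # Evaluated at most once per node pair, however many index pairs map to it.
    return distance_matrix[from_node][to_node]

def distance_callback(from_index, to_index):
    return node_transit(manager.IndexToNode(from_index),
                        manager.IndexToNode(to_index))
```

This collapses duplicate evaluations, but still pays the language-boundary cost on every call, which is exactly what doing the conversion in C++ would avoid.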

But this would be a huge breaking change to the routing solver, so it would definitely have to be targeted at 8.0.

In a nutshell, my thought runs along the lines of:

every time there is

evaluator(from_index, to_index)

replace it with

evaluator(_manager.IndexToNode(from_index), _manager.IndexToNode(to_index))

So far, I think the only places this happens are when the callbacks are registered (if caching is on), and in GetArcCostForClassInternal().
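In user-space Python terms, the proposal amounts to the solver applying a wrapper like this at registration time, so the user-supplied evaluator only ever sees nodes (an illustrative sketch of the intended semantics, not the actual C++ change):

```python
def register_node_callback(routing, manager, node_evaluator):
    # What the solver would do internally under the proposal: resolve
    # indices to nodes once, then hand the user's evaluator nodes only.
    def index_evaluator(from_index, to_index):
        return node_evaluator(manager.IndexToNode(from_index),
                              manager.IndexToNode(to_index))
    return routing.RegisterTransitCallback(index_evaluator)
```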

The bigger problem will be updating all of the examples, tests, and documentation to reflect this change.

I'm going to test this out on my clone. If there is interest I can post patches when I'm done.

If there is no interest, please close this issue.

lperron commented 4 years ago

Hi James,

I understand your point, but the short answer is that we will not change it. We do a lot of index manipulation internally, and hiding it would complicate things. The current separation between nodes and indices is, in our opinion, the cleanest design that still allows complex manipulations.

What you can do on your side is to have a wrapper around the evaluator and the solution so that the index manager is hidden. You could also include the notion of duplicate nodes directly, and the wrapper would manage the indirection; something like set_number_of_duplications(node, count).
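Such a wrapper might look like this. A rough sketch only, and set_number_of_duplications is the suggested API, not an existing one:

```python
class NodeEvaluatorWrapper:
    """User-side wrapper that hides the index manager (hypothetical)."""

    def __init__(self, manager, node_evaluator):
        self._manager = manager
        self._evaluator = node_evaluator
        self._cache = {}          # (from_node, to_node) -> value
        self._duplications = {}   # node -> number of indices standing for it

    def set_number_of_duplications(self, node, count):
        # Suggested (not existing) API: declare that `count` solver
        # indices all resolve to `node`, e.g. to pre-size the cache.
        self._duplications[node] = count

    def __call__(self, from_index, to_index):
        # The solver calls with indices; the user evaluator sees nodes,
        # and duplicate indices collapse onto one cache entry.
        key = (self._manager.IndexToNode(from_index),
               self._manager.IndexToNode(to_index))
        if key not in self._cache:
            self._cache[key] = self._evaluator(*key)
        return self._cache[key]
```

Registration would then be, e.g., routing.RegisterTransitCallback(NodeEvaluatorWrapper(manager, my_node_cost)).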


jmarca commented 4 years ago

Well, that's what I suspected, which is why I opened an issue rather than diving in and generating a patch set.

My biggest problem is calling manager.IndexToNode(i) from Python. That makes no sense to me at all, because I'm calling a C++ function from Python (twice) inside a callback that is itself called from C++.

On the assumption that crossing the language barrier is orders of magnitude slower than staying within one language, I cached every possible result of manager.IndexToNode() in Python and noticed a speed-up of about a minute (7 minutes down to 6) when precaching all the nodes for my 6,000-node case.
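The precaching amounted to something like this: one upfront sweep across the boundary, then pure-Python lookups in the callback (a sketch, reusing the names from my earlier example):

```python
# One upfront sweep across the Python/C++ boundary...
index_to_node = [manager.IndexToNode(i)
                 for i in range(manager.GetNumberOfIndices())]

def distance_callback(from_index, to_index):
    # ...so the hot path stays in pure Python.
    return distance_matrix[index_to_node[from_index]][index_to_node[to_index]]
```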

Logically, pushing those two calls to IndexToNode() upstream to C++ would be faster still.

Thanks for considering it, though.

Mizux commented 3 years ago

@jmarca Did you try using a "Matrix" transit instead of a transit callback? IIRC the matrix is copied into the C++ realm, so no Python should be involved, even for the uncached transit(i, j) calls the solver needs during the search...
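Something like this (a sketch; I'm assuming the matrix is node-indexed and the index-to-node mapping happens on the C++ side):

```python
# The matrix is copied into C++ once at registration; during the search
# the solver never calls back into Python, even for uncached transits.
transit_index = routing.RegisterTransitMatrix(distance_matrix)
routing.SetArcCostEvaluatorOfAllVehicles(transit_index)
```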