Project-OSRM / osrm-backend

Open Source Routing Machine - C++ backend
http://map.project-osrm.org
BSD 2-Clause "Simplified" License

Implement Inertial Flow Partitioner #3205

Closed daniel-j-h closed 7 years ago

daniel-j-h commented 7 years ago

Once we have a dummy partitioner (ticketed in https://github.com/Project-OSRM/osrm-backend/issues/3204) we unblock further steps in the pipeline. Eventually we want to replace the dummy partitioner with a proper partitioning approach: Inertial Flow.

Aaron Schild and Christian Sommer: On Balanced Separators in Road Networks. SEA 2015 - 14th International Symposium on Experimental Algorithms, pp. 286-297. doi: 10.1007/978-3-319-20086

See: http://sommer.jp/roadseparator.htm

We already successfully prototyped an internal routing engine with that approach - some hints are over at

http://daniel-j-h.github.io/post/selection-algorithms-for-partitioning/

Our multiple candidate cuts and the parallelization should be doable in OSRM, too.

Tasks:

daniel-j-h commented 7 years ago

The Inertial Flow Partitioner and the dummy partitioner outlined in https://github.com/Project-OSRM/osrm-backend/issues/3204 will share the same interface - only the implementation will change. Let's spec out what this interface will look like (and answer open questions).

Eventually we need a partition on the edge based graph for the customization to make use of it (the call site is outlined in https://github.com/Project-OSRM/osrm-backend/issues/3552). The Inertial Flow partitioner needs an embedding, which we already have for the node based graph. This, and the fact that @MoKob already had some ideas on how to transfer a partition on the node based graph to the edge based graph, makes partitioning the node based graph a viable option.

In case we want to make the partitioner a new tool osrm-partition that reads in the node based graph and writes out a partition for the node based graph (or already for the edge based graph?), ticket https://github.com/Project-OSRM/osrm-backend/issues/2055 is worth mentioning here: at the moment we write out the node based graph in osrm-extract and then immediately read it back in. Writing out the node based graph still has to happen - but immediately reading it back in in osrm-extract remains suboptimal.

For the partitioner's interface, let's spec this out in two steps. First, partitioning the node based graph. And second, transforming the node based graph partition to the edge based graph.

We can load the node based graph (which is a NodeBasedDynamicGraph) and partition it with Inertial Flow. The output of this is a vector (keyed by node based node) storing a partition id for each node (the partition id is generated from a 0/1 (left/right) bisection per depth).
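As a minimal sketch of how such a partition id could be built up (all names here are illustrative, not from the codebase): at every bisection depth a node falls on the left (0) or right (1) side, and that bit is appended to the id, so the most significant bits correspond to the coarsest cuts.

```cpp
// Minimal sketch only - names are illustrative, not from the codebase.
#include <cstdint>

using BisectionID = std::uint32_t;

// After deciding at one bisection depth whether a node fell into the
// left (0) or right (1) half, shift the decision bit into the id. The
// most significant bits then correspond to the coarsest cuts.
inline BisectionID AppendBisectionBit(BisectionID id, bool right_side)
{
    return (id << 1) | static_cast<BisectionID>(right_side);
}
```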

For the partitioner interface I can imagine something similar to what we had in rtd (check rtd/include/partition):

```cpp
vector<uint32_t> InertialFlow(const NodeBasedGraph& graph, const vector<Coordinate>& embedding);
```

leaving out tuning parameters and implementation specifics.

On the implementation side, rtd simply destructed the original graph, creating two subgraphs (which was somewhat expensive, but worked out for North America even when we did things in parallel). It's not clear to me if we should do this again here. Also, in rtd we computed spatial orders again and again for each subgraph to get the best cut. What we could do instead this time: compute n spatial orders on the full node based graph once and keep them around - never compute spatial orders again, only re-use them.
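To make the re-use idea concrete, here is a hedged sketch (a spatial order is just a permutation of node ids sorted along a direction; `MakeSpatialOrder`, `Coordinate`, and the direction parameters are illustrative assumptions, not existing OSRM APIs):

```cpp
// Sketch of "compute spatial orders once, re-use per subgraph".
// All names are illustrative assumptions, not existing OSRM APIs.
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <vector>

struct Coordinate { double lon, lat; };

using SpatialOrder = std::vector<std::uint32_t>; // node ids, sorted

SpatialOrder MakeSpatialOrder(const std::vector<Coordinate>& embedding,
                              double dir_x, double dir_y)
{
    SpatialOrder order(embedding.size());
    std::iota(order.begin(), order.end(), 0u); // 0, 1, ..., n-1
    // Sort node ids by their projection onto the direction (dir_x, dir_y).
    std::sort(order.begin(), order.end(), [&](std::uint32_t lhs, std::uint32_t rhs) {
        return dir_x * embedding[lhs].lon + dir_y * embedding[lhs].lat <
               dir_x * embedding[rhs].lon + dir_y * embedding[rhs].lat;
    });
    return order;
}
// When recursing, filter a global order down to a subgraph's node set
// (a linear scan) instead of sorting the subgraph again.
```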

At this point we should have a partition id for each edge based graph node. We still need to combine several partition levels into a partition id usable for the customization (rtd called this annotation).
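One way this combination could look, assuming levels are simply bit prefixes of the bisection id (this mirrors the "ranges 8 16 24 32" idea quoted further down; the names are made up):

```cpp
// Sketch: derive the cell id at a level by keeping only the coarsest
// `bits` bisection decisions. Illustrative, not the actual annotation.
#include <cstdint>

using BisectionID = std::uint32_t;

inline std::uint32_t CellIDAtLevel(BisectionID id, unsigned total_depth, unsigned bits)
{
    // Drop the fine-grained low bits, keep the coarse prefix.
    return id >> (total_depth - bits);
}
```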

Please fill in the blanks and discuss! cc @MoKob @TheMarex @danpat @oxidase

MoKob commented 7 years ago

To turn a node-based partition into an edge-based one, there is essentially one item to consider. I tried modelling this in the image below (don't pay too much attention to the meaning of the nodes; it's just to show how many connections there would be).

(image: node-to-edge-based)

In the top left, I've put the node based graph with a very simple partition. If we simply used that partition, we would end up with more border nodes than we want, as shown in the image on the right (everything colored red would be, in some sense, connected to the border of the partition).

What we would have to do is introduce artificial nodes between every two nodes that are border nodes in the node-based partition. We introduce an intermediary segment of length 0 that we don't snap to. Doing this, we end up with a paradigm that is very useful for implementing fast MLD queries: the in node and the out node of a partition. Every node on the border has exactly one connection leaving it and one connection entering it. If you are at the out node, you don't look back into the cell you are coming from, but only forward. The in node only looks at the cell it points into. This makes the query really efficient, since we don't have to scan over border vertices, we only have to scan over cells.
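A hedged sketch of that splitting step, under the assumption of a graph type with AddNode/AddEdge/RemoveEdge (all names are illustrative): a border edge (u, v) between two cells is cut, and a zero-length boundary arc is inserted in the middle.

```cpp
// Sketch: split a border edge (u, v) whose endpoints lie in different
// cells by inserting an artificial node. The graph interface and all
// names are illustrative assumptions.
#include <cstdint>

using NodeID = std::uint32_t;

template <typename Graph>
void SplitBorderEdge(Graph& graph, NodeID u, NodeID v, std::uint32_t weight)
{
    const NodeID middle = graph.AddNode(); // artificial, never snapped to
    graph.RemoveEdge(u, v);
    // "out" side: leaves the source cell with the original weight...
    graph.AddEdge(u, middle, weight, /*is_boundary=*/false);
    // ...followed by the zero-length boundary arc into the target cell.
    graph.AddEdge(middle, v, /*weight=*/0, /*is_boundary=*/true);
}
```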

For the interface, I'll need some time to think about how we would best integrate this into our existing pipeline. It is not yet clear to me how we intend to split the different steps so that we can re-use much of what is happening. So I imagine we should do a dedicated extract/partitioning step that creates a partitioned edge-based graph? The partition data could potentially come in useful to guide the CH as well.

To summarise:

TheMarex commented 7 years ago

What about adapting the Inertial Flow algorithm directly to the edge-based graph? Instead of sorting by node coordinates directly, we could try sorting by the centroid of each edge-based node (e.g. the centroid of its segments). Or, as already proposed, just translate the partition from the node-based graph to the edge-based graph.
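A small sketch of the centroid idea (the `Coordinate` type and the assumption that an edge-based node carries its segment geometry are illustrative): average the segment coordinates and feed the result into the same spatial sort.

```cpp
// Sketch: derive an embedding coordinate for an edge-based node as the
// centroid of the geometry it covers. Averaging lon/lat naively is fine
// here since we only need a sort key. Names are illustrative.
#include <vector>

struct Coordinate { double lon = 0, lat = 0; };

Coordinate Centroid(const std::vector<Coordinate>& geometry)
{
    Coordinate centroid; // assumes geometry is non-empty
    for (const auto& coordinate : geometry)
    {
        centroid.lon += coordinate.lon;
        centroid.lat += coordinate.lat;
    }
    centroid.lon /= geometry.size();
    centroid.lat /= geometry.size();
    return centroid;
}
```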

The main motivation here is that we want a clear separation between algorithm independent pre-processing (osrm-extract) and algorithm dependent pre-processing (osrm-contract, osrm-partition, osrm-customize), with minimal changes to the current architecture and code base. Optimizations are great and definitely needed down the road, but, speaking from personal experience, anything that requires non-trivial changes will just end up in a huge refactor. We need to be conscious of which battles we pick here, and that usually means going the road of least resistance even if it is less optimal.

That is why I propose to work directly on the .ebg file we emit in osrm-extract. Doing this will keep the edge ids stable, enabling us to use this as a drop-in replacement.

Clarify the strategy for how to combine several depths (bits in the partition id) for a single level. For North America we had a tool which tried a few annotation strategies (combining the ranges 8 16 24 32 into four levels) and returned the best.

Trying to find an automated strategy here sounds great, as we work with data sets of widely varying size and need to account for that. Maybe we can base this on a goal for a given number of vertices per level? I think keeping the number of levels constant might be easier to work with down the road.
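One way such an automated strategy could look, as a hedged sketch (the function and its parameters are made up): fix the number of levels, give each level a target cell size, and derive how many bisection depths to use per level from the graph's node count.

```cpp
// Sketch: pick the number of bisection depths per level so that the
// average cell size roughly matches a per-level goal. Illustrative only.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

std::vector<unsigned> DepthsPerLevel(std::size_t num_nodes,
                                     const std::vector<std::size_t>& target_cell_sizes)
{
    std::vector<unsigned> depths;
    for (const auto target : target_cell_sizes)
    {
        // 2^depth cells of roughly `target` nodes each should cover the graph.
        const double cells = std::max(1.0, static_cast<double>(num_nodes) / target);
        depths.push_back(static_cast<unsigned>(std::ceil(std::log2(cells))));
    }
    return depths;
}
// E.g. for a constant four levels one could pass four increasing cell
// size goals and get per-level depths regardless of data set size.
```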

Should this be already done in osrm-partition or is this something we can do in a separate tool / in osrm-customize? Should the .partition file be ready for customize (containing level ids) or be a full partition id map?

It would be great if we could make osrm-partition emit a "template" for all cell matrices with the complete partition information that osrm-customize only needs to fill in the blanks.

MoKob commented 7 years ago

I have a few reasons why I would not separate these processing toolchains:

From our prototype we had the following measurements: preprocessing with 2 threads, 90 GB memory requirement, 60,000 seconds of runtime, i.e. roughly 16.7 hours.

Why using node based:

Added benefits for CH:

Refactor Involved:

In total, this would be a battle that is actually worth picking to me.

TheMarex commented 7 years ago

I think we need to step back here a little bit and look at this from a high-level view and set some constraints:

  1. osrm-extract is and will remain algorithm independent. That does not mean it cannot output additional data that is only relevant for MLD (e.g. dump the compressed node based graph). When we run osrm-extract we don't want to know which algorithm we are going to use. This abstraction is the only reason osrm-extract exists: algorithm independent pre-processing.

  2. The algorithm dependent part will be constrained to three tools: osrm-partition, osrm-customize and osrm-contract.

This split is important to retain because our whole processing tool-chain is built around it. Ideally we can ship MLD support without anyone noticing that it is there. Breaking compatibility would set us back at least three months (thinking back to the last time we did that), as we would need to adapt all downstream users and triage release processes.

The goal of the initial implementation is: Change as little as possible to get MLD into OSRM. That is step one, anything beyond that is something we should absolutely discuss when we actually have something running and a system we can improve incrementally.

With these constraints we need to keep the general structure of the edge-based graph intact, because the data indexing we compute in osrm-extract is bound to it. Literally any partition scheme that fulfills these constraints is fine: translating from the node-based graph to the edge-based graph and working directly on the edge-based graph are both fine.

MoKob commented 7 years ago

This, to me, is just another case of you knowing what should be done, but telling no one in advance. Adding a single parameter to osrm-extract to generate a partition would not break the chain, at least in my view. But as long as you don't share your overall plan on OSRM and MLD, I have no idea on what should be done.

TheMarex commented 7 years ago

The basic steps for osrm-partition are:

  1. Load compressed node-based-graph
  2. Partition node-based-graph, obtain a mapping node id -> partition
  3. Load edge based graph .ebg
  4. Load map from node x in EBG to pair (u, v) in NBG
  5. Assign a preliminary partition to all nodes x in the EBG: if u and v are in the same partition, assign that partition.
  6. Save all x where u and v are in different partitions.

Steps 1-3 should be clear by now, so I'm going to focus on steps 4-6.

Step 4

At this point we have a partition of the node based graph, staying with the same example as @MoKob used above:

(image: img_20170125_163204)

What we need is something to translate from NBG node ids to EBG nodes, because the latter is what we eventually want to partition. In order to do that we need to write out more auxiliary information while the EBG is generated.
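Such a mapping could be as small as one NBG node pair per EBG node, written in EBG node order. A hedged sketch of such a record layout (illustrative, not the actual on-disk format):

```cpp
// Sketch: for each edge-based node x, record the node-based pair (u, v)
// it was generated from. Layout is illustrative, not the on-disk format.
#include <cstdint>
#include <vector>

using NBGNodeID = std::uint32_t;

struct EBGNodeOrigin
{
    NBGNodeID u; // tail of the originating NBG edge
    NBGNodeID v; // head of the originating NBG edge
};

// Indexed by EBG node id: mapping[x] == {u, v}
using EBGToNBGMapping = std::vector<EBGNodeOrigin>;
```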

The best location to capture this is probably here. You will need to write out the following information:

Step 5

You load the edge based graph, which is a list of edges, and construct a (dynamic) graph from them. The best example of how to do that is in the constructor of GraphContractor.

Now we iterate over all nodes in that graph:

(image: img_20170125_163745)

In the example above the small dots are the nodes in the EBG, overlaid over the NBG for orientation. The nodes in the middle colored red and green are the border vertices that we need to fix.
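A hedged sketch of the preliminary assignment from step 5 (the mapping layout and the sentinel value are assumptions): every EBG node whose NBG endpoints agree gets that partition, the rest stay unresolved for step 6.

```cpp
// Sketch of step 5: assign a partition to every EBG node whose two NBG
// endpoints agree; leave border nodes unresolved. Names are illustrative.
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

using PartitionID = std::uint32_t;
constexpr PartitionID UNRESOLVED = std::numeric_limits<PartitionID>::max();

std::vector<PartitionID>
AssignPreliminary(const std::vector<std::pair<std::uint32_t, std::uint32_t>>& ebg_to_nbg,
                  const std::vector<PartitionID>& nbg_partition)
{
    std::vector<PartitionID> ebg_partition(ebg_to_nbg.size(), UNRESOLVED);
    for (std::size_t x = 0; x < ebg_to_nbg.size(); ++x)
    {
        const auto u = ebg_to_nbg[x].first;
        const auto v = ebg_to_nbg[x].second;
        if (nbg_partition[u] == nbg_partition[v])
            ebg_partition[x] = nbg_partition[u];
    }
    return ebg_partition;
}
```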

Step 6

In the last step we need to resolve the problem arising from the border nodes we can't assign a partition to. There are at least two methods to fix them.

Method 1

Just pick one of the two partitions. This can be done randomly, or by checking how many border edges we would introduce by picking either partition (i.e. counting the number of adjacent nodes in the EBG that have a different color). This of course has all the problems that @MoKob outlined above. However, for testing this is trivial to implement and gives you a partition instantly.

(image: img_20170125_164515)
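The counting variant of this heuristic, as a sketch (the graph interface `GetAdjacentNodes` is an assumption): count how many neighbours disagree with each candidate and keep the cheaper choice.

```cpp
// Sketch of Method 1: resolve a border node by picking whichever of the
// two candidate partitions introduces fewer border edges among its EBG
// neighbours. The graph interface is an illustrative assumption.
#include <cstdint>
#include <vector>

using PartitionID = std::uint32_t;

template <typename Graph>
PartitionID ResolveBorderNode(const Graph& graph, std::uint32_t node,
                              PartitionID left, PartitionID right,
                              const std::vector<PartitionID>& partition)
{
    std::uint32_t left_cost = 0, right_cost = 0;
    for (const auto target : graph.GetAdjacentNodes(node))
    {
        left_cost += partition[target] != left;   // edge turns into a
        right_cost += partition[target] != right; // border edge if colors differ
    }
    return left_cost <= right_cost ? left : right;
}
```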

Method 2

Here we need to modify the graph in the way @MoKob suggested above. The important thing is that we are not allowed to modify the IDs of existing nodes in the EBG (because they are saved in the StaticRTree as entry points), but additive changes are fine:

(image: img_20170125_165353)

A caveat of this approach is that we need to keep a marker around that identifies "real" arcs (i.e. non-border arcs). We can mark these edges by assigning them a special data ID (this is what we save for example here in the EBGF). Unpacking can then easily remove these edges before annotation.
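A minimal sketch of such a marker, assuming a sentinel data id is acceptable (the value and names are illustrative):

```cpp
// Sketch: tag artificial boundary arcs with a sentinel data id so path
// unpacking can drop them before annotation. Values are illustrative.
#include <cstdint>
#include <limits>

using DataID = std::uint32_t;
constexpr DataID BOUNDARY_ARC_ID = std::numeric_limits<DataID>::max();

struct EdgeData { DataID data_id; /* ... */ };

inline bool IsRealArc(const EdgeData& data) { return data.data_id != BOUNDARY_ARC_ID; }
```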

TheMarex commented 7 years ago

The base version of this shipped using a simple heuristic that picks the BisectionID of u for the forward node and the BisectionID of v for the backward node.
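In code, that heuristic could look roughly like the following sketch (the forward/backward split of an NBG edge (u, v) into two edge-based nodes is the assumption here; names are illustrative):

```cpp
// Sketch of the shipped heuristic: an NBG edge (u, v) yields a forward
// and a backward edge-based node; the forward node inherits u's
// BisectionID, the backward node inherits v's. Names are illustrative.
#include <cstdint>
#include <vector>

using BisectionID = std::uint32_t;

inline BisectionID PickBisectionID(const std::vector<BisectionID>& nbg_ids,
                                   std::uint32_t u, std::uint32_t v, bool forward)
{
    return forward ? nbg_ids[u] : nbg_ids[v];
}
```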