Failing to Route due to SCC component detection

MoKob commented 7 years ago

In this route query, we don't find a route.

The background here is that we don't detect the road of the target as a small component, but we cannot find a valid route. Our algorithm to find small components and our routing engine seem to be out of sync in that regard.

I've reproduced the issue in a cucumber case (https://github.com/Project-OSRM/osrm-backend/pull/3688).

The turn restrictions forbid entering the road without turning around.

We should probably detect it as unreachable in this case, since the only other way to reach it would be to turn around on a segment.

Other ways to fix it would be to allow u turns on roads, which is kind of opposite to what I feel we should do.

MoKob commented 7 years ago

Turns out, this problem goes deeper than I thought: As already stated in the PR:

In the edge based graph edges can be of different components, even though they are the same road.

e.g. a-b with u-turns disabled on both ends will have a component for a-b and a component for b-a. So either we allow turning around on all segments, or we would require a component ID for all forward/backward edges and adjust snapping accordingly.

I see two options here:

add a component ID for forward and one for the backward direction
- breaks data format
- increases memory consumption
- correct routing/snapping
allow u-turns on all road segments
- lots of impossible u-turns
- could be kept in check by making u-turns to be silly expensive (now that we have #77)

Opinions ? /cc @danpat @daniel-j-h @oxidase @TheMarex

danpat commented 7 years ago

Oh boy - a rock and a hard place.

If we take option (1), does it un-block anything else that's useful (i.e. can we use the component ID for anything else nice)?

If we use (2), can we make them INVALID_EDGE_WEIGHT expensive and add a condition to the search to not explore these? Does this approach come with a memory cost?

MoKob commented 7 years ago

@danpat well, in my view the only thing we would get from option (1) is to be actually correct in what we do. Right now the snapping is depending on the small component IDs. As long as we don't rework this, we depend on them to be correct.

Since I don't know any good method yet to do snapping without the small components in a consistent way, we should ensure our algorithm to operate correctly. Its not a major issue, since it depends on small components and some connectivity around them, but we have a lot of these components and it is important enough that it has come up multiple times already.

Separating forward and backward edges of the search is, in my view, the only actually correct method to do so. Currently we store edges combined for both directions in a single data entry instead of storing two separate edges. This combination does not work correctly, since some of the information is different in both directions, leading to different flags/other entries. Some of them are already duplicated, some are not.

One example for the non duplicated entry is the name, another one is the component ID. For a clean solution, I'd like to keep two separate edges around. While we could limit the amount of additional memory by not repeating many of the flags on any edge but actually give them a component ID, we could end up seeing some complications.

The main issue I see is that anything would have to handle two edges which is currently handling only one. However, this one edge technically comes with a forward and a backward segment, so we already handle two. As long as we don't want to split edges, we can only add the additional data to the edges, requiring and int more space per edge.

The second option is allow u-turns on edges which don't share component IDs. E.g. if we find an edge that belongs to two different components (one for forward, one for backward), we remember this fact, merge both components into one and allow a u-turn on these edges. This would be far more involved algorithmically to make sure it works correctly, it would end up adding a minimum number of u-turns, though. Given #77, we can make them so silly expensive that they are only taken if they are the only possible choice. Drawbacks I see here is that we could end up making error tracing more difficult, since it might be hard to see where the u-turn is coming from.

oxidase commented 7 years ago

The issue is similar to #3625, at the moment there is no way to place two name ids into one edge-based node.

danpat commented 7 years ago

@MoKob Do you want to see if you can estimate the additional memory we'd need if we separated the forward/backwards edges?

It seems pretty clear that @DennisOSRM implemented things this way to save memory - names in particular are duplicated in 99.9999% of cases, and the forward/backward flag is only 1 bit.

Could we get away with another bitflag indicating whether both directions share a component ID, rather than storing the component ID itself? We might be able to find space for that in the existing structure, if it'd solve the problem.

MoKob commented 7 years ago

This morning I thought about this problem a bit more and we could do a somewhat dirty workaround to at least not fail on these requests.

Simply using one of the two component IDs, we find actual results. The snapping isn't optimal in all cases (since we would actually reach the second point), but it would at least not fail with NoRoute.

MoKob commented 7 years ago

Would love to get some feedback here.

The above method changes random NoRoute to random Snapping for the segments that are part of two components. The snapping for the destination in the above scenario changes depending on the extract run and whatever ends up representing the segment as forward segment, since we keep the forward geometry.

Positive:

less NoRoute without visible reasons
easier insight into where problems come from, since snapping ends at the road entering the component

Negative:

random snapping on extraction level (could be solved somewhat via tie-breaking: e.g. always keep the ID from the node with lexicographical smaller coordinate)
some segments become unreachable, even though they aren't technically

I'd still go for the option here, since it does not require major changes in the graph and will only fail on locations with actual data errors.

MoKob commented 7 years ago

http://map.project-osrm.org/?z=17&center=51.515049%2C-0.144249&loc=51.515186%2C-0.142307&loc=51.514692%2C-0.146116&hl=en&alt=0 might be just another case. We cannot enter the component, but can exit it. This leads to some strange routing results around the linked region.

Project-OSRM / osrm-backend

Failing to Route due to SCC component detection #3689