valhalla / valhalla

Open Source Routing Engine for OpenStreetMap
https://valhalla.github.io/valhalla/
Other
4.47k stars 678 forks source link

Directions with Landmark Support #4037

Open kevinkreiser opened 1 year ago

kevinkreiser commented 1 year ago

ref: #3999

What is it and why do it?

Simply put, landmarks can be used to enhance a route and its directions with information about the routes surroundings. Landmarks could be almost anything that exists in the physical world but typically they are points of interest (POIs) such as a restaurant or gas station or perhaps a monument or statue. The why we want to do it is simillarly simple. In certain circumstances it can be advantageous to describe the route in terms of nearby POIs instead of just streets names/exit numbers/compass directions etc.

What follows are some key tenets of landmark use in directions. They are not absolutes but are merely something to keep in mind as we build this functionality.

Visibility

The best, or most useful, landmarks are those that are easily visible from the road/path. Visibility is important because landmarks are most useful in the context of giving directions. For example, someone might ask you for directions, and to make it easier on them, you could say:

go about 5 miles and turn left at Steve's gas station

It wouldn't make sense for you to say

go about 5 miles and turn left at the drainage ditch

The ditch is not the most salient thing you notice at that particular intersection. Interestingly, you also left out the road name. Do you even remember the roads name? Maybe not, but you definitely notice the giant gas station on that corner.

Verbosity

In the example above you also took care to only mention what was needed, you could have, but didn't say:

go about 5 miles, you'll have passed a barber shop, a haunted house, 2 bus stops, and a shoe factory before turning left at Steve's gas station

Dumping all the information about all the landmarks one might notice along the way would be highly confusing. Instead you want the person to be on the lookout for a single landmark near the next maneuver point. Keeping the mental burdon of following directions low, by reducing the number of things one has to remember, makes it easier to do the right thing at the decision points in the route.

Specificity

Notice we tried to be specific with the landmark, we didn't say:

go about 5 miles and turn left at a gas station

There may have been many gas stations along that 5 mile stretch. It was advantageous to be as specific about which one it was as possible. The whole reason we are using landmarks to describe the route is to add clarity, not further ambiguity. In this case we were doubly lucky though in that the name of the POI also had the type of POI embedded in it. They now know its called "Steve's" so that should be written somewhere but also they know what a gas station looks like which makes it easier for them to filter down what they are looking at by type. In other words they only need to consider looking at gas stations to see what name is written on them not every landmark they see.

How do we enable it?

Over in another issue a question was asked about low_emmission_zones, that is, marking the graph with overlay information about where it lies spatially with respect to some polygonal dataset; i.e. which parts of the graph are in and not in these zones/polygons. Indeed we do this already with administrative areas; we know what country and state a given node of the graph lies in. So we have a very specific case of parsing out some non-graph osm data and applying it back to the graph as an "enhancement."

Over in that issue I had suggested perhaps we could make this whole concept more generic, by simply parsing out A(rea)OIs and optionally enhancing/marking the graph with them. Admins processing exists as a standalone binary but we could revamp it to allow other POIs and AOIs to be parsed out and exported to fulfill new use-cases like landmark routing or potentially many other use-cases.

The alternative exists to give landmarks processing its own path through the data pipeline but we should seriously consider avoiding that in favor of making future (enhancements) work easier.

Data

Structure of the data (what info do we need about a landmark)

At the very least we are going to need a lat,lon location of each landmark, a name and a type. As far as types go the lat,lon fits in a 32bit integer to 7 digits of precision, the name will need to be variable length text stored as we store all other text things (an index to a nameinfo object) and the type can be an enum.

Where can we store it?

We could store it in nodeinfos and make use of the signs data structure to store the name information. Signs also have a type though we should check that it would completely fit our needs.

Another option would be storing it in edges via edgeinfo and making use of the tag/value pairs structure to store both type and the name.

Yet another option would be to store it in a completely novel way. What that exactly is needs some thought but we'll keep the option open.

What requirements does a specific storage option need to fulfill?

The primary thing we need to get out of this landmark data is location with respect to the graph geometry. We dont want to scan the whole tileset to see if there are any landmarks near a certain part of the route. The best would be to recover the landmarks along the path via direct references from the edges and nodes.

In the option where we tie them to a nodeinfo we can simply check the nodes in the route to see if it passes any landmarks (whether a node has a landmark sign or not), in essence we pre-correlate the landmarks to the graph in the enhancer stage. This has the benefit of being very compact compared the the edgeinfo strategy in that we dont have to duplicate it to all potential edges, since nodes are shared by all the edges that connect to them. The problem of digitization though makes it so that, eg. doubly digitized intersections may put a landmark on only one of the nodes in the intersection (the closest i guess?) and so paths that go through the intersection but not that node will not get a chance to use that particular landmark in their directions. This might be ok, maybe you dont want to talk about landmarks on the other side of a doubly digitized edge or we might detect internal edges and put the sign on all the nodes in the intersection (though then you have deduplication to worry about).

In the edgeinfo option it is similarly easy to just grab the landmark while processing the route so no trouble there. The problem here is that we duplicate information on multiple edges to be sure we can use the landmark where we want to. We also need to know where along the edge the landmark is. Some could be located in the middle of the edge, which might be useful but since no maneuver can happen there maybe it isn't useful?

In the mostly unspecified final option of some new way to store them (a new structure i guess), the primary thing we would gain there is the ability to store a more complex representation of the geometry. For selfish reasons, I'd love to see valhalla store actual polygons at some point and I'd love to actually finish the idea of an on-the-fly MVT transcoder for valhalla's tileset. This is all a separate project, but keeping our mind open on how we integration landmark info could also keep our options open for such a project in the future. Anyway such a schema would have us storing points an polygons as generic graph metadata and require us to include a connection point so that they could be associated to the graph (this would be as simple as a graph id). We would somehow also need an inverse lookup so that when we have a path we could also pull the relevant metadata, thats the hard part. I suppose we could still use node/edgeinfos to store ids of these metadata.

Surfacing the Info

Maneuver Generation

If any of the edges/nodes/etc in or near the maneuver point have landmarks tied to them we should then associate them to the maneuver itself. We'll need to know of course where the landmark lies with respect to the maneuver point. Here we will need some heuristics. Perhaps we should only include those landmarks that are on the side of the maneuver to which the maneuver is turning. E.g. If we are turning right we probably dont want to use a landmark that is on the left.

Narrative Generation

If we've tied a landmark to specific maneuvers (maybe we start with the simple ones like slight/regular/sharp left/right) then we can offset into the phrase list and pick a different phrase that has the narrative replacement for the landmark name.

Serialization

We may or may not need to do anything here. It would be cool to serialize the landmark as part of the maneuver apart from the narrative that we'll already have. This could be useful for display for example if we had the exact location or geometry of the landmark. Of course we could totally forgo this and rely solely on the narrative itself (totally ok to do that).

Implementation

kevinkreiser commented 1 year ago

@dgearhart @gknisely if you have strong feelings about this (i know i havent fully fleshed it out yet anyway), please let me know so i can include them in the project/implementation plan