osm-search / Nominatim

Open Source search based on OpenStreetMap data
https://nominatim.org
GNU General Public License v3.0

Include 'branch' tag as part of index #1161

Open JaLooNz opened 6 years ago

JaLooNz commented 6 years ago

It is currently difficult to search for a shop with the same name but different branch locations, as the branch location is not included as part of the search index.

The proposal is to include 'branch' and its localised variants such as 'branch:en' in the main index. This should resolve situations where searching for McDonalds together with a branch location fails to locate the shop.

lonvia commented 5 years ago

Having looked through a couple of examples of the tag, the current use seems to be mostly a repetition of the address (i.e. the city, suburb). Nominatim already has this information, so that adding 'branch' would only be of very limited use. We might add it in the same way as other 'addr:*' tags but given the high rate of misuses of the tag (a lot of mixups with 'brand'), that might do more harm than good.

JaLooNz commented 5 years ago

While it is true that branch may typically repeat the city/suburb name, there are also instances where this tag is a much more detailed description and pointer to the specific store within the city/suburb.

In terms of geographical usefulness, this tag is a good way to distinguish between multiple stores in the same vicinity, e.g. McDonalds (City X-A Store) or McDonalds (City X-B Store). Without this tag, you can only roughly find the McDonalds within the vicinity of a certain city location (e.g. City X), with no information on which store (A or B) it is.

nicolasmaia commented 5 years ago

I was about to open a similar issue when I found this. It would be a very useful thing to have.

1ec5 commented 2 years ago

> Having looked through a couple of examples of the tag, the current use seems to be mostly a repetition of the address (i.e. the city, suburb). Nominatim already has this information, so that adding 'branch' would only be of very limited use. We might add it in the same way as other 'addr:*' tags but given the high rate of misuses of the tag (a lot of mixups with 'brand'), that might do more harm than good.

A branch name is very commonly a geographical designator, but it’s much less formal than other address tags. To illustrate the difficulty, PostalAnnex Silicon Valley has branch=Silicon Valley. It’s one of several of that chain’s locations in Silicon Valley, but it’s debatable whether there should even be a Silicon Valley place node in OSM because that’s a very informal nickname for a poorly defined area. Besides a place name, branch could contain things like the street name, the owner’s name, the previous owner’s firstborn child’s name, the name of a nearby attraction (for SEO purposes), or the name of the heritage building the POI occupies.

All of this suggests to me that a fully qualified name like “PostalAnnex Silicon Valley” would be more readily indexable than an isolated branch, because it could be treated the same way as official_name. In fact, this example is tagged with official_name, but some mappers may not be tagging the fully qualified name as is, perceiving it as redundant to the branch key. A naïve approach would be to always synthesize a fully qualified name to index by concatenating brand and branch. Unfortunately, there are a lot of different ways to combine the two keys, even in English, such as “branch brand”, “brand of branch”, and “branch’s brand”. In the absence of fuzzy searching (#759), Nominatim would have to index a number of forms. It would get even more complicated in a language that features declension.
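To make the combinatorial problem concrete, here is a minimal sketch of the naïve approach for English only. The function name and the template list are hypothetical illustrations, not anything Nominatim implements, and the templates are just a handful of the forms a geocoder would have to guess at.

```python
from typing import List

# Hypothetical English-only templates; other languages would need their own,
# and languages with declension would also need inflected forms.
TEMPLATES = (
    "{brand} {branch}",
    "{branch} {brand}",
    "{brand} of {branch}",
    "{branch}'s {brand}",
)


def qualified_name_variants(brand: str, branch: str) -> List[str]:
    """Return candidate fully qualified names built from brand and branch."""
    brand, branch = brand.strip(), branch.strip()
    if not brand or not branch:
        # Without both tags there is nothing to combine.
        return []
    return [t.format(brand=brand, branch=branch) for t in TEMPLATES]


variants = qualified_name_variants("PostalAnnex", "Silicon Valley")
for name in variants:
    print(name)
```

Every extra template multiplies the number of index entries per POI, which is why a single explicitly tagged fully qualified name is easier to work with.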

In my opinion, some name-related key should be set to the fully qualified name, unstructured, so that geocoders like Nominatim wouldn’t have to guess. There’s a spirited discussion in osmlab/name-suggestion-index#5500 about whether that key should be name, but it doesn’t have to be name in order for Nominatim to index it. It’d be great if Nominatim could find a way to use an isolated branch, but it seems like a tough problem. Imagine if OSM only tagged road names in structured form, like in most U.S. databases: a geocoder could index some very clever variants automatically, but only by making language-specific assumptions that many street names would violate.

lonvia commented 2 years ago

That sums up the issues very well, @1ec5.

One of the issues with combinations is that Nominatim should somehow know that the term must always appear with another. If we started indexing branch now like any other name, then you would get the post office as a result when searching for 'silicon valley'. It would even be ranked quite high because it is a perfect name match. In reality, it is very unlikely that this was the post office you were looking for. If it were, you'd rather search for 'postalannex silicon valley' or 'silicon valley post' or something similar. So that makes the branch a kind of secondary name that should only be used when the object would also already match against the other parts of the query. So, it should match when you already state that you're looking for a post office or that you're looking for something on Homestead Road. However, it shouldn't match when you are looking for something in California.
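The rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Poi` class, its fields and the whitespace tokenisation are hypothetical and do not reflect how Nominatim actually stores or matches names.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Poi:
    # Hypothetical token sets; a real geocoder normalises far more carefully.
    primary: Set[str]                               # name, brand, category words
    secondary: Set[str]                             # words from the branch tag
    street: Set[str] = field(default_factory=set)   # street-level address words


def matches(poi: Poi, query: str) -> bool:
    """Branch tokens may only complete a match, never start one."""
    tokens = set(query.lower().split())
    # The query must first hit the POI itself or its street.
    anchor = tokens & (poi.primary | poi.street)
    if not anchor:
        return False
    # Whatever is left over may then be explained by the branch name.
    return (tokens - anchor) <= poi.secondary


postal_annex = Poi(
    primary={"postalannex", "post", "office"},
    secondary={"silicon", "valley"},
    street={"homestead", "road"},
)

print(matches(postal_annex, "postalannex silicon valley"))
print(matches(postal_annex, "silicon valley"))
```

With this rule a bare place query never reaches the POI through its branch alone, while a query that already names the chain, the category or the street can use the branch to narrow things down.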

This is not a problem particularly related to branch names. We already have it with POIs named for places. The search for 'silicon valley' currently brings up bus stops, electronics stores and the Carnegie Mellon branch in Silicon Valley. Adding branch tags to Nominatim now would make the situation worse, precisely because they would largely end up causing false matches against place searches.

So, somehow Nominatim needs to get a notion of such a "secondary" or "branch" name first before it can start indexing branch. This needs some more thinking on how to handle it on the search side but also how to distinguish primary and secondary names in OSM tagging. To that end, I certainly welcome any tagging that helps with that. Please keep the branch tag.

lonvia commented 4 months ago

There might be a really simple solution to this: make branch equivalent to an address component.