pelias / api

HTTP API for Pelias Geocoder
http://pelias.io
MIT License

Do not deduplicate US States #1614

Closed orangejulius closed 2 years ago

orangejulius commented 2 years ago

It's common for a US state to contain a county or city that shares its name, possibly with minor differences.

For example:

Our general deduplication logic treats an admin record and its parent as the same place when the two share a name. This works well for places like Singapore, Berlin, and Tokyo, each of which has a city (locality) that is conceptually the same as a region or country.
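To make the rule above concrete, here is a minimal sketch of a "shares a name with its parent" check. The record shape (`name`, `layer`, `parent`) and the helper names `normalize` and `isSameAsParent` are hypothetical simplifications for illustration; they are not the actual Pelias document schema or deduplication code.

```javascript
// Hypothetical, simplified record shape: { name, layer, parent: { <layer>: <name> } }.
// This is an illustration only, not the real Pelias schema or implementation.

// Normalize names so differences in case or surrounding whitespace do not matter.
function normalize(name) {
  return name.trim().toLowerCase();
}

// True when `child` lists a parent at `candidateParent`'s layer, that parent's
// name matches `candidateParent`, and `child` itself carries the same name.
function isSameAsParent(child, candidateParent) {
  const parentName = (child.parent || {})[candidateParent.layer];
  if (parentName === undefined) {
    return false;
  }
  const target = normalize(candidateParent.name);
  return normalize(parentName) === target && normalize(child.name) === target;
}

const singaporeCountry = { name: 'Singapore', layer: 'country', parent: {} };
const singaporeLocality = {
  name: 'Singapore',
  layer: 'locality',
  parent: { country: 'Singapore' }
};
const berlinLocality = {
  name: 'Berlin',
  layer: 'locality',
  parent: { region: 'Berlin', country: 'Germany' }
};

console.log(isSameAsParent(singaporeLocality, singaporeCountry)); // true
console.log(isSameAsParent(berlinLocality, singaporeCountry));    // false
```

Under this rule the Singapore locality collapses into the Singapore country record, which is the desired behaviour for city-states.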

US states, however, are not conceptually the same as any city or county that shares their name, and they should almost always show up in results.

This PR adds a check that the record is not a US state before performing the general hierarchy checks. There's one exception: US states can still dedupe against other US states, so that Geonames and Who's On First (WOF) records for the same state can deduplicate against each other.
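The order of checks described above might look roughly like the following sketch. Everything here is an assumption made for illustration: the record shape (`name`, `layer`, `countryCode`, `source`, `parent`), the helpers `isUsState` and `sharesNameWithParent`, and the New York sample data do not come from the PR, and the real implementation may differ.

```javascript
// Hypothetical record shape: { name, layer, countryCode, source, parent: { <layer>: <name> } }.
// Illustration only; field and function names are assumptions, not the Pelias API.

// Assumption: a US state is a region-layer record whose country code is 'USA'.
function isUsState(record) {
  return record.layer === 'region' && record.countryCode === 'USA';
}

// Stand-in for the general hierarchy check: the child names a parent at the
// parent's layer, and child, listed parent, and parent record share one name.
function sharesNameWithParent(child, parent) {
  const parentName = (child.parent || {})[parent.layer];
  if (parentName === undefined) {
    return false;
  }
  const target = parent.name.toLowerCase();
  return parentName.toLowerCase() === target && child.name.toLowerCase() === target;
}

function isDuplicate(a, b) {
  const aIsState = isUsState(a);
  const bIsState = isUsState(b);

  // Exception: two US state records (for example from different sources)
  // may still deduplicate against each other when their names match.
  if (aIsState && bIsState) {
    return a.name.toLowerCase() === b.name.toLowerCase();
  }

  // A US state never deduplicates against a non-state record.
  if (aIsState || bIsState) {
    return false;
  }

  // Otherwise fall through to the general hierarchy check, in either direction.
  return sharesNameWithParent(a, b) || sharesNameWithParent(b, a);
}

const nyStateWof = {
  name: 'New York', layer: 'region', countryCode: 'USA', source: 'whosonfirst', parent: {}
};
const nyStateGeonames = {
  name: 'New York', layer: 'region', countryCode: 'USA', source: 'geonames', parent: {}
};
const nyCity = {
  name: 'New York', layer: 'locality', countryCode: 'USA', source: 'whosonfirst',
  parent: { region: 'New York' }
};

console.log(isDuplicate(nyStateWof, nyStateGeonames)); // true
console.log(isDuplicate(nyCity, nyStateWof));          // false
```

Placing the state check first means the hierarchy logic never sees a state paired with a non-state record, so the existing behaviour for places like Singapore is left unchanged.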

This PR allows us to merge https://github.com/pelias/api/pull/1371