It's common for a US state to contain a county or city that shares its name (possibly with minor differences, such as a "City" or "County" suffix).
For example:
Arkansas City in Arkansas
Hawaii County in Hawaii
Iowa City in Iowa
California City in California
Utah County in Utah
Nebraska City in Nebraska
Idaho City and Idaho County in Idaho (Idaho City is not in Idaho County)
Maryland City in Maryland
Minnesota City in Minnesota
New York City in New York (of course!)
Our general deduplication logic treats an admin record as a duplicate of its parent when the two share a name. This works well for places like Singapore, Berlin, and Tokyo, which all have a city (locality) that is conceptually the same as a region or country.
US states, however, are not conceptually the same as any of these cities, and they should pretty much always show up in results.
This PR adds a check that the record is not a US state before performing the general hierarchy checks. There's one exception: US states can still dedupe against other US states, so that Geonames and WOF records for the same state can be deduplicated against each other.
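
As a rough illustration of the intended behavior, here is a minimal sketch. It is not the code in this PR: the record shape (`name`, `layer`, `parent.country_a`, `parent.region`) and the helpers `normalizeName`, `isUsState`, `isParentedBySameName`, and `shouldDedupe` are all hypothetical, and the name normalization is deliberately simplistic.

```javascript
// Hypothetical record shape: { name, layer, parent: { country_a: [...], region: [...] } }

// Simplistic normalization (assumption): lowercase and drop a trailing "City"/"County".
function normalizeName(name) {
  return name.toLowerCase().replace(/\s+(city|county)$/, '');
}

// A record counts as a US state when it is a region whose country code is USA.
function isUsState(record) {
  const countryCodes = (record.parent && record.parent.country_a) || [];
  return record.layer === 'region' && countryCodes.includes('USA');
}

// Stand-in for the general hierarchy check: the child shares a name with the
// parent, and the parent's name appears in the child's hierarchy at the parent's layer.
function isParentedBySameName(child, parent) {
  const ancestors = ((child.parent && child.parent[parent.layer]) || []).map(normalizeName);
  const parentName = normalizeName(parent.name);
  return normalizeName(child.name) === parentName && ancestors.includes(parentName);
}

function shouldDedupe(a, b) {
  const aIsState = isUsState(a);
  const bIsState = isUsState(b);
  // Exception: two US state records (e.g. from different sources) may still dedupe.
  if (aIsState && bIsState) {
    return normalizeName(a.name) === normalizeName(b.name);
  }
  // A US state never dedupes against a non-state record.
  if (aIsState || bIsState) {
    return false;
  }
  return isParentedBySameName(a, b) || isParentedBySameName(b, a);
}

// Example records (made up for illustration).
const berlinCity = { name: 'Berlin', layer: 'locality', parent: { country_a: ['DEU'], region: ['Berlin'] } };
const berlinRegion = { name: 'Berlin', layer: 'region', parent: { country_a: ['DEU'] } };
const iowaCity = { name: 'Iowa City', layer: 'locality', parent: { country_a: ['USA'], region: ['Iowa'] } };
const iowaFromWof = { name: 'Iowa', layer: 'region', parent: { country_a: ['USA'] } };
const iowaFromGeonames = { name: 'Iowa', layer: 'region', parent: { country_a: ['USA'] } };

console.log(shouldDedupe(berlinCity, berlinRegion));      // true
console.log(shouldDedupe(iowaCity, iowaFromWof));         // false
console.log(shouldDedupe(iowaFromWof, iowaFromGeonames)); // true
```

Without the US state guard, the sketch's hierarchy check alone would treat Iowa City and Iowa as duplicates, which is the behavior this PR is meant to prevent.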
This PR allows us to merge https://github.com/pelias/api/pull/1371