This is a new issue to document an old problem: Geonames records often have prefixes on the names of cities, leading to duplicate results.
Examples
https://pelias.github.io/compare/#/v1/autocomplete?text=philadelphia
https://pelias.github.io/compare/#/v1/autocomplete?text=new%20york
Solutions
We've already opened two PRs that could solve this problem:
https://github.com/pelias/geonames/pull/372 solves the problem at index time by modifying the names of Geonames records to remove 'City of' and 'Town of' style prefixes
https://github.com/pelias/api/pull/1371, on the other hand, improves API deduplication to handle these cases at query time
Both approaches have tradeoffs. As @missinglink mentioned in https://github.com/pelias/geonames/pull/372#issuecomment-538939148, a good policy is to avoid making major modifications to data from our upstream datasets where possible. This means query-time deduplication is the preferred solution.
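
To make the query-time idea concrete, here is a minimal sketch in Python. It is not the code from either PR. The prefix list, the `comparison_key` and `dedupe` helpers, the record fields, and the sample records are all assumptions made for illustration. The prefix is stripped only to build a comparison key, so the stored names are left exactly as the upstream dataset provides them.

```python
import re

# Hypothetical prefix list for illustration; the actual PRs may cover a different set.
PREFIX_PATTERN = re.compile(
    r"^(city|town|township|village|borough)\s+of\s+", re.IGNORECASE
)


def comparison_key(name: str) -> str:
    """Return a normalized form of a name, used only for comparing results."""
    return PREFIX_PATTERN.sub("", name.strip()).casefold()


def dedupe(results: list[dict]) -> list[dict]:
    """Keep the first result for each (comparison key, layer) pair, preserving order."""
    seen = set()
    unique = []
    for result in results:
        key = (comparison_key(result["name"]), result["layer"])
        if key not in seen:
            seen.add(key)
            unique.append(result)
    return unique


# Made-up sample records with simplified fields, not real API output.
results = [
    {"name": "Philadelphia", "layer": "locality", "source": "whosonfirst"},
    {"name": "City of Philadelphia", "layer": "locality", "source": "geonames"},
    {"name": "Philadelphia", "layer": "county", "source": "whosonfirst"},
]

deduped = dedupe(results)
for result in deduped:
    print(result["name"], result["layer"], result["source"])
```

In this sketch the prefixed Geonames record is dropped as a duplicate of the locality that ranks above it, while the county record survives because it is in a different layer. The original `results` list is never modified, which reflects the policy of leaving upstream data untouched.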