pelias / openstreetmap

Import pipeline for OSM into Pelias
MIT License

tag_mapper: contract english diagonals for street addresses #560

Closed: missinglink closed this 3 years ago

missinglink commented 3 years ago

Some addresses are failing to deduplicate because of minor differences in the expansion/contraction of diagonals:

[Image: Screenshot 2021-08-20 at 12 45 00]

This PR is very basic: it simply replaces the expanded form of these diagonals (English only) with the contracted form wherever one is found in a street field.
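
To illustrate the idea, here is a minimal sketch of this kind of replacement. It is not the code from this PR: the `DIAGONALS` mapping and the `contractDiagonals` helper are hypothetical names, and the list of diagonals (northeast, northwest, southeast, southwest) is an assumption about what is being contracted.

```javascript
// Hypothetical mapping of expanded English diagonals to their contracted forms.
// Assumption: the real list used by the PR may differ.
const DIAGONALS = {
  northeast: 'NE',
  northwest: 'NW',
  southeast: 'SE',
  southwest: 'SW'
};

// Match any expanded diagonal as a whole word, case-insensitively.
const PATTERN = new RegExp(`\\b(${Object.keys(DIAGONALS).join('|')})\\b`, 'gi');

// Hypothetical helper: contract diagonals found in a street name.
// Non-string input is returned unchanged.
function contractDiagonals(street) {
  if (typeof street !== 'string') { return street; }
  return street.replace(PATTERN, (match) => DIAGONALS[match.toLowerCase()]);
}

console.log(contractDiagonals('Main Street Northwest')); // Main Street NW
console.log(contractDiagonals('Southeast 5th Avenue'));  // SE 5th Avenue
console.log(contractDiagonals('Northeastern Road'));     // Northeastern Road
```

The word-boundary match leaves longer words such as "Northeastern" untouched. Because the mapping is English only, a sketch like this would not help with other languages.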

We have a more advanced strategy for openaddresses in https://github.com/pelias/openaddresses/pull/477, which I'd like to port over to this repo at some point. One difficulty with that approach is detecting the language appropriately.

This PR is a very simple fix in the interim.

The pelias/schema codebase has synonym mappings which will allow matching to work in both forms.