pelias / openaddresses

Pelias import pipeline for OpenAddresses.
MIT License

mapper stream to separate concatenated unit numbers #502

Closed missinglink closed 2 years ago

missinglink commented 2 years ago

note: branched from https://github.com/pelias/openaddresses/pull/500

as noted in https://github.com/pelias/openaddresses/issues/499 some sources have the unit and housenumber concatenated.

this PR solves the issue for the most common cases in AU/NZ and opens up the possibility of expanding to other countries.
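To illustrate the idea, here is a minimal sketch of such a mapping step. It is not the code from this PR. It assumes the AU/NZ convention of writing the unit before the housenumber with a slash (for example `10/244`), and the record shape (`country`, `unit`, `number`, `street`) and the `splitUnit` name are hypothetical, chosen only for this example.

```javascript
// Hypothetical sketch, not the implementation merged in this PR.
// Assumption: AU/NZ sources concatenate "<unit>/<housenumber>", e.g. "10/244".
const UNIT_SLASH_NUMBER = /^\s*(\d+[a-z]?)\s*\/\s*(\d+[a-z]?)\s*$/i;

// Countries the pattern is applied to; other countries could be added later.
const SUPPORTED_COUNTRIES = new Set(['AU', 'NZ']);

// Returns a copy of the record with `unit` and `number` separated,
// or the record unchanged when the pattern does not apply.
function splitUnit(record) {
  if (!SUPPORTED_COUNTRIES.has(record.country)) { return record; }
  if (record.unit) { return record; } // a unit is already set, leave it alone
  const match = UNIT_SLASH_NUMBER.exec(record.number || '');
  if (!match) { return record; }
  return { ...record, unit: match[1], number: match[2] };
}

const example = splitUnit({ country: 'AU', number: '10/244', street: 'Brunswick Street' });
console.log(example.unit, example.number);
```

Restricting the pattern to an explicit country list keeps the behaviour unchanged for sources where a slash in the housenumber may mean something else.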

closes https://github.com/pelias/openaddresses/issues/499

missinglink commented 2 years ago

this PR will be particularly helpful in Australia, where we're currently seeing results like this: https://pelias.github.io/compare/#/v1/search?text=10+brunswick+st%2C+fitzroy&debug=1

missinglink commented 2 years ago

This looks good to go; the results for /v1/search are markedly improved:

[Screenshot 2022-02-07 at 13.53.34]

Additionally the GNAF mapper is still working as expected:

[Screenshot 2022-02-07 at 13.54.38]

Worth noting that the results for /v1/autocomplete are still not great:

[Screenshot 2022-02-07 at 13.54.02]

This is mainly because I decided not to modify the name.default (although doing so would be trivial). The tradeoff is that modifying it would give results similar to /v1/search, but the unit number would no longer be included in the label, which would make visual deduplication more difficult.

I suggest we merge this, as it's only positive, and open a new ticket to discuss modifying the name.default for these AU/NZ records so that it doesn't include the unit designation. We can then modify the label generator to prepend the unit number to the label in a locale-aware way.
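A rough sketch of what that follow-up could look like, purely for discussion. The `buildLabel` function and the flat record shape (`country`, `unit`, `name`, `locality`) are hypothetical and do not reflect the actual Pelias label generator; the address values are illustrative only.

```javascript
// Hypothetical sketch of the proposed follow-up; these names are not Pelias APIs.
// Assumption: `name` holds the name.default value WITHOUT the unit designation.
const SLASH_UNIT_COUNTRIES = new Set(['AU', 'NZ']);

function buildLabel(record) {
  // Base label built from the unit-free name and the locality, when present.
  const label = [record.name, record.locality].filter(Boolean).join(', ');

  // Locale-aware step: only AU/NZ records get the "<unit>/" prefix here.
  if (record.unit && SLASH_UNIT_COUNTRIES.has(record.country)) {
    return `${record.unit}/${label}`;
  }
  return label;
}

console.log(buildLabel({
  country: 'AU',
  unit: '10',
  name: '244 Brunswick Street',
  locality: 'Fitzroy'
}));
```

With this split, name.default stays free of the unit designation, while the label still shows the unit so that otherwise identical results can be told apart visually.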

missinglink commented 2 years ago

acceptance tests looking :+1:, merging this.