bcgov / ols-geocoder

Physical Address Geocoder
Apache License 2.0

Add non-standard postal element abbreviations identified in rejected address analysis #173

Open mraross opened 3 years ago

mraross commented 3 years ago

The following non-street-type abbreviations were identified in the rejected address analysis:

| Abbreviation | Expansion | Example |
|---|---|---|
| GD | General Delivery | GD Quesnel, BC |
| Gerneral | General | Gerneral Delivery Quesnel, BC |
| Genral | General | Genral Delivery Quesnel, BC |
| Delivry | Delivery | General Delivry Quesnel, BC |
| Deliver | Delivery | General Deliver Quesnel, BC |
| P.O. | PO | P.O. Box 102, Quesnel, BC |

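For illustration only, here is a rough Python sketch of what defining these abbreviations as a plain token lookup might look like. The names `POSTAL_ABBREVIATIONS` and `expand_postal_abbreviations` are hypothetical and do not come from the geocoder's code or configuration; the mapping is simply the list above.

```python
# Hypothetical lookup built from the abbreviations listed above.
# This is an assumption for illustration, not the geocoder's actual configuration.
POSTAL_ABBREVIATIONS = {
    "GD": "General Delivery",
    "GERNERAL": "General",
    "GENRAL": "General",
    "DELIVRY": "Delivery",
    "DELIVER": "Delivery",
    "P.O.": "PO",
}


def expand_postal_abbreviations(address: str) -> str:
    """Expand every whitespace-delimited token that appears in the lookup."""
    expanded = []
    for token in address.split():
        # Split off a trailing comma so that "GD," still matches "GD".
        core = token.rstrip(",")
        suffix = token[len(core):]
        # Lookup is case-insensitive; unknown tokens pass through unchanged.
        replacement = POSTAL_ABBREVIATIONS.get(core.upper(), core)
        expanded.append(replacement + suffix)
    return " ".join(expanded)


if __name__ == "__main__":
    for example in ("GD Quesnel, BC", "Genral Delivery Quesnel, BC", "P.O. Box 102, Quesnel, BC"):
        print(expand_postal_abbreviations(example))
```

Note that a blind substitution like this rewrites a matching token wherever it occurs in the address, including inside a street or locality name.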
mraross commented 3 years ago

After investigation, it was decided not to define these abbreviations because they would cause incorrect matches, hurt performance, or both.

cmhodgson commented 3 years ago

This seems to relate to #170 - if garbage collection doesn't solve these, then it likely isn't enough for the correctly spelled versions, either. Also, if we want to enable spell-correction on postal elements, moving the handling to the lexer and parser would allow for that. Really, the initial postal handling was very specific, and for something like a postal code, a regex makes the most sense. However, with other changes that have been made, most or all of the postal junk logic would likely work better in the standard lexer and parser.
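To make the regex point concrete, here is a minimal Python sketch of pulling a postal code out of an address string before the rest is tokenized. The pattern is a simplified letter-digit-letter, digit-letter-digit form and does not enforce Canada Post's restrictions on which letters may appear. The function name `split_postal_code` and the postal code in the usage example are made up for illustration; this is not the geocoder's actual code.

```python
import re

# Simplified Canadian postal code shape: letter-digit-letter, an optional
# space or hyphen, then digit-letter-digit. Assumed for illustration only.
POSTAL_CODE = re.compile(r"\b([A-Za-z]\d[A-Za-z])[ -]?(\d[A-Za-z]\d)\b")


def split_postal_code(address: str):
    """Return (address without the postal code, normalized postal code or None)."""
    match = POSTAL_CODE.search(address)
    if match is None:
        return address.strip(), None
    # Normalize to upper case with a single space between the two halves.
    code = f"{match.group(1)} {match.group(2)}".upper()
    # Remove the matched span and tidy leftover separators at the ends.
    remainder = (address[:match.start()] + address[match.end():]).strip(" ,")
    return remainder, code


if __name__ == "__main__":
    # The postal code here is an invented example value.
    print(split_postal_code("P.O. Box 102, Quesnel, BC V2J 3J1"))
    print(split_postal_code("GD Quesnel, BC"))
```

A fixed-shape element like this suits a regex. Free-text elements such as "General Delivery" have no fixed shape, which is why they may fit better in the lexer and parser where spell-correction can apply.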

mraross commented 3 years ago

Agreed.