Open bstratto opened 3 years ago
According to Canada Post, a STN element requires a station name, so the geocoder eats the following word as the name of the station. Because this is handled by a regular expression before lexing and parsing, there is no way to know that the word being eaten is a locality. I suggest we find out how often a station name actually follows STN. If we can figure out what it typically looks like (perhaps just a single letter, as in "STN A"), or if it doesn't seem to happen at all, then we can change the regex to not eat the next word. It might be better to handle some of the postal garbage in the parser as a specific kind of garbage that can happen anywhere, but that would have to be part of a future plan.
A postal station name is often a single letter such as A, but may be a word such as MAIN. Here are some examples:
PO BOX 9404 STN PROV GOVT Victoria BC
PO BOX 1000 STN MAIN, Comox BC
PO BOX 48810, STN BENTALL, Vancouver BC
PO BOX 2083 STN TERMINAL Vancouver BC
Po Box 17000 STN FORCES, Victoria, BC
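To make the "don't eat the next word" idea concrete, here is a minimal sketch in Python (not the geocoder's actual code). It only consumes the word after STN when that word is a single letter or appears on a small allow-list. The names `STATION_NAMES`, `STN_RE` and `strip_station` are hypothetical, and the allow-list is just the station names from the examples above; a real list would have to come from Canada Post data.

```python
import re

# Hypothetical allow-list taken from the examples in this thread (assumption).
STATION_NAMES = ["PROV GOVT", "MAIN", "BENTALL", "TERMINAL", "FORCES"]

# Longest names first so multi-word names win over shorter alternatives.
_names = "|".join(
    re.escape(n) for n in sorted(STATION_NAMES, key=len, reverse=True)
)

# "STN", optionally followed by a known station name or a single letter.
# The trailing \b stops a single letter from matching the start of a longer
# word, so "STN ABBOTSFORD" only matches "STN".
STN_RE = re.compile(rf"\bSTN\b(?:\s+(?:{_names}|[A-Z])\b)?", re.IGNORECASE)


def strip_station(address: str) -> str:
    """Remove STN and a recognised station name; keep any other next word."""
    cleaned = STN_RE.sub(" ", address)
    cleaned = re.sub(r"\s+,", ",", cleaned)  # tidy a space left before a comma
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(strip_station("PO BOX 1000 STN MAIN, Comox BC"))
print(strip_station("PO BOX 5 STN ABBOTSFORD BC"))
```

With this approach an unrecognised word after STN, such as a locality, is left for the parser. The trade-off is that a station name missing from the allow-list would also be left in place, so the list would need to be reasonably complete.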
Now that we have garbage pickup, maybe we don't need to identify postal elements any more. Maybe postal code and c/o would be the only exceptions.
Will be fixed by issue #174
In Geocoder 4.1, addresses with the abbreviation STN (sometimes used to mean postal station) appear to receive a penalty of LOCALITY.missing even though the locality is present.
Examples:

Address: 33224 FARRANT CRES GD STN, ABBOTSFORD, BC
4.1: 33224 Farrant Cres, Abbotsford, BC
Score: 89, Precision: 100
Faults:
"/PJ" POSTAL_ADDRESS_ELEMENT.notAllowed 1
"" LOCALITY.missing 10

Address: 2083 CLEARBROOK RD STN, ABBOTSFORD, BC
4.1: 2083 Clearbrook Rd, Abbotsford, BC
Score: 89, Precision: 99
Faults:
"/PJ" POSTAL_ADDRESS_ELEMENT.notAllowed 1
"" LOCALITY.missing 10

Address: 3191 GOLDFINCH ST STN, ABBOTSFORD, BC
4.1: 3191 Goldfinch St, Abbotsford, BC
Score: 89, Precision: 100
Faults:
"/PJ" POSTAL_ADDRESS_ELEMENT.notAllowed 1
"" LOCALITY.missing 10