Closed BryanChuinkam closed 5 years ago
@BryanChuinkam thanks for the report.
If one looks at the GeoNames postal code data for Canada (see CA.zip), it contains only 3-character postal codes. However, in the pgeocode implementation https://github.com/symerio/pgeocode/blob/5ae2fcfc8a2a796a9854e01aba512d5f7b60a78f/pgeocode.py#L125
the postal code for Canada needs to be provided in the form "XXX YYY", and the first part is discarded. I don't remember why this was added, but I imagine it was meant to match the Wikipedia definition https://en.wikipedia.org/wiki/Postal_codes_in_Canada#Components_of_a_postal_code .
In your experience, are Canadian postal codes mostly given as 3 letters/digits, then? We could change the way this is handled.
@rth I think the line linked above needs to change so that only the first part of the postal code is considered.
https://github.com/symerio/pgeocode/blob/5ae2fcfc8a2a796a9854e01aba512d5f7b60a78f/pgeocode.py#L125
For example, M4B 1B3 is a postal code in Toronto, Ontario. 1B3 is not a valid code but M4B is.
You are right. We should change
codes['postal_code'] = codes.postal_code.str.split().str.get(1)
to
codes['postal_code'] = codes.postal_code.str.split().str.get(0)
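To make the difference concrete, here is a minimal plain-Python sketch of what the two variants select for the example code "M4B 1B3". The helper pick_token is hypothetical and is not part of pgeocode; it only imitates the split-then-get step on a single string, returning None where pandas would yield a missing value.

```python
from typing import Optional


def pick_token(postal_code: str, index: int) -> Optional[str]:
    """Return the whitespace-separated token at `index`, or None if absent.

    Hypothetical helper: imitates `.str.split().str.get(index)` for one string.
    """
    tokens = postal_code.split()
    return tokens[index] if index < len(tokens) else None


code = "M4B 1B3"
old_key = pick_token(code, 1)  # mirrors .str.get(1): keeps the second part
new_key = pick_token(code, 0)  # mirrors .str.get(0): keeps the first part
print(old_key, new_key)  # prints: 1B3 M4B
```

With index 0, a code entered as just "M4B" would also still resolve, whereas with index 1 there is no second token to pick.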
Thanks @musicpiano ! Would you like to make a pull request?
Regarding the query_postal_code function: as seen in the code below, the actual postal code I'm searching for is 'K2C', but in order to search for it I had to insert '5CA' in front of it. My understanding is that the 5 represents the accuracy and CA is the country code. Why do these first three characters need to be added? Am I right about what they represent?
import pgeocode
nomi = pgeocode.Nominatim('CA')
nomi.query_postal_code("5CA K2C")
thanks