symerio / pgeocode

Postal code geocoding and distance calculation
https://pgeocode.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
231 stars 57 forks

City and State Lookup #31

Closed djskripta closed 1 year ago

djskripta commented 4 years ago

Partially related to #20. This is by no means a complete implementation, and it lacks docs and unit tests. I figured I would post this as a starting point since I've already coded these features for my own project.

rth commented 4 years ago

Thanks @djskripta! We would indeed need to add associated tests to test_pgeocode.py.

One concern I have is that creating additional classes for each query will make the API a bit less readable. Can we not add this directly as Nominatim.query_city and Nominatim.query_state methods? Or is the issue that we need to pre-compute the corresponding aggregations?

We could rename _index_postal_codes to _create_index(self, kind='postal_code') and make it work for states and cities. That would allow us to re-use code as well. Having to create all these index files when initializing Nominatim might not be too much of an issue, I think.
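To make the idea concrete, here is a minimal sketch of what a generalized index builder might look like. It is not pgeocode's actual code: the standalone create_index function, the KEY_COLUMNS mapping, and the aggregation rules (mean coordinates, first value for everything else) are all assumptions made for illustration. Only the column names follow the ones visible in this thread, and the coordinates are toy values.

```python
import pandas as pd

# Hypothetical mapping from the `kind` argument to the column used as the key.
KEY_COLUMNS = {
    "postal_code": "postal_code",
    "city": "place_name",
    "state": "state_name",
}


def create_index(df: pd.DataFrame, kind: str = "postal_code") -> pd.DataFrame:
    """Collapse all rows sharing the same key into one row per key."""
    if kind not in KEY_COLUMNS:
        raise ValueError(f"unknown kind: {kind!r}")
    key = KEY_COLUMNS[kind]
    # Average the coordinates; keep the first value of every other column.
    agg = {
        col: ("mean" if col in ("latitude", "longitude") else "first")
        for col in df.columns
        if col != key
    }
    return df.groupby(key, as_index=False).agg(agg)


# Toy data: the coordinates are rounded placeholders, not real locations.
data = pd.DataFrame(
    {
        "postal_code": ["75001", "75002", "69001"],
        "place_name": ["Paris", "Paris", "Lyon"],
        "state_name": ["Île-de-France", "Île-de-France", "Auvergne-Rhône-Alpes"],
        "latitude": [48.0, 49.0, 45.0],
        "longitude": [2.0, 3.0, 4.0],
    }
)

by_city = create_index(data, kind="city")
by_state = create_index(data, kind="state")
```

With this shape, the same function serves all three kinds, at the cost of losing the individual postal codes when several rows share a city or state.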

rth commented 4 years ago

Actually, I'm not sure we really need to create indexes for states and cities. That was done because postal codes were expected to be unique. For cities and particularly states, there would be multiple postal codes per query. So maybe just something along the lines of

mask = df['county_code'].str.contains(query)
return df[mask]

(with some additional string & query normalization) would be enough? It shouldn't be too bad performance-wise.
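A slightly fuller sketch of that mask-based lookup, with one possible normalization step, could look like the following. The helper names normalize and query_by_name are hypothetical, and the normalization rules (lowercasing, trimming, dropping accents) are an assumption about what "string & query normalization" might involve, not what pgeocode does.

```python
import unicodedata

import pandas as pd


def normalize(text: str) -> str:
    """Lowercase, trim whitespace, and drop accents (illustrative rule set)."""
    decomposed = unicodedata.normalize("NFKD", text.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def query_by_name(df: pd.DataFrame, query: str, col: str = "state_name") -> pd.DataFrame:
    """Return every row whose `col` contains the normalized query."""
    # Missing names become empty strings so they never match a real query.
    haystack = df[col].fillna("").map(normalize)
    # regex=False treats the query as a literal substring, not a pattern.
    mask = haystack.str.contains(normalize(query), regex=False)
    return df[mask]


# Toy data using the column names seen in this thread.
places = pd.DataFrame(
    {
        "postal_code": ["75001", "69001", "13001"],
        "place_name": ["Paris", "Lyon", "Marseille"],
        "state_name": ["Île-de-France", "Auvergne-Rhône-Alpes", None],
    }
)

matches = query_by_name(places, "ile-de-france")
```

Because the column is normalized on every call, this is a linear scan per query; caching the normalized column would be the obvious next step if that ever became a bottleneck.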

azmeuk commented 3 years ago

@djskripta Do you intend to address these suggestions? Can I help with anything?

rth commented 3 years ago

@azmeuk Feel free to continue this work in a separate PR. Thanks!

rth commented 1 year ago

Thanks for the work on this PR! This is now implemented with https://github.com/symerio/pgeocode/pull/59:

>>> import pgeocode
>>> nomi = pgeocode.Nominatim('fr')
>>> nomi.query_location('Paris', col='place_name')
  country_code postal_code place_name  ...  latitude  longitude  accuracy
0           FR       75000      Paris  ...

[100 rows x 12 columns]

>>> nomi.query_location('Île-de-France', col='state_name')
...

Closing as resolved.