symerio / pgeocode

Postal code geocoding and distance calculation
https://pgeocode.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Support alternate download locations #8

Closed (rth closed 4 years ago)

rth commented 5 years ago

It might be useful to support alternate download locations in case the GeoNames website goes down. This would also help reproducibility (I'm not sure how often the GeoNames database is updated, or whether that is tracked somewhere).

This would require storing the data somewhere. One possibility for free hosting would be to attach it to GitHub releases.

For instance, maybe @zaro's implementation in https://github.com/zaro/pgeocode/commit/6a3c743bee8fd67ae6ec82c87e0d6cbfefa62110 could be adapted.
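
To make the idea concrete, here is a minimal sketch of a download with fallback locations. It is not the implementation from the linked commit, and it is not pgeocode's API: `DOWNLOAD_URLS`, `download_with_fallback` and `_http_get` are hypothetical names, the first URL template is an assumption about the GeoNames export layout, and the second is a placeholder rather than a real mirror.

```python
import urllib.request

# Assumed URL templates, tried in order. The second entry is a placeholder
# standing in for a copy hosted elsewhere (e.g. a release attachment).
DOWNLOAD_URLS = [
    "https://download.geonames.org/export/zip/{country}.zip",
    "https://example.com/pgeocode-data/{country}.zip",
]


def _http_get(url, timeout=10):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_with_fallback(country, urls=DOWNLOAD_URLS, fetch=_http_get):
    """Return the data from the first location that responds successfully.

    `fetch` is injectable so the fallback logic can be exercised offline.
    """
    errors = []
    for template in urls:
        url = template.format(country=country)
        try:
            return fetch(url)
        except OSError as exc:  # URLError and socket timeouts subclass OSError
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all download locations failed:\n" + "\n".join(errors))
```

Trying the locations in a fixed order keeps GeoNames as the primary source and only touches the alternate copy when the primary fails. It does not address reproducibility by itself; that would still need pinned, versioned copies of the data.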

shawn-dyjak-pp commented 5 years ago

Related to this, the default download directory does not work on AWS serverless. Downloading the zip repeatedly isn't ideal, but it is an easy fix:

import os

# On AWS Lambda only /tmp is writable, so store the data there.
if os.environ.get("AWS_EXECUTION_ENV") is not None:
    STORAGE_DIR = '/tmp/pgeocode_data'
else:
    STORAGE_DIR = os.path.join(os.path.expanduser('~'), 'pgeocode_data')

rth commented 5 years ago

@shawn-dyjak-pp I suppose you can set an environment variable on AWS Lambda, right? In that case, something like

import os

# Use PGEOCODE_DATA_DIR if it is set, otherwise fall back to ~/pgeocode_data.
STORAGE_DIR = os.environ.get(
    "PGEOCODE_DATA_DIR",
    os.path.join(os.path.expanduser('~'), 'pgeocode_data')
)

should work. It is better to avoid AWS-specific environment variable names. A pull request to add this would be very welcome.
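
For illustration, the same lookup can be wrapped in a small function, which makes the override easy to check without touching the real environment. This is only a sketch: `get_storage_dir` is a hypothetical helper name, not something pgeocode is known to provide.

```python
import os


def get_storage_dir(environ=os.environ):
    """Return the data directory, preferring PGEOCODE_DATA_DIR when set."""
    default = os.path.join(os.path.expanduser("~"), "pgeocode_data")
    return environ.get("PGEOCODE_DATA_DIR", default)


# With the variable set (e.g. in the Lambda configuration), it wins:
print(get_storage_dir({"PGEOCODE_DATA_DIR": "/tmp/pgeocode_data"}))
# Without it, the home-directory default is used:
print(get_storage_dir({}))
```

One thing to keep in mind with a module-level constant such as `STORAGE_DIR`: it is evaluated once at import time, so the variable has to be set before the module is imported.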

shawn-dyjak-pp commented 5 years ago

You can. I tested it and confirmed that it works as expected. I'll get the PR in shortly.

@rth How do you prefer a PR? I don't have permissions to create a new branch.

rth commented 5 years ago

> How do you prefer a PR? I don't have permissions to create a new branch.

@shawn-dyjak-pp The best way is to fork this repo, create a new branch there, and then create a PR from it: https://help.github.com/en/articles/creating-a-pull-request-from-a-fork