symerio / pgeocode

Postal code geocoding and distance calculation
https://pgeocode.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Add a fallback data source when https://download.geonames.org/export/zip/{country}.zip fails #41

Closed (purusmahe closed 3 years ago)

purusmahe commented 4 years ago

With issues #40 and #34, we see that the geonames data source we rely on seems to have intermittent availability issues. To protect against such failures, can we think of adding a fallback data source or a cache? Both issues were short-lived and resolved themselves.

In issue #40, @richunger pointed out that the data was available at http://download.geonames.org/export/dump/ when we had the issue. Can we rely on that as a backup data source?

rth commented 4 years ago

Thanks for opening this issue. The problem with http://download.geonames.org/export/dump is that, last time I checked, it didn't actually have files in exactly the same format as https://download.geonames.org/export/zip/, so we can't use it as a cache.

I think a better solution would be to store these files in a separate GitHub repo, and then serve them via GitHub Pages as an alternate location. This would have the added advantage,

I actually started https://github.com/symerio/postal-codes-data the last time this happened, but I think it may have files from the wrong URL. Also, GB_full is unfortunately too large to be added to GitHub.

So what we would need is,

  1. a GitHub Action on https://github.com/symerio/postal-codes-data to periodically download data from geonames.org and commit it to that repo
  2. a mechanism in pgeocode to specify alternate URL locations.
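
To make step 1 a bit more concrete, here is a rough sketch of the extraction part that a periodic job could run after downloading an archive. It assumes the `{country}.zip` archive contains a member named `<country>.txt`, and the function names and output directory are hypothetical, not something that exists in either repo.

```python
import io
import zipfile
from pathlib import Path


def extract_country_table(zip_bytes: bytes, country: str) -> bytes:
    """Return the contents of `<country>.txt` from a downloaded archive.

    Assumption: the archive holds a member named exactly `<country>.txt`.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.read(f"{country}.txt")


def store_country_table(zip_bytes: bytes, country: str, out_dir: Path) -> Path:
    """Write the extracted table to `out_dir/<country>.txt` and return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{country}.txt"
    target.write_bytes(extract_country_table(zip_bytes, country))
    return target
```

A scheduled workflow would then only need to download each archive, call `store_country_table` with a directory such as `data/geonames` (hypothetical layout), and commit the result.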

Would anyone be interested in looking into it?

dshinzie commented 4 years ago

+1 on this. I am continually getting 404s for this package. Is this issue being looked at?

rth commented 3 years ago

Some of the points from my above message were implemented in https://github.com/symerio/postal-codes-data. The fallback URL should then be,

https://symerio.github.io/postal-codes-data/data/geonames/<country-code>.txt
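
For anyone who wants to try this before it lands in a release, a minimal sketch of the "try the primary URL, then the fallback" idea could look like the following. The names `DOWNLOAD_URLS`, `fetch_with_fallback` and `_http_get` are made up for illustration and are not presented as pgeocode's API; the two URL templates are the ones mentioned in this thread.

```python
import urllib.request

# Illustrative list of locations, tried in order. The second entry is the
# fallback URL from this thread, with <country-code> written as {country}.
DOWNLOAD_URLS = [
    "https://download.geonames.org/export/zip/{country}.zip",
    "https://symerio.github.io/postal-codes-data/data/geonames/{country}.txt",
]


def _http_get(url: str) -> bytes:
    """Download a URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def fetch_with_fallback(country, url_templates=DOWNLOAD_URLS, get=_http_get):
    """Return (url, payload) from the first location that can be downloaded."""
    errors = []
    for template in url_templates:
        url = template.format(country=country)
        try:
            return url, get(url)
        except OSError as exc:
            # urllib's URLError and HTTPError (e.g. a 404) subclass OSError,
            # as do socket timeouts, so all of them move on to the next URL.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("All download locations failed:\n" + "\n".join(errors))
```

Note that the primary location serves a `.zip` while the fallback serves a plain `.txt`, so the caller still has to check the returned URL and unzip the payload only in the first case.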