oturns / geosnap

The Geospatial Neighborhood Analysis Package
https://oturns.github.io/geosnap-guide
BSD 3-Clause "New" or "Revised" License

consider pygris #351

Closed. knaaptime closed 1 year ago

knaaptime commented 1 year ago

Once @walkerke's pygris hits conda-forge, we should consider whether it's useful to outsource the get_lodes function and maybe some of the other census stuff too (cbsas and the like).

The motivation here is that pygris has a caching layer that's essentially the same as geosnap's dataset class, except it caches shapefiles instead of parquet and falls back to the census server when local copies aren't available (instead of our S3 server). I don't want to lose the performance of parquet, but swapping over to pygris would mean we don't have to maintain geoms in the S3 bucket and could instead just keep stuff like the rebuilt demographic profiles that are way too big to stream from the census.
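
For anyone skimming, the "local cache with a remote fallback" pattern described above can be sketched roughly as follows. This is only an illustration using the standard library, not the actual code in pygris or geosnap: `load_with_fallback`, `fake_fetch`, and the file name `tracts.parquet` are all hypothetical, and the remote fetcher stands in for whichever server (census or S3) would be used.

```python
import tempfile
from pathlib import Path
from typing import Callable


def load_with_fallback(
    name: str, cache_dir: Path, fetch_remote: Callable[[str], bytes]
) -> bytes:
    """Return the cached copy of `name` if present; otherwise fetch and cache it.

    Hypothetical helper: `fetch_remote` is a stand-in for a download from
    the census server or an S3 bucket.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    local = cache_dir / name
    if local.exists():
        # Cache hit: no network access needed.
        return local.read_bytes()
    # Cache miss: fall back to the remote source, then store the result locally.
    data = fetch_remote(name)
    local.write_bytes(data)
    return data


# Record each remote call so we can see when the fallback is actually used.
calls = []


def fake_fetch(name: str) -> bytes:
    """Pretend remote source (assumption: real code would download here)."""
    calls.append(name)
    return b"tract geometries"


with tempfile.TemporaryDirectory() as tmp:
    first = load_with_fallback("tracts.parquet", Path(tmp), fake_fetch)
    second = load_with_fallback("tracts.parquet", Path(tmp), fake_fetch)
```

The second call is served from the local file, so the remote source is only hit once. Which file format gets cached and which server is the fallback are the two pieces that differ between the approaches discussed here.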

jGaboardi commented 1 year ago

I have already used pygris for some work stuff and have had a nice experience.

walkerke commented 1 year ago

Happy to help with any integration, and thanks for having pygris in mind! (Glad it's working well for you too @jGaboardi!).

I'll get moving on publishing to conda-forge. Regarding get_lodes() - take a look and see if it's robust enough for what you need (I wrote the functions in the data module mostly for my own convenience).

knaaptime commented 1 year ago

😆 with you there... the DataStore class here exists mostly for my own research

knaaptime commented 1 year ago

I'd love to eventually rework the internals of DataStore, but I think processing the demographic profile gdbs from the census, storing them as parquet, and serving them from S3 is still going to be the best way forward for now. Closing unless we need to revisit.