intake / intake_geopandas

An intake plugin for loading datasets with geopandas
BSD 2-Clause "Simplified" License

Won't work with non-public URLs #13

Open Casyfill opened 4 years ago

Casyfill commented 4 years ago

It seems that, in contrast to intake itself, this plugin does not allow pulling data from non-public s3 buckets (for example), given that it uses a plain geopandas.read_file, right?

As a solution, it would probably be nice to use the fs object provided by intake, right? Happy to open a PR.

ian-r-rose commented 4 years ago

I regularly use this to read from private s3 buckets, though it presumes that you have your AWS credentials set up in your environment (either via env variables or a ~/.aws directory).

It does indeed use geopandas.read_file, which in turn uses fiona, which in turn uses ogr, which supports reading from s3. Since this is fundamentally different from the fs approach, I think there might be a challenge making it work for an arbitrary fsspec implementation, though I'd be happy to be proven wrong.

Are you proposing an alternate route besides using geopandas.read_file?

Casyfill commented 4 years ago

Indeed, I just went down this rabbit hole all the way to ogr. The problem for me is that we use role assumptions, so static credentials won't work for me.

I will try exposing credentials before pulling, thanks!
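For anyone following along, "exposing credentials before pulling" could look roughly like the sketch below. It assumes boto3 is installed and that GDAL/OGR picks up `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_SESSION_TOKEN` from the environment. The helper names, the default session name and the role ARN in the usage note are made up for illustration and are not part of intake_geopandas.

```python
import os


def assume_role_credentials(role_arn, session_name="intake-geopandas"):
    """Ask STS for temporary credentials for the given role.

    boto3 is imported lazily so the rest of the module works without it.
    """
    import boto3

    response = boto3.client("sts").assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
    )
    # Dict with AccessKeyId, SecretAccessKey, SessionToken and Expiration.
    return response["Credentials"]


def export_credentials(credentials, environ=os.environ):
    """Copy temporary credentials into the environment variables OGR reads."""
    environ["AWS_ACCESS_KEY_ID"] = credentials["AccessKeyId"]
    environ["AWS_SECRET_ACCESS_KEY"] = credentials["SecretAccessKey"]
    environ["AWS_SESSION_TOKEN"] = credentials["SessionToken"]
```

Usage would be something like `export_credentials(assume_role_credentials("arn:aws:iam::123456789012:role/example"))` (a placeholder ARN) before reading the catalog entry. The credentials are temporary, so a long-running process would need to refresh them before the `Expiration` time.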

ian-r-rose commented 4 years ago

If there are things that OGR can't handle, we might provide an fsspec-aware code path here that downloads a temporary file and then loads it locally. I'm not very familiar with using role assumptions, so I don't know if there is a way to do this at present (here are the relevant docs).
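A rough sketch of what such an fsspec-aware path might look like, assuming fsspec and geopandas are installed. The function name `read_via_tempfile` and its `opener`/`reader` parameters are hypothetical and only there to make the idea testable; this is not the plugin's current behaviour.

```python
import os
import posixpath
import shutil
import tempfile


def read_via_tempfile(urlpath, storage_options=None, opener=None, reader=None):
    """Download urlpath to a temporary file, then load it from local disk.

    opener defaults to fsspec.open and reader to geopandas.read_file;
    both are imported lazily and can be swapped out (e.g. in tests).
    """
    if opener is None:
        import fsspec

        opener = fsspec.open
    if reader is None:
        import geopandas

        reader = geopandas.read_file

    with tempfile.TemporaryDirectory() as tmpdir:
        # Keep the original file name so the extension still hints at the driver.
        name = posixpath.basename(urlpath) or "data"
        local_path = os.path.join(tmpdir, name)
        with opener(urlpath, "rb", **(storage_options or {})) as remote:
            with open(local_path, "wb") as local:
                shutil.copyfileobj(remote, local)
        # The reader must load eagerly: the file is deleted on exit.
        return reader(local_path)
```

This copies a single file, so multi-file formats such as shapefiles (with their .shx/.dbf sidecars) would need extra handling. The upside is that any credentials fsspec understands, passed via `storage_options`, would apply, at the cost of a full download before reading.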