intake / intake_geopandas

An intake plugin for loading datasets with geopandas
BSD 2-Clause "Simplified" License

to_dask? #34

Closed raybellwaves closed 2 years ago

raybellwaves commented 2 years ago

Currently to_dask isn't implemented: https://github.com/intake/intake_geopandas/blob/master/intake_geopandas/geopandas.py#L52

dask_geopandas can read a geoparquet lazily:

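import geopandas as gpd

# Write a small sample dataset to GeoParquet.
# Note: gpd.datasets was removed in geopandas 1.0; substitute any local file there.
gdf = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
gdf.to_parquet("gdf.parquet")

import dask_geopandas as dgpd

# Read the file back lazily as a dask_geopandas GeoDataFrame.
dgdf = dgpd.read_parquet("gdf.parquet")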

Given that dask_geopandas is experimental, it may not be well tested for other data formats. Therefore, it probably should not be used as the driver throughout, but only when to_dask is called.

martindurant commented 2 years ago

I think using dask_geopandas optionally for to_dask is totally reasonable. It should not be imported for non-dask use. I'm uncertain whether it makes more sense to put this logic in the one driver or to make two; either should be pretty simple.
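
A minimal sketch of the optional-import idea, assuming a hypothetical helper named `optional_import` (it is not part of intake_geopandas). The dependency is resolved only when the helper is called, so importing the driver module itself never pulls in dask_geopandas:

```python
import importlib


def optional_import(name, purpose):
    """Import `name` on demand.

    Hypothetical helper for illustration; not part of intake_geopandas.
    """
    try:
        return importlib.import_module(name)
    except ImportError as err:
        # Re-raise with a message that says which feature needs the package.
        raise ImportError(
            f"{name} is required for {purpose}; install it to use this feature."
        ) from err
```

Inside a to_dask method this could be called as `dgpd = optional_import("dask_geopandas", "to_dask")`, which keeps non-dask use free of the extra dependency.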

ian-r-rose commented 2 years ago

I'd be in favor of using it optionally with to_dask as well. Since geoparquet is the best-supported (only supported?) format at the moment, I don't think there is much downside to just implementing it in GeoParquetSource.
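
A rough sketch of that single-driver option. The class below is a hypothetical stand-in, not the real GeoParquetSource, whose base class and constructor are not shown in this thread. It assumes only `geopandas.read_parquet` and `dask_geopandas.read_parquet`, the latter being the call used in the example at the top of this issue:

```python
class GeoParquetSketch:
    """Hypothetical stand-in for a GeoParquet driver; not the intake_geopandas class."""

    def __init__(self, urlpath, **read_kwargs):
        self.urlpath = urlpath
        self.read_kwargs = read_kwargs

    def read(self):
        # Eager path: only geopandas is needed.
        import geopandas as gpd

        return gpd.read_parquet(self.urlpath, **self.read_kwargs)

    def to_dask(self):
        # Lazy path: dask_geopandas is imported only here, so it stays optional.
        import dask_geopandas as dgpd

        return dgpd.read_parquet(self.urlpath, **self.read_kwargs)
```

With this shape, `GeoParquetSketch("gdf.parquet").read()` would work without dask_geopandas installed, and only `to_dask()` would fail if it is missing.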

blackary commented 2 years ago

@raybellwaves Wanna close this issue now?

raybellwaves commented 2 years ago

Nice maintenance work @blackary!