geopandas / dask-geopandas

Parallel GeoPandas with Dask
https://dask-geopandas.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Discussion about use case with xarray? #209

Open apurba-biswas opened 2 years ago

apurba-biswas commented 2 years ago

I have a question regarding the use case of xarray + geopandas vs. xarray + dask-geopandas. I'm unsure whether I need dask-geopandas if my xarray data is already Dask-backed.

If I have a large raster file (e.g. ~ 2GB) loaded using xarray, and am trying to compute statistics on a small number of coarse geometries (a small geodataframe), for example, computing US-state zonal statistics (50 state boundaries), I imagine geopandas would be sufficient in this case?

I imagine dask-geopandas would be helpful in the "highly detailed" case, i.e. ~100,000 small, detailed geometries.
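
To make the coarse-geometry case concrete, here is a minimal sketch of zonal means with plain geopandas and xarray. It is only an illustration under assumptions: the raster is a tiny synthetic grid, the two zones (`west`, `east`) are hypothetical stand-ins for state boundaries, and the mask is built with `shapely.contains_xy` (Shapely 2.0+) on cell centers rather than with a dedicated zonal-statistics library.

```python
import numpy as np
import xarray as xr
import geopandas as gpd
import shapely
from shapely.geometry import box

# Synthetic 4 x 4 raster with values 0..15; coordinates are cell centers.
# A real raster could be opened lazily and Dask-chunked instead.
raster = xr.DataArray(
    np.arange(16, dtype=float).reshape(4, 4),
    coords={"y": [0.5, 1.5, 2.5, 3.5], "x": [0.5, 1.5, 2.5, 3.5]},
    dims=("y", "x"),
)

# Two coarse, hypothetical zones: the left and right halves of the grid.
zones = gpd.GeoDataFrame(
    {"name": ["west", "east"]},
    geometry=[box(0, 0, 2, 4), box(2, 0, 4, 4)],
)

# 2-D arrays of cell-center coordinates, shaped like the raster.
xx, yy = np.meshgrid(raster["x"].values, raster["y"].values)


def zonal_mean(geom):
    """Mean of the raster cells whose centers fall inside `geom`."""
    inside = shapely.contains_xy(geom, xx, yy)
    mask = xr.DataArray(inside, coords=raster.coords, dims=raster.dims)
    # Cells outside the zone become NaN and are skipped by mean().
    return float(raster.where(mask).mean())


# A plain Python loop is enough for a handful of geometries.
zones["mean_value"] = [zonal_mean(geom) for geom in zones.geometry]
print(zones[["name", "mean_value"]])
```

With only a few dozen geometries the loop over zones is cheap; the raster reduction is where the work is, and that part is already handled by xarray (and Dask, if the array is chunked).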

martinfleis commented 2 years ago

I would recommend using vanilla geopandas in this case. I am not aware of a direct interface between dask-geopandas and dask-backed xarray. To make use of it, you would likely need to partition the geodataframe along the existing partitions of the xarray object and let the zonal stats method know how to access that information. It is surely a direction worth exploring but most likely not ready yet.
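
As a rough sketch of what "partitioning along the existing partitions" could look like, the snippet below assigns geometries to raster chunks with a spatial join. Everything here is an assumption for illustration: the grid, the `chunk_size` value (with a Dask-backed array it would be read from the array's chunks), and the zone geometries are hypothetical, and this is not an existing dask-geopandas feature.

```python
import numpy as np
import geopandas as gpd
from shapely.geometry import box

# Hypothetical raster grid: cell centers with a cell size of 1.
x = np.arange(8) + 0.5
y = np.arange(4) + 0.5
chunk_size = 4  # assumed chunk length along x

# One bounding box per block of columns, covering full cell extents.
chunk_boxes = [
    box(
        x[start] - 0.5,
        y.min() - 0.5,
        x[min(start + chunk_size, len(x)) - 1] + 0.5,
        y.max() + 0.5,
    )
    for start in range(0, len(x), chunk_size)
]
chunks = gpd.GeoDataFrame(
    {"chunk_id": list(range(len(chunk_boxes)))}, geometry=chunk_boxes
)

# Hypothetical zones: "a" and "b" sit in one chunk each, "c" straddles both.
zones = gpd.GeoDataFrame(
    {"name": ["a", "b", "c"]},
    geometry=[box(0.2, 0.2, 1.8, 1.8), box(5, 1, 7, 3), box(3, 0, 5, 2)],
)

# Pair every zone with every chunk it intersects.
pairs = gpd.sjoin(zones, chunks, predicate="intersects")

# Zones grouped by the chunk that would need to process them.
zones_by_chunk = pairs.groupby("chunk_id")["name"].apply(list).to_dict()
print(zones_by_chunk)
```

A zone that straddles a chunk boundary appears under more than one chunk, so per-chunk partial results (for example sums and counts) would still need to be combined afterwards.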

BENR0 commented 1 year ago

@apurba-biswas how are you computing the zonal statistics with a geopandas dataframe and xarray?