geopandas / dask-geopandas

Parallel GeoPandas with Dask
https://dask-geopandas.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Pickling DaskGeoDataFrame loses `spatial_partitions` #237

Closed: TomAugspurger closed this issue 1 year ago

TomAugspurger commented 1 year ago

This is a bit of a strange use case, but round-tripping a dask-geopandas GeoDataFrame through pickle loses the `spatial_partitions` property:

import dask_geopandas
import geopandas.datasets
import pickle

df = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
ddf = dask_geopandas.from_geopandas(df, npartitions=2)
print(ddf.spatial_partitions)  # None
ddf2 = pickle.loads(pickle.dumps(ddf))  # round-trip the dask GeoDataFrame, not the plain GeoDataFrame
print(ddf2.spatial_partitions)  # AttributeError

Unpickling doesn't run `__init__`, so the attribute is never set on the restored object; it is only assigned at https://github.com/geopandas/dask-geopandas/blob/bc3af63aad32f939a0ac44950d39cc0f2da3b614/dask_geopandas/core.py#L67.

Maybe adding `spatial_partitions` to `_args`, mirroring what `dask.dataframe._Frame` does in https://github.com/dask/dask/blob/main/dask/dataframe/core.py#L233-L241, would do the trick.
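
For illustration, here is a minimal, dependency-free sketch of the failure mode and the suggested fix. The classes `Frame`, `LossyGeoFrame`, and `FixedGeoFrame` are hypothetical stand-ins, not the actual dask or dask-geopandas classes. The sketch assumes the base class pickles itself from an `_args` tuple via `__getstate__`/`__setstate__`, roughly in the spirit of the linked dask code.

```python
import pickle


class Frame:
    """Hypothetical stand-in for a base collection; not dask's real code."""

    def __init__(self, name, divisions):
        self.name = name
        self.divisions = divisions

    @property
    def _args(self):
        # Everything needed to rebuild the object after unpickling.
        return (self.name, self.divisions)

    def __getstate__(self):
        return self._args

    def __setstate__(self, state):
        # pickle calls this on a bare instance; __init__ is never run.
        self.name, self.divisions = state


class LossyGeoFrame(Frame):
    """Sets an extra attribute in __init__ only, so pickle drops it."""

    def __init__(self, name, divisions, spatial_partitions=None):
        super().__init__(name, divisions)
        self.spatial_partitions = spatial_partitions


class FixedGeoFrame(LossyGeoFrame):
    """Extends _args so the extra attribute survives the round trip."""

    @property
    def _args(self):
        return super()._args + (self.spatial_partitions,)

    def __setstate__(self, state):
        # Peel off the extra trailing item, then let the base class
        # restore its own fields.
        *base_state, self.spatial_partitions = state
        super().__setstate__(tuple(base_state))


lossy = pickle.loads(pickle.dumps(LossyGeoFrame("ddf", (0, 1), "bounds")))
fixed = pickle.loads(pickle.dumps(FixedGeoFrame("ddf", (0, 1), "bounds")))

print(hasattr(lossy, "spatial_partitions"))  # attribute is missing
print(fixed.spatial_partitions)              # attribute is restored
```

Whether the real fix needs a matching `__setstate__` override or only a change to `_args` depends on how dask's `_Frame` restores its state, so treat the override here as an assumption of the sketch.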