Lately I've run into this error a few times: OSError: [Errno 22] Invalid argument.
It happens when processing a dask_geopandas.GeoDataFrame after a dask.distributed.Client has been defined. It doesn't happen when no client is defined, i.e. when calling dask_geopandas.read_file(fp, npartitions=30) straight away. The error also doesn't seem to occur when the client is created with Client(processes=False).
So it seems to be the combination of a dask.distributed.Client with dask_geopandas computations on dataframes that hold geometry data.
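To make the three setups concrete, here is a minimal sketch of what I mean. It is not a reproducer: the helper names (client_kwargs, run_scenario), the mode labels, and the area computation are placeholders I made up for illustration, and I'm assuming a plain Client() starts worker processes by default.

```python
def client_kwargs(mode):
    """Map a scenario name to keyword arguments for dask.distributed.Client.

    None means "do not create a client at all". The scenario names are
    hypothetical labels for the setups described in this post.
    """
    scenarios = {
        "no_client": None,                # no client defined: works for me
        "threads": {"processes": False},  # Client(processes=False): works for me
        "processes": {},                  # plain Client(): raises OSError for me
    }
    return scenarios[mode]


def run_scenario(fp, mode, npartitions=30):
    """Read the file at fp with dask_geopandas under the given client setup."""
    # Local imports so client_kwargs can be used without the libraries installed.
    import dask_geopandas
    from dask.distributed import Client

    kwargs = client_kwargs(mode)
    client = Client(**kwargs) if kwargs is not None else None
    try:
        ddf = dask_geopandas.read_file(fp, npartitions=npartitions)
        # Placeholder computation that touches the geometry column.
        return ddf.geometry.area.sum().compute()
    finally:
        if client is not None:
            client.close()
```

Calling run_scenario(fp, "processes") on the large dataset is the case that fails for me; the other two modes finish without the error. When running this as a script with a process-based client, the call should sit under an if __name__ == "__main__": guard.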
It's a bit tricky to provide a reproducible example, as this typically happens with larger datasets. But I hope that by describing the problem, someone will be able to point me in the right direction. If you want to reproduce this error, please find a link to the data here.
Commands to download the data (~2 GB):