holoviz / spatialpandas

Pandas extension arrays for spatial/geometric operations
BSD 2-Clause "Simplified" License

Error on running overview.ipynb on a Kubernetes cluster #31

Closed XingongLi closed 4 years ago

XingongLi commented 4 years ago

I was running the notebook on the ocean.pangeo.io deployment on GCP. I modified the cluster setup as below:

from dask.distributed import Client, progress
from dask_kubernetes import KubeCluster

cluster = KubeCluster(n_workers=16)
cluster

client = Client(cluster)
client

When I ran the next cell of code:

`import pandas as pd
import dask.dataframe as dd

reps = 10000

# Large geopandas GeoDataFrame
cities_large_gp = pd.concat([cities_gp] * reps, axis=0)

# Large spatialpandas GeoDataFrame
cities_large_df = pd.concat([cities_df] * reps, axis=0)

# Large spatialpandas DaskGeoDataFrame with 16 partitions
cities_large_ddf = dd.from_pandas(cities_large_df, npartitions=16).persist()

# Precompute the partition-level spatial index
cities_large_ddf.partition_sindex`

I got the following errors:


KilledWorker                              Traceback (most recent call last)

in
     12
     13 # Precompute the partition-level spatial index
---> 14 cities_large_ddf.partition_sindex

/srv/conda/envs/notebook/lib/python3.7/site-packages/spatialpandas/dask.py in partition_sindex(self)
    145 geometry._partition_bounds = self._partition_bounds[geometry_name]
    146
--> 147 self._partition_sindex[geometry.name] = geometry.partition_sindex
    148 self._partition_bounds[geometry_name] = geometry.partition_bounds
    149 return self._partition_sindex[geometry_name]

/srv/conda/envs/notebook/lib/python3.7/site-packages/spatialpandas/dask.py in partition_sindex(self)
     66 def partition_sindex(self):
     67 if self._partition_sindex is None:
---> 68 self._partition_sindex = HilbertRtree(self.partition_bounds.values)
     69 return self._partition_sindex
     70

/srv/conda/envs/notebook/lib/python3.7/site-packages/spatialpandas/dask.py in partition_bounds(self)
     48 if self._partition_bounds is None:
     49 self._partition_bounds = self.map_partitions(
---> 50 lambda s: pd.DataFrame(
     51 [s.total_bounds], columns=['x0', 'y0', 'x1', 'y1']
     52 )

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask/base.py in compute(self, **kwargs)
    163 dask.base.compute
    164 """
--> 165 (result,) = compute(self, traverse=False, **kwargs)
    166 return result
    167

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask/base.py in compute(*args, **kwargs)
    434 keys = [x.__dask_keys__() for x in collections]
    435 postcomputes = [x.__dask_postcompute__() for x in collections]
--> 436 results = schedule(dsk, keys, **kwargs)
    437 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    438

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in get(self, dsk, keys, restrictions, loose_restrictions, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
   2571 should_rejoin = False
   2572 try:
-> 2573 results = self.gather(packed, asynchronous=asynchronous, direct=direct)
   2574 finally:
   2575 for f in futures.values():

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in gather(self, futures, errors, direct, asynchronous)
   1871 direct=direct,
   1872 local_worker=local_worker,
-> 1873 asynchronous=asynchronous,
   1874 )
   1875

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
    766 else:
    767 return sync(
--> 768 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    769 )
    770

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
    332 if error[0]:
    333 typ, exc, tb = error[0]
--> 334 raise exc.with_traceback(tb)
    335 else:
    336 return result[0]

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/utils.py in f()
    316 if callback_timeout is not None:
    317 future = gen.with_timeout(timedelta(seconds=callback_timeout), future)
--> 318 result[0] = yield future
    319 except Exception as exc:
    320 error[0] = sys.exc_info()

/srv/conda/envs/notebook/lib/python3.7/site-packages/tornado/gen.py in run(self)
    733
    734 try:
--> 735 value = future.result()
    736 except Exception:
    737 exc_info = sys.exc_info()

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
   1727 exc = CancelledError(key)
   1728 else:
-> 1729 raise exception.with_traceback(traceback)
   1730 raise exc
   1731 if errors == "skip":

KilledWorker: ("('from_pandas-0638bd8b066e3449279e712ce7cd8a44', 9)", )

jbednar commented 4 years ago

I can't tell from that output why it was killed, but hopefully someone else can!
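
The final `KilledWorker` line of the traceback does name the task that was running when the workers died. Here is a minimal sketch of reading it, using only the standard library. `parse_killed_worker_key` is a hypothetical helper, not part of dask or spatialpandas, and the message string is copied from the report as pasted (the worker address that normally follows the key is missing there, so only the key is parsed).

```python
import ast

# Last line of the traceback, copied verbatim from the report (after "KilledWorker: ").
message = """("('from_pandas-0638bd8b066e3449279e712ce7cd8a44', 9)", )"""


def parse_killed_worker_key(text):
    """Hypothetical helper: return (task_name, partition_index) from the message."""
    outer = ast.literal_eval(text)            # 1-tuple holding the key as a string
    name, index = ast.literal_eval(outer[0])  # the key is itself a stringified tuple
    return name, index


task_name, partition_index = parse_killed_worker_key(message)
task_prefix = task_name.rsplit("-", 1)[0]     # drop the trailing hash token

print(f"task: {task_prefix}, partition: {partition_index}")
```

The key points at one partition of the `dd.from_pandas(...)` data itself, not at the spatial-index code, so the workers were probably dying while receiving or holding that partition. That pattern is often a sign of a worker exceeding its memory limit, but the output alone does not confirm it; the worker logs would be needed to tell.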

jbednar commented 4 years ago

Closing as I don't think there's anything we can do to debug it with that information; it's not a reproducible example. If you have something we can run and debug, please post that here and reopen!