holoviz / spatialpandas

Pandas extension arrays for spatial/geometric operations
BSD 2-Clause "Simplified" License

`RecursionError` After Pandas 2.1.0 Release #124

Closed philipc2 closed 10 months ago

philipc2 commented 10 months ago

Description of expected behavior and the observed behavior

After the Pandas 2.1.0 release, constructing a GeoDataFrame raises a RecursionError.

(spatialpandas-pandas-210) bash-4.2$ conda list
# packages in environment at /glade/work/philipc/conda-envs/spatialpandas-pandas-210:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
asttokens 2.2.1 pyhd8ed1ab_0 conda-forge
aws-c-auth 0.7.3 he2921ad_3 conda-forge
aws-c-cal 0.6.2 hc309b26_0 conda-forge
aws-c-common 0.9.0 hd590300_0 conda-forge
aws-c-compression 0.2.17 h4d4d85c_2 conda-forge
aws-c-event-stream 0.3.2 h2e3709c_0 conda-forge
aws-c-http 0.7.12 hc865f51_1 conda-forge
aws-c-io 0.13.32 h019f825_2 conda-forge
aws-c-mqtt 0.9.5 h3a0376c_1 conda-forge
aws-c-s3 0.3.14 h1678ad6_3 conda-forge
aws-c-sdkutils 0.1.12 h4d4d85c_1 conda-forge
aws-checksums 0.1.17 h4d4d85c_1 conda-forge
aws-crt-cpp 0.23.0 h40cdbb9_5 conda-forge
aws-sdk-cpp 1.10.57 h6f6b8fa_21 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 pyhd8ed1ab_3 conda-forge
backports.functools_lru_cache 1.6.5 pyhd8ed1ab_0 conda-forge
bokeh 3.2.2 pyhd8ed1ab_0 conda-forge
brotli-python 1.0.9 py311ha362b79_9 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.19.1 hd590300_0 conda-forge
ca-certificates 2023.7.22 hbcca054_0 conda-forge
click 8.1.7 unix_pyh707e725_0 conda-forge
cloudpickle 2.2.1 pyhd8ed1ab_0 conda-forge
comm 0.1.4 pyhd8ed1ab_0 conda-forge
contourpy 1.1.0 py311h9547e67_0 conda-forge
cytoolz 0.12.2 py311h459d7ec_0 conda-forge
dask 2023.8.1 pyhd8ed1ab_0 conda-forge
dask-core 2023.8.1 pyhd8ed1ab_0 conda-forge
debugpy 1.6.8 py311hb755f60_0 conda-forge
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
distributed 2023.8.1 pyhd8ed1ab_0 conda-forge
executing 1.2.0 pyhd8ed1ab_0 conda-forge
freetype 2.12.1 hca18f0e_1 conda-forge
fsspec 2023.6.0 pyh1a96a4e_0 conda-forge
gflags 2.2.2 he1b5a44_1004 conda-forge
glog 0.6.0 h6f12383_0 conda-forge
importlib-metadata 6.8.0 pyha770c72_0 conda-forge
importlib_metadata 6.8.0 hd8ed1ab_0 conda-forge
ipykernel 6.25.1 pyh71e2992_0 conda-forge
ipython 8.14.0 pyh41d4057_0 conda-forge
jedi 0.19.0 pyhd8ed1ab_0 conda-forge
jinja2 3.1.2 pyhd8ed1ab_1 conda-forge
jupyter_client 8.3.1 pyhd8ed1ab_0 conda-forge
jupyter_core 5.3.1 py311h38be061_0 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
krb5 1.21.2 h659d440_0 conda-forge
lcms2 2.15 haa2dc70_1 conda-forge
ld_impl_linux-64 2.40 h41732ed_0 conda-forge
lerc 4.0.0 h27087fc_0 conda-forge
libabseil 20230125.3 cxx17_h59595ed_0 conda-forge
libarrow 13.0.0 hb9dc469_0_cpu conda-forge
libblas 3.9.0 17_linux64_openblas conda-forge
libbrotlicommon 1.0.9 h166bdaf_9 conda-forge
libbrotlidec 1.0.9 h166bdaf_9 conda-forge
libbrotlienc 1.0.9 h166bdaf_9 conda-forge
libcblas 3.9.0 17_linux64_openblas conda-forge
libcrc32c 1.1.2 h9c3ff4c_0 conda-forge
libcurl 8.2.1 hca28451_0 conda-forge
libdeflate 1.18 h0b41bf4_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libevent 2.1.12 hf998b51_1 conda-forge
libexpat 2.5.0 hcb278e6_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 13.1.0 he5830b7_0 conda-forge
libgfortran-ng 13.1.0 h69a702a_0 conda-forge
libgfortran5 13.1.0 h15d22d2_0 conda-forge
libgomp 13.1.0 he5830b7_0 conda-forge
libgoogle-cloud 2.12.0 h840a212_1 conda-forge
libgrpc 1.56.2 h3905398_1 conda-forge
libjpeg-turbo 2.1.5.1 h0b41bf4_0 conda-forge
liblapack 3.9.0 17_linux64_openblas conda-forge
libllvm14 14.0.6 hcd5def8_4 conda-forge
libnghttp2 1.52.0 h61bc06f_0 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libnuma 2.0.16 h0b41bf4_1 conda-forge
libopenblas 0.3.23 pthreads_h80387f5_0 conda-forge
libpng 1.6.39 h753d276_0 conda-forge
libprotobuf 4.23.3 hd1fb520_0 conda-forge
libsodium 1.0.18 h36c2ea0_1 conda-forge
libsqlite 3.43.0 h2797004_0 conda-forge
libssh2 1.11.0 h0841786_0 conda-forge
libstdcxx-ng 13.1.0 hfd8a6a1_0 conda-forge
libthrift 0.18.1 h8fd135c_2 conda-forge
libtiff 4.5.1 h8b53f26_1 conda-forge
libutf8proc 2.8.0 h166bdaf_0 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libwebp-base 1.3.1 hd590300_0 conda-forge
libxcb 1.15 h0b41bf4_0 conda-forge
libzlib 1.2.13 hd590300_5 conda-forge
llvmlite 0.40.1 py311ha6695c7_0 conda-forge
locket 1.0.0 pyhd8ed1ab_0 conda-forge
lz4 4.3.2 py311h9f220a4_0 conda-forge
lz4-c 1.9.4 hcb278e6_0 conda-forge
markupsafe 2.1.3 py311h459d7ec_0 conda-forge
matplotlib-inline 0.1.6 pyhd8ed1ab_0 conda-forge
msgpack-python 1.0.5 py311ha3edf6b_0 conda-forge
nb_conda_kernels 2.3.1 py311h38be061_2 conda-forge
ncurses 6.4 hcb278e6_0 conda-forge
nest-asyncio 1.5.6 pyhd8ed1ab_0 conda-forge
numba 0.57.1 py311h96b013e_0 conda-forge
numpy 1.24.4 py311h64a7726_0 conda-forge
openjpeg 2.5.0 hfec8fc6_2 conda-forge
openssl 3.1.2 hd590300_0 conda-forge
orc 1.9.0 h385abfd_1 conda-forge
packaging 23.1 pyhd8ed1ab_0 conda-forge
pandas 2.1.0 py311h320fe9a_0 conda-forge
param 1.13.0 py_0 pyviz
parso 0.8.3 pyhd8ed1ab_0 conda-forge
partd 1.4.0 pyhd8ed1ab_0 conda-forge
pexpect 4.8.0 pyh1a96a4e_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 10.0.0 py311h0b84326_0 conda-forge
pip 23.2.1 pyhd8ed1ab_0 conda-forge
platformdirs 3.10.0 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.39 pyha770c72_0 conda-forge
prompt_toolkit 3.0.39 hd8ed1ab_0 conda-forge
psutil 5.9.5 py311h2582759_0 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge
pyarrow 13.0.0 py311h39c9aba_0_cpu conda-forge
pygments 2.16.1 pyhd8ed1ab_0 conda-forge
pysocks 1.7.1 pyha2e5f31_6 conda-forge
python 3.11.5 hab00c5b_0_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python-tzdata 2023.3 pyhd8ed1ab_0 conda-forge
python_abi 3.11 3_cp311 conda-forge
pytz 2023.3 pyhd8ed1ab_0 conda-forge
pyyaml 6.0.1 py311h459d7ec_0 conda-forge
pyzmq 25.1.1 py311h75c88c4_0 conda-forge
rdma-core 28.9 h59595ed_1 conda-forge
re2 2023.03.02 h8c504da_0 conda-forge
readline 8.2 h8228510_1 conda-forge
retrying 1.3.3 py_2 conda-forge
s2n 1.3.49 h06160fa_0 conda-forge
setuptools 68.1.2 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
snappy 1.1.10 h9fff704_0 conda-forge
sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge
spatialpandas 0.4.8 py_0 pyviz
stack_data 0.6.2 pyhd8ed1ab_0 conda-forge
tblib 1.7.0 pyhd8ed1ab_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
toolz 0.12.0 pyhd8ed1ab_0 conda-forge
tornado 6.3.3 py311h459d7ec_0 conda-forge
traitlets 5.9.0 pyhd8ed1ab_0 conda-forge
typing-extensions 4.7.1 hd8ed1ab_0 conda-forge
typing_extensions 4.7.1 pyha770c72_0 conda-forge
tzdata 2023c h71feb2d_0 conda-forge
ucx 1.14.1 h4a2ce2d_3 conda-forge
urllib3 2.0.4 pyhd8ed1ab_0 conda-forge
wcwidth 0.2.6 pyhd8ed1ab_0 conda-forge
wheel 0.41.2 pyhd8ed1ab_0 conda-forge
xorg-libxau 1.0.11 hd590300_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xyzservices 2023.7.0 pyhd8ed1ab_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
yaml 0.2.5 h7f98852_2 conda-forge
zeromq 4.3.4 h9c3ff4c_1 conda-forge
zict 3.0.0 pyhd8ed1ab_0 conda-forge
zipp 3.16.2 pyhd8ed1ab_0 conda-forge
zstd 1.5.5 hfc55251_0 conda-forge
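
For anyone checking whether their own environment matches the combination listed above (pandas 2.1.0 with spatialpandas 0.4.8), a minimal sketch along these lines may help. The helper names `parse_version`, `matches_reported_combo` and `installed_combo_matches` are made up for illustration, and treating every pandas release from 2.1 onward as affected is an assumption; this report only confirms 2.1.0.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def matches_reported_combo(pandas_version, spatialpandas_version):
    """True for spatialpandas 0.4.8 together with pandas 2.1 or later.

    The ">= 2.1" bound is an assumption; only pandas 2.1.0 is confirmed here.
    """
    return (
        parse_version(pandas_version) >= (2, 1)
        and parse_version(spatialpandas_version) == (0, 4, 8)
    )


def installed_combo_matches():
    """Check the installed packages; return None if either one is missing."""
    try:
        return matches_reported_combo(version("pandas"), version("spatialpandas"))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_combo_matches())
```

The comparison works on plain tuples so the check has no dependencies beyond the standard library; it ignores pre-release suffixes such as `rc0`.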

Complete, minimal, self-contained example code that reproduces the issue

from spatialpandas import GeoDataFrame
from spatialpandas.geometry import PolygonArray

# Square from (0, 0) to (1, 1) in CCW order
outline0 = [0, 0, 1, 0, 1, 1, 0, 1, 0, 0]

# Square from (2, 2) to (5, 5) in CCW order
outline1 = [2, 2, 5, 2, 5, 5, 2, 5, 2, 2]

# Triangle hole in CW order
hole1 = [3, 3, 4, 3, 3, 4, 3, 3]

polygon_array = PolygonArray([
    [outline0],
    [outline1, hole1]
])

GeoDataFrame({"geometry": polygon_array})

Stack traceback and/or browser JavaScript console output

GeoDataFrame({"geometry": polygon_array})
---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
Cell In[8], line 1
----> 1 GeoDataFrame({"geometry": polygon_array})

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geodataframe.py:31, in GeoDataFrame.__init__(self, data, index, geometry, **kwargs)
     28 for col in self.columns:
     29     if (isinstance(self[col].dtype, GeometryDtype) or
     30         gp and isinstance(self[col].dtype, gp.array.GeometryDtype)):
---> 31         self[col] = GeoSeries(self[col])
     32         first_geometry_col = first_geometry_col or col
     34 if first_geometry_col is None:

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:35, in GeoSeries.__init__(self, data, index, name, dtype, **kwargs)
     32     dtype = pd.array([], dtype=dtype).dtype
     34 data = to_geometry_array(data, dtype)
---> 35 super().__init__(data, index=index, name=name, **kwargs)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:471, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
    469     data = data._mgr.copy(deep=False)
    470 else:
--> 471     data = data.reindex(index, copy=copy)
    472     copy = False
    473     data = data._mgr

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:4982, in Series.reindex(self, index, axis, method, copy, level, fill_value, limit, tolerance)
   4965 @doc(
   4966     NDFrame.reindex,  # type: ignore[has-type]
   4967     klass=_shared_doc_kwargs["klass"],
   (...)
   4980     tolerance=None,
   4981 ) -> Series:
-> 4982     return super().reindex(
   4983         index=index,
   4984         method=method,
   4985         copy=copy,
   4986         level=level,
   4987         fill_value=fill_value,
   4988         limit=limit,
   4989         tolerance=tolerance,
   4990     )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:5514, in NDFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
   5508     copy = False
   5509 if all(
   5510     self._get_axis(axis_name).identical(ax)
   5511     for axis_name, ax in axes.items()
   5512     if ax is not None
   5513 ):
-> 5514     return self.copy(deep=copy)
   5516 # check if we are a multi reindex
   5517 if self._needs_reindex_multi(axes, method, level):

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:6685, in NDFrame.copy(self, deep)
   6683 data = self._mgr.copy(deep=deep)
   6684 self._clear_item_cache()
-> 6685 return self._constructor_from_mgr(data, axes=data.axes).__finalize__(
   6686     self, method="copy"
   6687 )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:589, in Series._constructor_from_mgr(self, mgr, axes)
    587 else:
    588     assert axes is mgr.axes
--> 589     return self._constructor(ser, copy=False)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:12, in _MaybeGeoSeries.__new__(cls, data, *args, **kwargs)
     10 else:
     11     series_cls = pd.Series
---> 12 return series_cls(data, *args, **kwargs)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:35, in GeoSeries.__init__(self, data, index, name, dtype, **kwargs)
     32     dtype = pd.array([], dtype=dtype).dtype
     34 data = to_geometry_array(data, dtype)
---> 35 super().__init__(data, index=index, name=name, **kwargs)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:471, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
    469     data = data._mgr.copy(deep=False)
    470 else:
--> 471     data = data.reindex(index, copy=copy)
    472     copy = False
    473     data = data._mgr

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:4982, in Series.reindex(self, index, axis, method, copy, level, fill_value, limit, tolerance)
   4965 @doc(
   4966     NDFrame.reindex,  # type: ignore[has-type]
   4967     klass=_shared_doc_kwargs["klass"],
   (...)
   4980     tolerance=None,
   4981 ) -> Series:
-> 4982     return super().reindex(
   4983         index=index,
   4984         method=method,
   4985         copy=copy,
   4986         level=level,
   4987         fill_value=fill_value,
   4988         limit=limit,
   4989         tolerance=tolerance,
   4990     )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:5514, in NDFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
   5508     copy = False
   5509 if all(
   5510     self._get_axis(axis_name).identical(ax)
   5511     for axis_name, ax in axes.items()
   5512     if ax is not None
   5513 ):
-> 5514     return self.copy(deep=copy)
   5516 # check if we are a multi reindex
   5517 if self._needs_reindex_multi(axes, method, level):

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:6685, in NDFrame.copy(self, deep)
   6683 data = self._mgr.copy(deep=deep)
   6684 self._clear_item_cache()
-> 6685 return self._constructor_from_mgr(data, axes=data.axes).__finalize__(
   6686     self, method="copy"
   6687 )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:589, in Series._constructor_from_mgr(self, mgr, axes)
    587 else:
    588     assert axes is mgr.axes
--> 589     return self._constructor(ser, copy=False)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:12, in _MaybeGeoSeries.__new__(cls, data, *args, **kwargs)
     10 else:
     11     series_cls = pd.Series
---> 12 return series_cls(data, *args, **kwargs)

[... skipping similar frames: GeoSeries.__init__ at line 35 (327 times), Series.__init__ at line 471 (327 times), Series.reindex at line 4982 (327 times), NDFrame.reindex at line 5514 (327 times), _MaybeGeoSeries.__new__ at line 12 (326 times), Series._constructor_from_mgr at line 589 (326 times), NDFrame.copy at line 6685 (326 times)]

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:6685, in NDFrame.copy(self, deep)
   6683 data = self._mgr.copy(deep=deep)
   6684 self._clear_item_cache()
-> 6685 return self._constructor_from_mgr(data, axes=data.axes).__finalize__(
   6686     self, method="copy"
   6687 )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:589, in Series._constructor_from_mgr(self, mgr, axes)
    587 else:
    588     assert axes is mgr.axes
--> 589     return self._constructor(ser, copy=False)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:12, in _MaybeGeoSeries.__new__(cls, data, *args, **kwargs)
     10 else:
     11     series_cls = pd.Series
---> 12 return series_cls(data, *args, **kwargs)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/spatialpandas/geoseries.py:35, in GeoSeries.__init__(self, data, index, name, dtype, **kwargs)
     32     dtype = pd.array([], dtype=dtype).dtype
     34 data = to_geometry_array(data, dtype)
---> 35 super().__init__(data, index=index, name=name, **kwargs)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:471, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
    469     data = data._mgr.copy(deep=False)
    470 else:
--> 471     data = data.reindex(index, copy=copy)
    472     copy = False
    473     data = data._mgr

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/series.py:4982, in Series.reindex(self, index, axis, method, copy, level, fill_value, limit, tolerance)
   4965 @doc(
   4966     NDFrame.reindex,  # type: ignore[has-type]
   4967     klass=_shared_doc_kwargs["klass"],
   (...)
   4980     tolerance=None,
   4981 ) -> Series:
-> 4982     return super().reindex(
   4983         index=index,
   4984         method=method,
   4985         copy=copy,
   4986         level=level,
   4987         fill_value=fill_value,
   4988         limit=limit,
   4989         tolerance=tolerance,
   4990     )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:5514, in NDFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
   5508     copy = False
   5509 if all(
   5510     self._get_axis(axis_name).identical(ax)
   5511     for axis_name, ax in axes.items()
   5512     if ax is not None
   5513 ):
-> 5514     return self.copy(deep=copy)
   5516 # check if we are a multi reindex
   5517 if self._needs_reindex_multi(axes, method, level):

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/generic.py:6683, in NDFrame.copy(self, deep)
   6551 @final
   6552 def copy(self, deep: bool_t | None = True) -> Self:
   6553     """
   6554     Make a copy of this object's indices and data.
   6555
   (...)
   6681     dtype: int64
   6682     """
-> 6683     data = self._mgr.copy(deep=deep)
   6684     self._clear_item_cache()
   6685     return self._constructor_from_mgr(data, axes=data.axes).__finalize__(
   6686         self, method="copy"
   6687     )

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/internals/managers.py:576, in BaseBlockManager.copy(self, deep)
    573 else:
    574     new_axes = list(self.axes)
--> 576 res = self.apply("copy", deep=deep)
    577 res.axes = new_axes
    579 if self.ndim > 1:
    580     # Avoid needing to re-compute these

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/internals/managers.py:354, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
    352     applied = b.apply(f, **kwargs)
    353 else:
--> 354     applied = getattr(b, f)(**kwargs)
    355 result_blocks = extend_blocks(applied, result_blocks)
    357 out = type(self).from_blocks(result_blocks, self.axes)

File /glade/work/philipc/conda-envs/spatialpandas-pandas-210/lib/python3.11/site-packages/pandas/core/internals/blocks.py:649, in Block.copy(self, deep)
    647 else:
    648     refs = self.refs
--> 649 return type(self)(values, placement=self._mgr_locs, ndim=self.ndim, refs=refs)

File internals.pyx:680, in pandas._libs.internals.SharedBlock.__cinit__()

File internals.pyx:962, in pandas._libs.internals.BlockValuesRefs.add_reference()

RecursionError: maximum recursion depth exceeded while calling a Python object
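
Reading the repeated frames in order, the traceback appears to loop through GeoSeries.__init__, Series.__init__, Series.reindex, NDFrame.reindex, NDFrame.copy, Series._constructor_from_mgr and _MaybeGeoSeries.__new__, which then calls GeoSeries.__init__ again. The toy model below only mimics that call order in plain Python. `ToySeries` and `ToyGeoSeries` are hypothetical stand-ins, not pandas or spatialpandas classes, and the re-wrapping step in `ToyGeoSeries.__init__` is an assumption added to close the loop rather than something the traceback shows.

```python
class ToySeries:
    """Hypothetical stand-in for pandas.Series; not the real class."""

    def __init__(self, data):
        if isinstance(data, ToySeries):
            # Stand-in for Series.__init__ calling data.reindex(...), which
            # ends up in data.copy() when the index is unchanged.
            data = data.copy().values
        self.values = list(data)

    @property
    def _constructor(self):
        return ToySeries

    def copy(self):
        # Stand-in for NDFrame.copy -> _constructor_from_mgr: build a plain
        # object first, then pass it to the subclass constructor if any.
        plain = ToySeries(self.values)
        if type(self) is ToySeries:
            return plain
        return self._constructor(plain)


class ToyGeoSeries(ToySeries):
    """Hypothetical stand-in for spatialpandas.GeoSeries."""

    @property
    def _constructor(self):
        return ToyGeoSeries

    def __init__(self, data):
        if isinstance(data, ToySeries) and not isinstance(data, ToyGeoSeries):
            # Assumption for illustration: plain input is re-wrapped as the
            # subclass before delegating to the base constructor.
            data = ToyGeoSeries(data.values)
        # The base constructor copies `data`; the copy calls _constructor,
        # which calls this __init__ again, and so on without end.
        super().__init__(data)


def triggers_recursion():
    """Return True if building a ToyGeoSeries from a series never terminates."""
    try:
        ToyGeoSeries(ToySeries([1, 2, 3]))
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    print(triggers_recursion())
```

In this sketch the base class alone is safe, because `copy()` returns the plain object directly; the loop only appears once a subclass routes the copy back through its own constructor.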