Open dertrotl opened 1 month ago
Hi @dertrotl, thanks for reporting the issue. Here is something to try for now; let me know if it works. I will try to address this in the next update:
# remove index name from cell GeoDataFrame
del sdata['cell_boundaries'].index.name
sdata = bt.io.prep(sdata)
Hey @ckmah,
thank you very much for your reply! I tested your suggestion, which unfortunately didn't work in my case.
del sdata['cell_boundaries'].index.name
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[9], line 1
----> 1 del sdata['cell_boundaries'].index.name
AttributeError: can't delete attribute 'name'
I also tried removing the index name like this:
sdata['cell_boundaries'].index.name = None
That assignment itself went through, but the subsequent call to bt.io.prep raised the following error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File ../micromamba/envs/bento/lib/python3.10/site-packages/pandas/core/indexes/base.py:3805, in Index.get_loc(self, key)
3804 try:
-> 3805 return self._engine.get_loc(casted_key)
3806 except KeyError as err:
File index.pyx:167, in pandas._libs.index.IndexEngine.get_loc()
File index.pyx:196, in pandas._libs.index.IndexEngine.get_loc()
File pandas/_libs/hashtable_class_helper.pxi:7081, in pandas._libs.hashtable.PyObjectHashTable.get_item()
File pandas/_libs/hashtable_class_helper.pxi:7089, in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'index_right'
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
Cell In[11], line 1
----> 1 sdata = bt.io.prep(sdata)
File ../micromamba/envs/bento/lib/python3.10/site-packages/bento/io/_io.py:87, in prep(sdata, points_key, feature_key, instance_key, shape_keys)
85 if len(shape_sjoin) > 0:
86 pbar.set_description("Mapping shapes")
---> 87 sdata = _sjoin_shapes(
88 sdata=sdata, instance_key=instance_key, shape_keys=shape_sjoin
89 )
91 pbar.update()
93 # Only keep points within instance_key shape
File ../micromamba/envs/bento/lib/python3.10/site-packages/bento/io/_index.py:111, in _sjoin_shapes(sdata, instance_key, shape_keys)
107 child_shape = gpd.GeoDataFrame(geometry=child_shape.buffer(-10e-6))
109 # Map child shape index to parent shape and process the result
110 parent_shape = (
--> 111 parent_shape.sjoin(child_shape, how="left", predicate="covers")
112 .reset_index()
113 .drop_duplicates(subset="index", keep="last")
114 .set_index("index")
115 .assign(
116 index_right=lambda df: df.loc[
117 ~df["index_right"].duplicated(keep="first"), "index_right"
118 ]
119 .fillna("")
120 .astype("category")
121 )
122 .rename(columns={"index_right": shape_key})
123 )
124 parent_shape[shape_key] = parent_shape[shape_key].fillna("")
126 # Save shape index as column in instance_key shape
File ../micromamba/envs/bento/lib/python3.10/site-packages/pandas/core/frame.py:5239, in DataFrame.assign(self, **kwargs)
5236 data = self.copy(deep=None)
5238 for k, v in kwargs.items():
-> 5239 data[k] = com.apply_if_callable(v, data)
5240 return data
File ../micromamba/envs/bento/lib/python3.10/site-packages/pandas/core/common.py:384, in apply_if_callable(maybe_callable, obj, **kwargs)
373 """
374 Evaluate possibly callable input using obj and kwargs if it is callable,
375 otherwise return as it is.
(...)
381 **kwargs
382 """
383 if callable(maybe_callable):
--> 384 return maybe_callable(obj, **kwargs)
386 return maybe_callable
File ../micromamba/envs/bento/lib/python3.10/site-packages/bento/io/_index.py:117, in _sjoin_shapes.<locals>.<lambda>(df)
107 child_shape = gpd.GeoDataFrame(geometry=child_shape.buffer(-10e-6))
109 # Map child shape index to parent shape and process the result
110 parent_shape = (
111 parent_shape.sjoin(child_shape, how="left", predicate="covers")
112 .reset_index()
113 .drop_duplicates(subset="index", keep="last")
114 .set_index("index")
115 .assign(
116 index_right=lambda df: df.loc[
--> 117 ~df["index_right"].duplicated(keep="first"), "index_right"
118 ]
119 .fillna("")
120 .astype("category")
121 )
122 .rename(columns={"index_right": shape_key})
123 )
124 parent_shape[shape_key] = parent_shape[shape_key].fillna("")
126 # Save shape index as column in instance_key shape
File ../micromamba/envs/bento/lib/python3.10/site-packages/geopandas/geodataframe.py:1750, in GeoDataFrame.__getitem__(self, key)
1744 def __getitem__(self, key):
1745 """
1746 If the result is a column containing only 'geometry', return a
1747 GeoSeries. If it's a DataFrame with any columns of GeometryDtype,
1748 return a GeoDataFrame.
1749 """
-> 1750 result = super().__getitem__(key)
1751 # Custom logic to avoid waiting for pandas GH51895
1752 # result is not geometry dtype for multi-indexes
1753 if (
1754 pd.api.types.is_scalar(key)
1755 and key == ""
(...)
1758 and not is_geometry_type(result)
1759 ):
File ../micromamba/envs/bento/lib/python3.10/site-packages/pandas/core/frame.py:4102, in DataFrame.__getitem__(self, key)
4100 if self.columns.nlevels > 1:
4101 return self._getitem_multilevel(key)
-> 4102 indexer = self.columns.get_loc(key)
4103 if is_integer(indexer):
4104 indexer = [indexer]
File ../micromamba/envs/bento/lib/python3.10/site-packages/pandas/core/indexes/base.py:3812, in Index.get_loc(self, key)
3807 if isinstance(casted_key, slice) or (
3808 isinstance(casted_key, abc.Iterable)
3809 and any(isinstance(x, slice) for x in casted_key)
3810 ):
3811 raise InvalidIndexError(key)
-> 3812 raise KeyError(key) from err
3813 except TypeError:
3814 # If we have a listlike key, _check_indexing_error will raise
3815 # InvalidIndexError. Otherwise we fall through and re-raise
3816 # the TypeError.
3817 self._check_indexing_error(key)
KeyError: 'index_right'
Hi @dertrotl and @ckmah,
I've run into the same issue, and the fix for me was to ensure that the index name of all data frames in shapes is None, as otherwise the columns added by sjoin are the index names instead of 'index' and 'index_right'.
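The renaming described here can be reproduced with plain pandas, since the failing chain in the traceback calls reset_index() and then looks up a column literally called "index". This is only a minimal sketch: the frame contents and the index name "cell_id" are made up for illustration and are not taken from the reporter's data.

```python
import pandas as pd

# Two frames with identical content; only the index name differs.
unnamed = pd.DataFrame({"area": [1.0, 2.0]}, index=["a", "b"])
named = unnamed.copy()
named.index.name = "cell_id"  # hypothetical index name, for illustration only

# reset_index() names the new column after the index name when one is set,
# and falls back to "index" only when the index is unnamed.
unnamed_cols = list(unnamed.reset_index().columns)
named_cols = list(named.reset_index().columns)

print(unnamed_cols)  # ['index', 'area']
print(named_cols)    # ['cell_id', 'area']
```

Code that hard-codes "index" (or, after sjoin, "index_right") as a column name therefore only works when the indices involved are unnamed.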
In your particular case, @dertrotl, could it be that either 'nucleus_boundaries' or 'cell_circles' have named indices too?
Happy to draft up a pull request that does index name checks within _sjoin_shapes if that helps!
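As a stopgap until such a check exists, the workaround could be applied to every shapes element in one pass. The helper below is a rough sketch, not part of bento-tools: clear_index_names is a hypothetical name, and the plain pandas frames merely stand in for the GeoDataFrames stored in the SpatialData object.

```python
import pandas as pd


def clear_index_names(shapes):
    """Set index.name to None on every frame in a dict-like; return what was cleared."""
    cleared = {}
    for key, frame in shapes.items():
        if frame.index.name is not None:
            cleared[key] = frame.index.name
            frame.index.name = None  # in-place, same assignment as tried earlier in the thread
    return cleared


# Plain pandas frames stand in for the GeoDataFrames in sdata.shapes (assumption).
shapes = {
    "cell_boundaries": pd.DataFrame({"area": [1.0]}, index=pd.Index(["c1"], name="cell_id")),
    "nucleus_boundaries": pd.DataFrame({"area": [0.5]}, index=pd.Index(["n1"], name="nucleus_id")),
    "cell_circles": pd.DataFrame({"area": [0.8]}, index=["c1"]),
}
cleared = clear_index_names(shapes)
print(cleared)  # {'cell_boundaries': 'cell_id', 'nucleus_boundaries': 'nucleus_id'}
```

With real data this would presumably be called as clear_index_names(sdata.shapes) before sdata = bt.io.prep(sdata), assuming sdata.shapes behaves like a dict of GeoDataFrames. The returned mapping makes it easy to see which elements had a named index in the first place.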
Hi @nklkhlr,
thank you for your reply. I can confirm that your solution fixed the issue. Thanks a lot!
Hey,
first of all, thank you very much for your great package! However, when trying it out, I got an error message when executing the bt.io.prep function (see screenshot). I hope you can help me with my issue!
Some session info:
GeoPandas 1.0.1, SpatialData 0.2.3, Python 3.10, bento-tools 2.1.3