theislab / scgen

Single cell perturbation prediction
https://scgen.readthedocs.io
GNU General Public License v3.0

batch_removal() AttributeError: Can only use .str accessor with string values! #92

Closed AlinaKurjan closed 4 months ago

AlinaKurjan commented 4 months ago

I get an AttributeError when running the following:

corrected_adata = model.batch_removal()
corrected_adata
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[120], line 1
----> 1 corrected_adata = model.batch_removal()
      2 corrected_adata

File ~/conda/envs/scvi-env/lib/python3.9/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~/conda/envs/scvi-env/lib/python3.9/site-packages/scgen/_scgen.py:268, in SCGEN.batch_removal(self, adata)
    266         temp_cell[batch_ind[study]].X = batch_list[study].X
    267     shared_ct.append(temp_cell)
--> 268 all_shared_ann = AnnData.concatenate(
    269     *shared_ct, batch_key="concat_batch", index_unique=None
    270 )
    271 if "concat_batch" in all_shared_ann.obs.columns:
    272     del all_shared_ann.obs["concat_batch"]

File ~/conda/envs/scvi-env/lib/python3.9/site-packages/anndata/_core/anndata.py:1808, in AnnData.concatenate(self, join, batch_key, batch_categories, uns_merge, index_unique, fill_value, *adatas)
   1799 pat = rf"-({'|'.join(batch_categories)})$"
   1800 out.var = merge_dataframes(
   1801     [a.var for a in all_adatas],
   1802     out.var_names,
   1803     partial(merge_outer, batch_keys=batch_categories, merge=merge_same),
   1804 )
   1805 out.var = out.var.iloc[
   1806     :,
   1807     (
-> 1808         out.var.columns.str.extract(pat, expand=False)
   1809         .fillna("")
   1810         .argsort(kind="stable")
   1811     ),
   1812 ]
   1814 return out

File ~/conda/envs/scvi-env/lib/python3.9/site-packages/pandas/core/accessor.py:224, in CachedAccessor.__get__(self, obj, cls)
    221 if obj is None:
    222     # we're accessing the attribute of the class, i.e., Dataset.geo
    223     return self._accessor
--> 224 accessor_obj = self._accessor(obj)
    225 # Replace the property with the accessor object. Inspired by:
    226 # https://www.pydanny.com/cached-property.html
    227 # We need to use object.__setattr__ because we overwrite __setattr__ on
    228 # NDFrame
    229 object.__setattr__(obj, self._name, accessor_obj)

File ~/conda/envs/scvi-env/lib/python3.9/site-packages/pandas/core/strings/accessor.py:181, in StringMethods.__init__(self, data)
    178 def __init__(self, data) -> None:
    179     from pandas.core.arrays.string_ import StringDtype
--> 181     self._inferred_dtype = self._validate(data)
    182     self._is_categorical = is_categorical_dtype(data.dtype)
    183     self._is_string = isinstance(data.dtype, StringDtype)

File ~/conda/envs/scvi-env/lib/python3.9/site-packages/pandas/core/strings/accessor.py:235, in StringMethods._validate(data)
    232 inferred_dtype = lib.infer_dtype(values, skipna=True)
    234 if inferred_dtype not in allowed_types:
--> 235     raise AttributeError("Can only use .str accessor with string values!")
    236 return inferred_dtype

AttributeError: Can only use .str accessor with string values!

Package versions:

anndata   0.9.1
scanpy    1.9.3
pandas    2.0.3
scgen     2.1.1
scvi      1.0.2

Python 3.9.16 | packaged by conda-forge | (main, Feb  1 2023, 21:39:03) [GCC 11.3.0]
Linux-5.15.0-91-generic-x86_64-with-glibc2.31
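
For anyone reproducing this, a version listing like the one above can be collected programmatically. This is a minimal sketch using only the standard library; the helper name `report_versions` is hypothetical, and the distribution names are assumptions (in particular, `scvi` is assumed to be published as `scvi-tools`).

```python
import platform
import sys
from importlib import metadata


def report_versions(packages):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is absent from this environment.
            versions[name] = "not installed"
    return versions


# Distribution names are assumptions; adjust them to match your environment.
packages = ["anndata", "scanpy", "pandas", "scgen", "scvi-tools"]
versions = report_versions(packages)

for name, version in versions.items():
    print(f"{name:<12}{version}")
print("Python", sys.version.split()[0])
print(platform.platform())
```

Running this inside the same environment as the notebook avoids reporting versions from a different conda environment.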

Any ideas what could be causing this? I suspect a problem with the pandas dependency, but I am not sure whether it is something simpler.

AlinaKurjan commented 4 months ago

Downgrading to pandas 1.5.3 fixes the issue.
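
The final frame of the traceback shows pandas rejecting `.str` on `out.var.columns` because the column labels were not inferred as strings. The sketch below only illustrates that pandas behaviour in isolation. It does not reproduce the scgen/anndata code path, and it is an assumption, not something confirmed in this thread, that non-string column labels are what triggers the failure under pandas 2.0.3. The helper name `supports_str_accessor` and the example column names are hypothetical.

```python
import pandas as pd


def supports_str_accessor(index: pd.Index) -> bool:
    """Return True if `index.str` is usable, False if pandas rejects it."""
    try:
        index.str  # pandas validates the inferred dtype on access
    except AttributeError:
        return False
    return True


# Column labels that are strings: the .str accessor is available.
string_cols = pd.DataFrame({"gene_ids-0": [1], "gene_ids-1": [2]}).columns

# Column labels that are integers: accessing .str raises AttributeError.
integer_cols = pd.DataFrame({0: [1], 1: [2]}).columns

print(supports_str_accessor(string_cols))
print(supports_str_accessor(integer_cols))

# Casting the labels to strings makes the accessor usable again.
print(supports_str_accessor(integer_cols.astype(str)))
```

A check like this on `adata.var.columns` before calling `model.batch_removal()` may help narrow down whether the column labels are involved, as an alternative to pinning pandas to 1.5.3.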