saezlab / liana-py

LIANA+: an all-in-one framework for cell-cell communication
http://liana-py.readthedocs.io/
GNU General Public License v3.0

li.mt.bivariate: No common columns to perform merge on #143

Open rcannood opened 2 months ago

rcannood commented 2 months ago

I'm trying to run the data_processors/infer_truth component in the openproblems-bio/task_cell_cell_communication project. The code below no longer works, and it isn't obvious to me how to fix it :sweat_smile:

Contents of script.py:

import anndata as ad
import liana as li

## VIASH START
par = {
  "input": "dataset.h5ad",
  "output": "output.h5ad"
}
## VIASH END

# read the dataset
adata = ad.read_h5ad(par["input"])
adata.X = adata.layers["counts"]
adata.var.index = adata.var["feature_name"]

# compute spatial neighbour weights (connectivities) between observations
li.ut.spatial_neighbors(adata, bandwidth=1000, max_neighbours=10)

# get organism
organism = adata.uns['dataset_organism']
resource_name_map = {
  "homo_sapiens": "consensus",
  "mus_musculus": "mouseconsensus"
}

lr = li.mt.bivariate(adata,
                     global_name='morans',
                     local_name=None,
                     use_raw=False,
                     resource_name=resource_name_map[organism],
                     verbose=True,
                     n_perms=1000)

# ...

The error message I get is:

Using `.X`!
Converting to sparse csr matrix!
/home/rcannood/.conda/envs/py3.11/lib/python3.11/site-packages/anndata/_core/anndata.py:430: FutureWarning: The dtype argument is deprecated and will be removed in late 2024.
5159 features of mat are empty, they will be removed.
Make sure that normalized counts are passed!
Using resource `consensus`.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rcannood/.conda/envs/py3.11/lib/python3.11/site-packages/liana/method/sp/_bivariate/_spatial_bivariate.py", line 179, in __call__
    xy_stats = resource.merge(self._rename_means(xy_stats, entity=self.x_name)).merge(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rcannood/.conda/envs/py3.11/lib/python3.11/site-packages/pandas/core/frame.py", line 10832, in merge
    return merge(
           ^^^^^^
  File "/home/rcannood/.conda/envs/py3.11/lib/python3.11/site-packages/pandas/core/reshape/merge.py", line 170, in merge
    op = _MergeOperation(
         ^^^^^^^^^^^^^^^^
  File "/home/rcannood/.conda/envs/py3.11/lib/python3.11/site-packages/pandas/core/reshape/merge.py", line 786, in __init__
    self.left_on, self.right_on = self._validate_left_right_on(left_on, right_on)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rcannood/.conda/envs/py3.11/lib/python3.11/site-packages/pandas/core/reshape/merge.py", line 1572, in _validate_left_right_on
    raise MergeError(
pandas.errors.MergeError: No common columns to perform merge on. Merge options: left_on=None, right_on=None, left_index=False, right_index=False

Here is the dataset h5ad in a zip:

dataset.zip

@dbdimitrov Would you be able to help out with getting this script to work again?

dbdimitrov commented 1 month ago

Hi @rcannood,

It looks like a silly edge case. In this AnnData object, adata.var has a column called feature_name, and the index of adata.var carries that same name (feature_name), so when liana tries to join on the index, the merge throws an error.
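A minimal pandas-only sketch of how a named index can end in this exact MergeError. This is not liana's actual code: `make_stats`, `rename_means`, and `try_merge` are hypothetical stand-ins, and the assumption is that the statistics table is turned into columns with `reset_index()` and then renamed from the default "index" column.

```python
import pandas as pd

# Hypothetical ligand-receptor table, standing in for the resource.
resource = pd.DataFrame({"ligand": ["TGFB1"], "receptor": ["TGFBR1"]})


def make_stats(index_name):
    # Per-gene statistics indexed by gene symbol; the index may carry a name.
    genes = pd.Index(["TGFB1", "TGFBR1"], name=index_name)
    return pd.DataFrame({"mean": [0.5, 1.5]}, index=genes)


def rename_means(stats, entity):
    # reset_index() calls the new column "index" only when the index is unnamed;
    # a named index keeps its own name, so the rename below silently misses it.
    out = stats.reset_index()
    return out.rename(columns={"index": entity, "mean": f"{entity}_mean"})


def try_merge(index_name):
    try:
        merged = resource.merge(rename_means(make_stats(index_name), "ligand"))
        return list(merged.columns)
    except pd.errors.MergeError as err:
        return f"MergeError: {err}"


named_result = try_merge("feature_name")  # no shared column, so the merge fails
unnamed_result = try_merge(None)          # shared "ligand" column, merge succeeds
print(named_result)
print(unnamed_result)
```

With an unnamed index the two tables share the `ligand` column and merge normally; with the index named `feature_name` they share no column, which matches the message in the traceback.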

Adding the following line before running liana solves the issue: adata.var.index.name = None
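To see where the index name comes from, here is a small pandas-only sketch. The `var` table is a hypothetical stand-in for `adata.var` with placeholder gene identifiers. Assigning a column as the index, as script.py does, carries the column name over to the index, and setting the name to None clears it without touching the column or the index values.

```python
import pandas as pd

# Hypothetical stand-in for adata.var; the identifiers are placeholders.
var = pd.DataFrame(
    {"feature_name": ["TGFB1", "TGFBR1"]},
    index=["gene_id_1", "gene_id_2"],
)

# Same pattern as in script.py: the index inherits the Series name.
var.index = var["feature_name"]
name_before = var.index.name

# The workaround: drop the index name, keep the values and the column.
var.index.name = None
name_after = var.index.name

print(name_before, name_after)
```

In the original script, the equivalent placement would be right after the `adata.var.index = adata.var["feature_name"]` line and before the call to `li.mt.bivariate`.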

I have addressed this for the next update: https://github.com/saezlab/liana-py/commit/efe6aa9d87db49db6daec695ec10a81fbeacc5a4 :)

Let me know if any other issues arise.