saezlab / decoupler-py

Python package to perform enrichment analysis from omics data.
GNU General Public License v3.0

dc.get_pseudobulk needs to use '.todense()' instead of '.A' #146

Closed jhaberbe closed 2 months ago

jhaberbe commented 2 months ago

Describe the bug

When computing pseudobulk using the following code:

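counts = dc.get_pseudobulk(
    subset,
    sample_col='specimen',
    groups_col='cell_type',
    layer='counts',
    mode='sum',
    min_cells=0,
    min_counts=0
)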

I get:

AttributeError                            Traceback (most recent call last)
Cell In[34], line 2
      1 import decoupler as dc
----> 2 counts = dc.get_pseudobulk(
      3     subset,
      4     sample_col='specimen',
      5     groups_col='cell_type',
      6     layer='counts',
      7     mode='sum',
      8     min_cells=0,
      9     min_counts=0
     10 )

File /oak/stanford/projects/kibr/Reorganizing/Projects/4Jul2024_PBMC_ONT/.venv/lib/python3.12/site-packages/decoupler/, in get_pseudobulk(adata, sample_col, groups_col, obs, layer, use_raw, mode, min_cells, min_counts, dtype, skip_checks, min_prop, min_smpls, remove_empty)
    377     layers['psbulk_props'] = props
    378 elif type(mode) is str or callable(mode):
    379     # Compute psbulk
--> 380     psbulk, ncells, counts, props = compute_psbulk(n_rows, n_cols, X, sample_col, groups_col, smples, groups, obs,
    381                                                    new_obs, min_cells, min_counts, mode, dtype)
    382     layers = {'psbulk_props': props}
    384 # Add QC metrics

File /oak/stanford/projects/kibr/Reorganizing/Projects/4Jul2024_PBMC_ONT/.venv/lib/python3.12/site-packages/decoupler/, in compute_psbulk(n_rows, n_cols, X, sample_col, groups_col, smples, groups, obs, new_obs, min_cells, min_counts, mode, dtype)
    262 profile = X[(obs[sample_col] == smp) & (obs[groups_col] == grp)]
    263 if isinstance(X, csr_matrix):
--> 264     profile = profile.A
    266 # Skip if few cells or not enough counts
    267 ncell = profile.shape[0]

AttributeError: 'SparseCSRView' object has no attribute 'A'

Expected behavior

I expect the pseudobulk to be computed without errors.

System

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL=""
BUG_REPORT_URL=""


Additional context

The problem seems to be that newer versions of anndata no longer accept the .A accessor and expect the .todense() method instead. When I edited the module code and replaced '.A' with '.todense()', the problem went away.
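
For anyone patching this locally, here is a minimal sketch of the kind of change described above. The helper name to_dense is hypothetical and not part of decoupler. It relies only on the fact that SciPy sparse containers expose .toarray(), which returns a plain numpy.ndarray as the .A shorthand did, whereas .todense() returns a numpy.matrix. Whether the actual patch uses .toarray() or .todense() is not stated in this thread.

```python
def to_dense(profile):
    """Return a dense version of `profile`.

    Hypothetical helper, not part of decoupler. Anything that provides a
    `.toarray()` method (as SciPy sparse containers do) is converted with it;
    any other input is assumed to be dense already and is returned unchanged.
    """
    if hasattr(profile, "toarray"):
        # Same result type as the old `.A` shorthand: a plain ndarray.
        return profile.toarray()
    return profile
```

With this in place, the failing line `profile = profile.A` in compute_psbulk could be written as `profile = to_dense(profile)`, or simply `profile = profile.toarray()` inside the existing sparse check.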

PauBadiaM commented 2 months ago

Hi @jhaberbe,

Indeed, the latest scipy update has deprecated the use of .A. I made a quick patch to fix it, which can be installed by running:

pip install git+

Hope this is helpful!

jhaberbe commented 2 months ago

Oh shoot my B, I'm three weeks late. Thanks!