scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

mnn_correct #1367

Closed aheravi closed 1 year ago

aheravi commented 4 years ago

Minimal code sample (that we can copy&paste without having any data)

Hi Scanpy team, the error below appears to be mnnpy-related. I checked the issues in the mnnpy repository and updated the two packages suggested there, but I still get the error.

adata_mnn = adata.copy()
adata_list = [adata_mnn[adata_mnn.obs['sample'] == i] for i in adata_mnn.obs['sample'].unique()]
adata_mnn, _, _ = sc.external.pp.mnn_correct(*adata_list, batch_key="sample")
Performing cosine normalization...
Starting MNN correct iteration. Reference batch: 0
Step 1 of 4: processing batch 1
  Looking for MNNs...
  Computing correction vectors...
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-49-f894e9f745f6> in <module>
----> 1 adata_mnn, _, _ = sc.external.pp.mnn_correct(*adata_list, batch_key="sample")

/projects/da_workspace/Users/amoussavi/Software/Anaconda_python3/lib/python3.7/site-packages/scanpy/external/pp/_mnn_correct.py in mnn_correct(var_index, var_subset, batch_key, index_unique, batch_categories, k, sigma, cos_norm_in, cos_norm_out, svd_dim, var_adj, compute_angle, mnn_order, svd_mode, do_concatenate, save_raw, n_jobs, *datas, **kwargs)
    152         save_raw=save_raw,
    153         n_jobs=n_jobs,
--> 154         **kwargs,
    155     )
    156     return datas, mnn_list, angle_list

/Anaconda_python3/lib/python3.7/site-packages/mnnpy/mnn.py in mnn_correct(var_index, var_subset, batch_key, index_unique, batch_categories, k, sigma, cos_norm_in, cos_norm_out, svd_dim, var_adj, compute_angle, mnn_order, svd_mode, do_concatenate, save_raw, n_jobs, *datas, **kwargs)
    124                                 cos_norm_out=cos_norm_out, svd_dim=svd_dim, var_adj=var_adj,
    125                                 compute_angle=compute_angle, mnn_order=mnn_order,
--> 126                                 svd_mode=svd_mode, do_concatenate=do_concatenate, **kwargs)
    127         print('Packing AnnData object...')
    128         if do_concatenate:

/Anaconda_python3/lib/python3.7/site-packages/mnnpy/mnn.py in mnn_correct(var_index, var_subset, batch_key, index_unique, batch_categories, k, sigma, cos_norm_in, cos_norm_out, svd_dim, var_adj, compute_angle, mnn_order, svd_mode, do_concatenate, save_raw, n_jobs, *datas, **kwargs)
    180         print('  Computing correction vectors...')
    181         correction_in = compute_correction(ref_batch_in, new_batch_in, mnn_ref, mnn_new,
--> 182                                            new_batch_in, sigma)
    183         if not same_set:
    184             correction_out = compute_correction(ref_batch_out, new_batch_out, mnn_ref, mnn_new,

IndexError: arrays used as indices must be of integer (or boolean) type

Versions

scanpy==1.5.1 anndata==0.7.1 umap==0.4.6 numpy==1.19.1 scipy==1.5.0 pandas==1.0.5 scikit-learn==0.23.1 statsmodels==0.11.1 python-igraph==0.8.2 louvain==0.7.0 leidenalg==0.8.1 numba==0.50.1 llvmlite==0.33.0+1.g022ab0f

flying-sheep commented 4 years ago

Hi. I can’t copy, paste, and run the above code sample, because adata is never defined.

Please add code that uses a builtin dataset and reproduces the error.

aheravi commented 4 years ago

Hi, that was from my own datasets, but I also tried the data from this notebook, https://github.com/theislab/single-cell-tutorial/blob/master/latest_notebook/Case-study_Mouse-intestinal-epithelium_1906.ipynb, and got the same error.

adata_mnn = adata.copy()
adata_list = [adata_mnn[adata_mnn.obs['sample'] == i] for i in adata_mnn.obs['sample'].unique()]
adata_list
[View of AnnData object with n_obs × n_vars = 2267 × 12818
     obs: 'sample', 'region', 'donor', 'n_counts', 'log_counts', 'n_genes', 'mt_frac', 'size_factors'
     var: 'gene_id', 'n_cells'
     uns: 'log1p'
     layers: 'counts',
 View of AnnData object with n_obs × n_vars = 1976 × 12818
     obs: 'sample', 'region', 'donor', 'n_counts', 'log_counts', 'n_genes', 'mt_frac', 'size_factors'
     var: 'gene_id', 'n_cells'
     uns: 'log1p'
     layers: 'counts',
 View of AnnData object with n_obs × n_vars = 1663 × 12818
     obs: 'sample', 'region', 'donor', 'n_counts', 'log_counts', 'n_genes', 'mt_frac', 'size_factors'
     var: 'gene_id', 'n_cells'
     uns: 'log1p'
     layers: 'counts',
 View of AnnData object with n_obs × n_vars = 2356 × 12818
     obs: 'sample', 'region', 'donor', 'n_counts', 'log_counts', 'n_genes', 'mt_frac', 'size_factors'
     var: 'gene_id', 'n_cells'
     uns: 'log1p'
     layers: 'counts',
 View of AnnData object with n_obs × n_vars = 2422 × 12818
     obs: 'sample', 'region', 'donor', 'n_counts', 'log_counts', 'n_genes', 'mt_frac', 'size_factors'
     var: 'gene_id', 'n_cells'
     uns: 'log1p'
     layers: 'counts',
 View of AnnData object with n_obs × n_vars = 1773 × 12818
     obs: 'sample', 'region', 'donor', 'n_counts', 'log_counts', 'n_genes', 'mt_frac', 'size_factors'
     var: 'gene_id', 'n_cells'
     uns: 'log1p'
     layers: 'counts']
import mnnpy
corrected = mnnpy.mnn_correct(*adata_list, batch_key="sample")
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-35-7ad830fcd907> in <module>
      1 import mnnpy
----> 2 corrected = mnnpy.mnn_correct(*adata_list, batch_key="sample")

/Anaconda_python3/lib/python3.7/site-packages/mnnpy/mnn.py in mnn_correct(var_index, var_subset, batch_key, index_unique, batch_categories, k, sigma, cos_norm_in, cos_norm_out, svd_dim, var_adj, compute_angle, mnn_order, svd_mode, do_concatenate, save_raw, n_jobs, *datas, **kwargs)
    124                                 cos_norm_out=cos_norm_out, svd_dim=svd_dim, var_adj=var_adj,
    125                                 compute_angle=compute_angle, mnn_order=mnn_order,
--> 126                                 svd_mode=svd_mode, do_concatenate=do_concatenate, **kwargs)
    127         print('Packing AnnData object...')
    128         if do_concatenate:

/Anaconda_python3/lib/python3.7/site-packages/mnnpy/mnn.py in mnn_correct(var_index, var_subset, batch_key, index_unique, batch_categories, k, sigma, cos_norm_in, cos_norm_out, svd_dim, var_adj, compute_angle, mnn_order, svd_mode, do_concatenate, save_raw, n_jobs, *datas, **kwargs)
    180         print('  Computing correction vectors...')
    181         correction_in = compute_correction(ref_batch_in, new_batch_in, mnn_ref, mnn_new,
--> 182                                            new_batch_in, sigma)
    183         if not same_set:
    184             correction_out = compute_correction(ref_batch_out, new_batch_out, mnn_ref, mnn_new,

IndexError: arrays used as indices must be of integer (or boolean) type

ivirshup commented 4 years ago

This seems related to (and a possible duplicate of) https://github.com/theislab/scanpy/issues/1167, and may be fixed here: https://github.com/chriscainx/mnnpy/pull/41 pending a merge.
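
For readers who land here from the error message: the IndexError text in the tracebacks above is the one NumPy raises whenever an array with a non-integer dtype is used as an index. The sketch below reproduces that message with plain NumPy only. It does not use mnnpy and is not a claim about what goes wrong inside `compute_correction`; the names `values` and `try_index` are made up for illustration.

```python
import numpy as np

values = np.arange(10) * 10  # toy data: [0, 10, 20, ..., 90]


def try_index(index):
    """Return the selected values as a list, or the IndexError message."""
    try:
        return values[index].tolist()
    except IndexError as err:
        return f"IndexError: {err}"


int_index = np.array([1, 3])        # integer dtype: valid index
float_index = np.array([1.0, 3.0])  # float dtype: rejected by NumPy
empty_index = np.array([])          # an empty array defaults to float64

ok = try_index(int_index)
bad_float = try_index(float_index)
bad_empty = try_index(empty_index)
# Casting to an integer dtype makes the empty index valid again.
fixed_empty = try_index(empty_index.astype(np.intp))

print(ok)
print(bad_float)
print(bad_empty)
print(fixed_empty)
```

Note that an empty array built with `np.array([])` is float64 by default, so index arrays that can end up empty are one way to hit this message even when the non-empty case works.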

eroell commented 1 year ago

As we haven't heard back since the link to the fix was posted, we will close the issue for now. Hopefully you obtained the expected behaviour in the end :)

However, please don't hesitate to reopen this issue or create a new one if you have any more questions or run into any related problems in the future.

Thanks for being a part of our community! :)