scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Math domain error when using the Wilcoxon rank-sum #566

Open davidepisu opened 5 years ago

davidepisu commented 5 years ago

Another error that I get, and have no idea how to solve, occurs when using the Wilcoxon rank-sum test for differential gene expression:

sc.tl.rank_genes_groups(adata, 'louvain', method='wilcoxon')
sc.pl.rank_genes_groups(adata, n_genes=25, sharey=False)

ranking genes

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-385-c2fa7bb8ea8d> in <module>
----> 1 sc.tl.rank_genes_groups(adata, 'louvain', method='wilcoxon')
      2 sc.pl.rank_genes_groups(adata, n_genes=25, sharey=False)

~\AppData\Local\conda\conda\envs\Scanpy\lib\site-packages\scanpy\tools\_rank_genes_groups.py in rank_genes_groups(adata, groupby, use_raw, groups, reference, n_genes, rankby_abs, key_added, copy, method, corr_method, **kwds)
    352 
    353                 scores[imask, :] = (scores[imask, :] - (ns[imask] * (n_cells + 1) / 2)) / sqrt(
--> 354                     (ns[imask] * (n_cells - ns[imask]) * (n_cells + 1) / 12))
    355                 scores[np.isnan(scores)] = 0
    356                 pvals = 2 * stats.distributions.norm.sf(np.abs(scores[imask,:]))

ValueError: math domain error

The logistic regression and t-test methods work fine, so I guess it is related to my data.

flying-sheep commented 5 years ago
>>> from math import sqrt
>>> sqrt(-1)
ValueError: math domain error

I assume it’s the square root throwing this. Assuming it only raises when passed a negative argument, the term inside can only become negative if ns[imask] < 0 or ns[imask] > n_cells.
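The condition above holds for exact arithmetic. As a rough sketch, the snippet below evaluates the same term with different NumPy dtypes to check one further hypothesis that nobody in this thread has confirmed: that the product wraps around in 32-bit integer arithmetic. Both tracebacks show Windows paths, and NumPy's default integer on Windows was 32-bit before NumPy 2.0, but whether ns and n_cells were actually 32-bit inside scanpy is an assumption here. The helper variance_term and the cell counts are made up for illustration and are not part of scanpy.

```python
import math
import numpy as np


def variance_term(n_group, n_cells, dtype):
    """Evaluate n_group * (n_cells - n_group) * (n_cells + 1) / 12 in `dtype` arithmetic.

    Hypothetical helper for illustration only; it is not part of scanpy.
    """
    ns = np.array([n_group], dtype=dtype)
    n = np.array([n_cells], dtype=dtype)
    # NumPy integer arrays wrap around silently when a result exceeds the dtype range.
    product = ns * (n - ns) * (n + 1)
    return float(product[0]) / 12


# Made-up sizes: one group of 2,000 cells out of 50,000 cells in total.
term_int64 = variance_term(2000, 50000, np.int64)    # wide enough, stays positive
term_int32 = variance_term(2000, 50000, np.int32)    # product exceeds the int32 range and wraps
term_float = variance_term(2000, 50000, np.float64)  # float arithmetic avoids the wrap-around

print(term_int64, term_int32, term_float)

# math.sqrt rejects negative input with the same message as in the traceback.
try:
    math.sqrt(term_int32)
    sqrt_failed = False
except ValueError as err:
    sqrt_failed = True
    print("sqrt failed:", err)
```

If this is what happens, the group sizes themselves are valid (0 <= ns <= n_cells) and only the intermediate product is wrong, which would explain why the t-test and logistic regression paths are unaffected. Printing the dtype of the group-size array on an affected machine would be one way to confirm or rule this out.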

amitbin1 commented 4 years ago

Hey, I don't understand how to get around this issue with the Wilcoxon test. I'm following the scanpy tutorial and getting this 'ValueError: math domain error'.

bioguy2018 commented 4 years ago

Hi, I am also still receiving this error!


ValueError                                Traceback (most recent call last)
in
----> 1 sc.tl.rank_genes_groups(adata, 'louvain_05',n_genes=100,method="wilcoxon",use_raw=False)

e:\programs\python\python38\lib\site-packages\scanpy\tools\_rank_genes_groups.py in rank_genes_groups(adata, groupby, use_raw, groups, reference, n_genes, rankby_abs, key_added, copy, method, corr_method, layer, **kwds)
    398                 mean_rest, var_rest = _get_mean_var(X[mask_rest])
    399 
--> 400                 scores[imask, :] = (scores[imask, :] - (ns[imask] * (n_cells + 1) / 2)) / sqrt(
    401                     (ns[imask] * (n_cells - ns[imask]) * (n_cells + 1) / 12))
    402                 scores[np.isnan(scores)] = 0

ValueError: math domain error

How can I deal with it? I thought this was a bug that had already been fixed!