Closed outlace closed 5 years ago
This certainly looks strange. Would you mind making a minimal complete example? If you need, here's a useful guide to making one.
Sorry, there is a small bug in the wilcoxon method, that might hit sometimes. @a-munoz-rojas, it should be resolved after merging your fix, don't you think so? I'd be happy to move forward as soon as the code-overhead issue around double logarithmization is fixed. Should be very simple. Thank you!
Hi @falexwolf - I think it be resolved with the bug fix, but I'm not entirely sure. I just pushed the changes addressing the code-overhead issue, we can give it a try to see if it fixes it. Otherwise we'd have to take a closer look.
@outlace, any chance you've tried this since #519 got merged?
@ivirshup I haven't tried it yet, sorry. But you can close the issue. if I still have trouble when I get back to the analysis I can re-open it
Great, thank you!
Oh no worries! I was just wondering if this had been fixed (not sure how to reproduce) since I think #566 is a duplicate issue.
Hi guys,
Sorry to re-open the thread but I am also getting the same error as described by the OP above with the latest release of Scanpy (v1.4.3).
As with #566 I have no trouble running the rank_genes_groups using t-test or logreg, the problem only arises when using method='wilcoxon'. For the example provided below adata
is using one of our real data sets and I am using custom clusters but I have the same problem when replicating the tutorial workflow. My colleague has had the same experience when trying to work through the tutorial on his system, again using the most recent release of Scanpy.
I've made sure 'log_transformed' was being applied to include the #519 fix provided by a-munoz-rojas in the hope that this might help but no such luck, I get the same error either way. Any ideas?
Here is what I ran:
sc.tl.rank_genes_groups(adata, groupby='Custom_clusters_Leiden', groups=['NKT1','NKT2','NKT17','IL2+ aNKT1','aNKT2 & aNKT17','TNF- aNKT1'], reference='rest', method='wilcoxon', corr_method='benjamini-hochberg', log_transformed=True)
sc.pl.rank_genes_groups(adata, n_genes=25, sharey=False)
And here is the resulting error:
ValueError Traceback (most recent call last)
<ipython-input-117-a5ba74ea872c> in <module>
----> 1 sc.tl.rank_genes_groups(adata, groupby='Custom_clusters_Leiden', groups=['NKT1','NKT2','NKT17','IL2+ aNKT1','aNKT2 & aNKT17','TNF- aNKT1'], reference='rest', method='wilcoxon', corr_method='benjamini-hochberg', log_transformed=True)
2 sc.pl.rank_genes_groups(adata, n_genes=25, sharey=False)
~\Anaconda3\lib\site-packages\scanpy\tools\_rank_genes_groups.py in rank_genes_groups(adata, groupby, use_raw, groups, reference, n_genes, rankby_abs, key_added, copy, method, corr_method, log_transformed, **kwds)
367
368 scores[imask, :] = (scores[imask, :] - (ns[imask] * (n_cells + 1) / 2)) / sqrt(
--> 369 (ns[imask] * (n_cells - ns[imask]) * (n_cells + 1) / 12))
370 scores[np.isnan(scores)] = 0
371 pvals = 2 * stats.distributions.norm.sf(np.abs(scores[imask,:]))
ValueError: math domain error
P.S I just want to say thank you for all the work on Scanpy, loving it so far!
I get the error below when trying to run the following:
>>> sc.tl.rank_genes_groups(adata, 'louvain', groups=['5','16','19','30'], reference='0', method='wilcoxon')
Here
adata
is real data from our lab, not the tutorial data. Have been trying to replicate the cluster analysis tutorial. All previous steps work fine. Interestingly, if I remove group '5' from the list of groups it works. Also, this error only happens with thewilcoxon
method, not witht-test
.