theislab / diffxpy

Differential expression analysis for single-cell RNA-seq data.
https://diffxpy.rtfd.io
BSD 3-Clause "New" or "Revised" License

Check for single group passed for tests #146

Open dburkhardt opened 4 years ago

dburkhardt commented 4 years ago

Current behavior:

import scanpy as sc
import numpy as np
import diffxpy.api as de

adata = sc.AnnData(np.random.normal(size=(100,10)))
de.test.rank_test(adata, grouping=np.tile('a', 100))

raises:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-2-0d8486632427> in <module>
      4
      5 adata = sc.AnnData(np.random.normal(size=(100,10)))
----> 6 de.test.rank_test(adata, grouping=np.tile('a', 100))

~/.local/lib/python3.8/site-packages/diffxpy/testing/tests.py in rank_test(data, grouping, gene_names, sample_description, is_logged, is_sig_zerovar)
    894     grouping = parse_grouping(data, sample_description, grouping)
    895
--> 896     de_test = DifferentialExpressionTestRank(
    897         data=data,
    898         sample_description=sample_description,

~/.local/lib/python3.8/site-packages/diffxpy/testing/det.py in __init__(self, data, sample_description, grouping, gene_names, is_logged, is_sig_zerovar)
   1689         self._gene_names = np.asarray(gene_names)
   1690
-> 1691         x0, x1 = split_x(data, grouping)
   1692
   1693         mean_x0 = np.asarray(np.mean(x0, axis=0)).flatten().astype(dtype=np.float)

~/.local/lib/python3.8/site-packages/diffxpy/testing/utils.py in split_x(data, grouping)
    114     groups = np.unique(grouping)
    115     x0 = data[np.where(grouping == groups[0])[0]]
--> 116     x1 = data[np.where(grouping == groups[1])[0]]
    117     return x0, x1
    118

IndexError: index 1 is out of bounds for axis 0 with size 1

Desired behavior:

if np.unique(grouping).shape[0] < 2:
    raise ValueError('`grouping` must have more than one unique value.')
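
One place this check could live is `split_x` in `diffxpy/testing/utils.py`, since that is where the `IndexError` is raised. A minimal sketch, assuming the function consists only of the lines visible in the traceback above (the real function may contain more, and the `np.asarray` conversion is my addition); this has not been tested against the diffxpy codebase:

```python
import numpy as np


def split_x(data, grouping):
    """Split `data` row-wise into the two groups named in `grouping`.

    Reconstructed from the lines of `split_x` shown in the traceback;
    anything beyond those lines is an assumption.
    """
    # Assumption: accept any array-like grouping (not in the original lines).
    grouping = np.asarray(grouping)
    groups = np.unique(grouping)
    # Proposed guard: fail early with a clear message instead of an IndexError.
    if groups.shape[0] < 2:
        raise ValueError('`grouping` must have more than one unique value.')
    x0 = data[np.where(grouping == groups[0])[0]]
    x1 = data[np.where(grouping == groups[1])[0]]
    return x0, x1
```

With this guard, the single-group call from the example above would raise the `ValueError` rather than failing on `groups[1]`. The same check could instead go in `rank_test` right after `parse_grouping`, if the maintainers prefer to validate input closer to the public API.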

dburkhardt commented 4 years ago

I came across this while doing DE testing across a bunch of clustering arrangements within clusters. I likely won't be the only one to run into this issue!

I'll send a PR tomorrow if I get some time.