scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Specify var subsets for stats testing #1744

Open kanefos opened 3 years ago

kanefos commented 3 years ago

Hi authors,

First off, love scanpy. Big fan.

I was wondering whether you have considered adding an option to scanpy.tl.rank_genes_groups for specifying which variables are included in the statistical test, so that users can restrict testing to a subset of variables.

For context, I'm trying to test between groups of cells while ignoring ribosomal and mitochondrial genes, but I want to keep those genes in .var and .X for downstream analysis and visualisation. Creating a temporary object with these variables removed just for the test partially works, but it is complicated by having to apply the same boolean slice to the .raw object as well.

Thanks, K
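For readers trying the temporary-object workaround described above, here is a minimal sketch. The helper names `vars_to_test` and `rank_genes_on_subset` are hypothetical and not part of scanpy. It assumes `adata.var` already holds boolean flag columns, and that testing on `.X` rather than `.raw` is acceptable, which is what passing `use_raw=False` does and why `.raw` needs no matching slice here.

```python
import numpy as np
import pandas as pd


def vars_to_test(var: pd.DataFrame, exclude: list[str]) -> np.ndarray:
    """Boolean mask over variables: True where none of the `exclude` flags is set."""
    excluded = var[exclude].astype(bool).any(axis=1)
    return ~excluded.to_numpy()


def rank_genes_on_subset(adata, groupby: str, keep: np.ndarray, **kwargs):
    """Hypothetical helper: test on a temporary copy that holds only kept variables."""
    import scanpy as sc  # imported lazily so the mask helper works without scanpy

    # Temporary object with the unwanted variables removed.
    tmp = adata[:, keep].copy()
    # use_raw=False makes the test read tmp.X instead of tmp.raw.
    sc.tl.rank_genes_groups(tmp, groupby, use_raw=False, **kwargs)
    # Copy the results back; the full object keeps every gene in .var and .X.
    key = kwargs.get("key_added") or "rank_genes_groups"
    adata.uns[key] = tmp.uns[key]
    return adata
```

Usage would look like `rank_genes_on_subset(adata, "leiden", vars_to_test(adata.var, ["mito", "ribo"]))`, where `"leiden"` stands in for whatever grouping column the object has. If the test has to run on the values stored in `.raw`, this sketch does not cover that case.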

ivirshup commented 3 years ago

I've been thinking it would be good to add a mask argument to a number of functions. I think mask_vars=~(adata.var["mito"] | adata.var["ribo"]) could work here.
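To make the mask expression concrete, here is a small pandas-only sketch on a toy table standing in for `adata.var`. The thread does not say how the `"mito"` and `"ribo"` columns are built; the prefix rules below are an assumption that fits human gene symbols. `mask_vars` is the name proposed in the comment above, so check the documentation of your installed scanpy version for whether such a parameter exists and what it is called.

```python
import pandas as pd

# Toy stand-in for adata.var, indexed by gene symbol.
genes = pd.Index(["MT-CO1", "RPS6", "RPL13", "CD3E", "MS4A1"], name="gene")
var = pd.DataFrame(index=genes)

# Assumed flagging rules (human symbols): "MT-" prefix for mitochondrial genes,
# "RPS"/"RPL" prefix for ribosomal genes. The ribosomal pattern is crude: it
# also matches kinases such as RPS6KA1, so review the flagged list on real data.
var["mito"] = var.index.str.startswith("MT-")
var["ribo"] = var.index.str.match(r"RP[SL]")

# The expression from the comment: True for every variable that should be tested.
mask = ~(var["mito"] | var["ribo"])

tested_genes = var.index[mask.to_numpy()].tolist()
print(tested_genes)
```

The mask is True for the variables to keep in the test, so flagged genes stay in the table and are only skipped when testing.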