SingleR-inc / SingleR

Clone of the Bioconductor repository for the SingleR package.
https://bioconductor.org/packages/devel/bioc/html/SingleR.html
GNU General Public License v3.0

no common genes between 'test' and 'ref' for multiple references #217

Closed by matt-sd-watson 2 years ago

matt-sd-watson commented 2 years ago

Hello,

I am attempting to run SingleR on a mixed-species sample containing both mouse and human cells. The two references I am using for this annotation are:

human_cell_ref_with_mouse <- HumanPrimaryCellAtlasData()
immgen_ref <- celldex::ImmGenData()

When I run each separately, the annotation is successful. However, when I combine the references as described here:

references_to_use <- list(atlas = human_cell_ref_with_mouse, immgen = immgen_ref)

labels <- list(human_cell_ref_with_mouse$label.main, immgen_ref$label.main)

SingleR(test = sce_610_7, ref = references_to_use, assay.type.test=1, labels = labels, de.method = "wilcox")

I get the following error:

no common genes between 'test' and 'ref'

I am struggling to understand why the annotations succeed when each reference is used separately but fail in this combined configuration. I can confirm that each reference shares enough gene names with the test data for annotation:

table(rownames(immgen_ref) %in% rownames(integrated_610-7@assays$RNA@counts))

FALSE  TRUE
 1772 20362

table(rownames(human_cell_ref_with_mouse) %in% rownames(integrated_610-7@assays$RNA@counts))

FALSE  TRUE
 1313 18050

LTLA commented 2 years ago

There aren't any genes in the intersection of all references and your test data. I would not expect to be able to compare between mouse and human references, which is what the integrated calling attempts to do - hence the error.
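The set logic behind this can be illustrated with a small base R sketch. The gene symbols below are toy values chosen for illustration, and the code is not SingleR's internal implementation; it only shows how each reference can overlap the test data on its own while the intersection across all three is empty.

```r
# Toy gene symbols (illustrative assumptions, not taken from the real data).
test_genes  <- c("CD3E", "CD19", "Cd3e", "Cd19")  # mixed-species test data
human_genes <- c("CD3E", "CD19", "MS4A1")         # human-style symbols
mouse_genes <- c("Cd3e", "Cd19", "Ms4a1")         # mouse-style symbols

# Each reference shares genes with the test data when checked on its own.
n_human <- length(intersect(test_genes, human_genes))
n_mouse <- length(intersect(test_genes, mouse_genes))

# The intersection across the test data AND every reference is empty,
# because the human and mouse symbols never match each other.
common <- Reduce(intersect, list(test_genes, human_genes, mouse_genes))

print(c(human = n_human, mouse = n_mouse, all = length(common)))
```

Running the same `Reduce(intersect, ...)` check on the real `rownames()` of the test object and both references would show whether any genes survive before calling `SingleR()` with a list of references.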

matt-sd-watson commented 2 years ago

Thank you for this information. Given this lack of intersection among the references and test data, what is the recommended approach for annotating cell types in a mixed-species sample, i.e., a sample with both human- and mouse-aligned reads?

LTLA commented 2 years ago

In such cases, I would have expected that the first order of business would be to figure out whether each cell comes from human or mouse. Once you have that knowledge, it should be straightforward to apply the appropriate reference. If you don't know whether something is human or mouse, cell type annotation seems like the least of your concerns.
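One way to approach the species assignment is sketched below in base R. It is only an illustration under stated assumptions: the `hs_` and `mm_` gene-name prefixes, the toy count matrix, and the 0.9 / 0.1 cutoffs are all hypothetical and would need to be replaced with whatever distinguishes human from mouse genes in the actual alignment.

```r
# Toy count matrix: rows are genes, columns are cells.
# The "hs_" / "mm_" prefixes are hypothetical markers of species of origin.
counts <- matrix(
  c(10, 0, 5,
     8, 1, 4,
     0, 9, 5,
     1, 7, 6),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("hs_CD3E", "hs_CD19", "mm_Cd3e", "mm_Cd19"),
    c("cell1", "cell2", "cell3")
  )
)

# Fraction of each cell's counts that come from human genes.
is_human_gene <- startsWith(rownames(counts), "hs_")
human_frac <- colSums(counts[is_human_gene, , drop = FALSE]) / colSums(counts)

# Assign a species per cell; the cutoffs are arbitrary illustrative values.
# Cells in between are flagged as ambiguous (e.g. possible doublets).
species <- ifelse(human_frac >= 0.9, "human",
           ifelse(human_frac <= 0.1, "mouse", "ambiguous"))
print(species)

# Each subset could then be annotated with its own reference, for example
# (not run here; assumes the test row names match each reference's symbols):
# pred_human <- SingleR(test = sce_610_7[, species == "human"],
#                       ref = human_cell_ref_with_mouse,
#                       labels = human_cell_ref_with_mouse$label.main)
# pred_mouse <- SingleR(test = sce_610_7[, species == "mouse"],
#                       ref = immgen_ref,
#                       labels = immgen_ref$label.main)
```

Splitting the cells first means each `SingleR()` call only ever compares cells against a reference from the same species, so the empty cross-species gene intersection never arises.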