SingleR-inc / SingleR

Clone of the Bioconductor repository for the SingleR package.
https://bioconductor.org/packages/devel/bioc/html/SingleR.html
GNU General Public License v3.0

raw data or logNormalized data? #228

Closed nickhir closed 1 year ago

nickhir commented 1 year ago

I am trying to run SingleR after my Seurat integration. I have looked at different posts and found that I should use the RNA assay (and not the integrated assay).

However, depending on which post I check, some people say we should use the "raw" counts (e.g. #98 or #185), while others say we should use the "log-normalized" counts (i.e. the data slot in a Seurat object). The SingleR vignette, for example, says:

The above example uses SummarizedExperiment objects, but the same functions will accept any (log-)normalized expression matrix.

The output of

# example from vignette
pred.grun <- SingleR(test=logcounts(sceG), ref=logcounts(sceM), labels=sceM$label, de.method="wilcox")
table(pred.grun$labels)

and

pred.grun <- SingleR(test=counts(sceG), ref=counts(sceM), labels=sceM$label, de.method="wilcox")
table(pred.grun$labels)

is actually the same, which makes me think that the results might not depend much on the input, but I still want to make sure that I am using the right data for my analysis.

Any help is much appreciated!

dtm2451 commented 1 year ago

Yup, as you've noted, raw counts and normalized counts produce the same result. Sorry for the confusion, but that'll be why you're seeing either in examples.

It's because both look the same to the rank-based correlation metrics used in SingleR. Log normalization changes the spacing between values, but the most highly expressed gene of a cell will still have the highest log-normalized expression value in that cell. The within-cell ranks don't change with log normalization.
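To make the rank argument concrete, here is a minimal base-R sketch. It is not SingleR's internal code: the gene names, the count values, the `ref` profile, and the scaling to a library size of 100 are all made up for illustration. It only shows that a within-cell monotonic transform (size-factor scaling followed by `log1p`) leaves the ranks, and therefore a Spearman correlation against a reference profile, unchanged.

```r
# Toy raw counts for a single cell across six genes (made-up numbers).
counts <- c(GeneA = 0, GeneB = 3, GeneC = 12, GeneD = 1, GeneE = 250, GeneF = 7)

# Hypothetical reference expression profile for the same genes.
ref <- c(GeneA = 1, GeneB = 5, GeneC = 9, GeneD = 0, GeneE = 120, GeneF = 20)

# Log normalization for this cell: divide by a size factor, then log1p.
# Both steps are monotonic within the cell, so gene order is preserved.
size_factor <- sum(counts) / 100
lognorm <- log1p(counts / size_factor)

# The within-cell ranks are identical before and after the transform.
same_ranks <- identical(rank(counts), rank(lognorm))

# Spearman correlation depends only on ranks, so it is the same too.
rho_raw <- cor(counts, ref, method = "spearman")
rho_log <- cor(lognorm, ref, method = "spearman")

print(same_ranks)
print(all.equal(rho_raw, rho_log))
```

This sketch covers only the correlation step for one test cell; it says nothing about how marker genes are chosen from the reference.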

So, explicitly confirming: you can use either raw counts or log-normalized counts. Either one is recommended.

nickhir commented 1 year ago

Thanks a lot! That clears up my confusion!