carmonalab / UCell

Gene set scoring for single-cell data
GNU General Public License v3.0

For Seurat, must the data be scaled and/or normalized before using AddModuleScore_UCell? #13

Closed ksaunders73 closed 2 years ago

ksaunders73 commented 2 years ago

Hello!

In Seurat, you can both scale and normalize the data with ScaleData() and NormalizeData() respectively.

Issue 3892 on Seurat's GitHub implies that ScaleData() is mainly meant for PCA. This makes me wonder: should the Seurat object be normalized, scaled, or both before running AddModuleScore_UCell()?

Thank you for reading!

mass-a commented 2 years ago

Hello Kaytlin,

Short answer: no, you do not need to normalize or scale the data to apply UCell. UCell scores are calculated separately for each cell and are based only on the relative ranks of the signature genes within that cell. If you apply a normalization that preserves the ranking of the genes (e.g. a log-normalization), you can verify that you obtain exactly the same UCell scores before and after normalization.
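To make the rank-invariance argument concrete, here is a minimal sketch in plain Python. It is not UCell's R implementation. The `rank_score` function, the toy gene counts, and the scoring formula (a normalized Mann-Whitney U statistic with ranks capped at `max_rank + 1`) are assumptions chosen for illustration, and the log-normalization only imitates what a library-size log-normalization does.

```python
import math

def average_ranks_desc(values):
    """Rank values from highest (rank 1) to lowest, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over all entries tied with the current value.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

def rank_score(expression, signature, max_rank):
    """Hypothetical rank-based signature score for a single cell.

    Assumed formula (illustration only): 1 - U / (n * max_rank), where
    U = sum(capped ranks) - n * (n + 1) / 2 and n = len(signature).
    """
    genes = list(expression)
    ranks = average_ranks_desc([expression[g] for g in genes])
    rank_of = dict(zip(genes, ranks))
    n = len(signature)
    capped = [min(rank_of[g], max_rank + 1) for g in signature]
    u = sum(capped) - n * (n + 1) / 2
    return 1 - u / (n * max_rank)

# Toy raw counts for one cell (made-up values).
raw = {"CD3E": 12, "CD8A": 7, "GZMB": 0, "ACTB": 40, "MS4A1": 0, "LYZ": 3}
signature = ["CD3E", "CD8A", "GZMB"]

# Log-normalization in the style of a library-size normalization:
# divide by the cell total, multiply by a scale factor, take log1p.
total = sum(raw.values())
log_norm = {g: math.log1p(c / total * 10000) for g, c in raw.items()}

score_raw = rank_score(raw, signature, max_rank=5)
score_log = rank_score(log_norm, signature, max_rank=5)

# The transformation is monotonic, so ranks and scores are unchanged.
print(score_raw == score_log)
```

Because the log-normalization is strictly increasing and is applied within a single cell, the order of the genes (including ties among zeros) does not change, so any score computed purely from within-cell ranks stays the same.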

As for scaling across genes, I agree that it should not be used outside of PCA pre-processing.
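By contrast, scaling each gene across cells is not guaranteed to preserve the ranking of genes within a cell. The sketch below uses made-up values for two genes in two cells and standardizes each gene with a plain z-score, as a stand-in for what a per-gene centering and scaling step does. The names `raw`, `scaled`, and `order_in_cell` are hypothetical.

```python
from statistics import mean, stdev

# Toy expression values: gene -> values across two cells (made-up numbers).
raw = {"A": [10.0, 12.0], "B": [1.0, 0.0]}

# Z-score each gene across cells (center to mean 0, scale to unit sd).
scaled = {
    gene: [(v - mean(vals)) / stdev(vals) for v in vals]
    for gene, vals in raw.items()
}

def order_in_cell(matrix, cell):
    """Return gene names sorted from highest to lowest value in one cell."""
    return sorted(matrix, key=lambda g: -matrix[g][cell])

order_raw = order_in_cell(raw, 0)        # gene A is far above gene B in cell 0
order_scaled = order_in_cell(scaled, 0)  # after scaling, B sits above A in cell 0

print(order_raw, order_scaled)
```

Gene A is expressed well above gene B in the first cell, but after per-gene standardization A falls below its own mean while B rises above its own, so their within-cell order flips. A rank-based score computed on such scaled values could therefore differ from one computed on counts or log-normalized data.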

Best, -m

ksaunders73 commented 2 years ago

Thank you very much @mass-a! This is very helpful!