BayraktarLab / cell2location

Comprehensive mapping of tissue cell architecture via integrated single cell and spatial transcriptomics (cell2location model)
https://cell2location.readthedocs.io/en/latest/
Apache License 2.0

Reference data: normalized or not #386

Open Dillon214 opened 1 month ago

Dillon214 commented 1 month ago

Hello Cell2location devs,

Small question: does cell2location expect the expression matrix for scRNA reference data to be:

A. Normalized for library size and log-transformed (as is common in single-cell RNA analyses)

...or...

B. Unnormalized counts.

If the answer is B, how does cell2location account for discrepancies in library size between cells?

-Dillon

vitkl commented 1 month ago

Hi @Dillon214

It is option B. The estimated average RNA abundance for every gene and every cell type has to be on a linear counts scale for the factorisation of the spatial data to work correctly. You can find the details in the methods section of the paper and in the supplementary methods.
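
For anyone wanting to check their own reference before training, here is a minimal sketch in plain Python. It is not part of cell2location: the helper name `looks_like_raw_counts` and the toy matrices are hypothetical and made up for illustration. It tests whether a matrix holds only non-negative whole numbers, and then shows with two toy cells why an average taken on the log scale does not give back the linear-scale average.

```python
import math


def looks_like_raw_counts(matrix):
    """Return True if every entry is a non-negative whole number.

    `matrix` is any iterable of rows of numbers (hypothetical helper,
    not a cell2location function).
    """
    for row in matrix:
        for value in row:
            v = float(value)
            # NaN and inf fail is_integer(), so they are rejected too.
            if v < 0 or not v.is_integer():
                return False
    return True


# Toy data (cells x genes), invented for this example.
raw = [[0, 3, 12], [5, 0, 1]]
log_normalized = [[0.0, 1.386, 2.565], [1.792, 0.0, 0.693]]

print(looks_like_raw_counts(raw))             # True
print(looks_like_raw_counts(log_normalized))  # False

# One gene observed in two cells of the same type: 0 and 8 counts.
counts = [0, 8]
linear_mean = sum(counts) / len(counts)
log_mean = sum(math.log1p(c) for c in counts) / len(counts)
back_transformed_log_mean = math.expm1(log_mean)

print(linear_mean)                # 4.0
print(back_transformed_log_mean)  # about 2, not 4
```

A passing check only says the values look like counts; it cannot tell whether they were filtered or otherwise altered upstream. If the matrix is a NumPy array or sparse matrix, converting it to a dense nested structure first (or checking the stored values directly) should work the same way, but that part is an assumption about your setup.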