BayraktarLab / cell2location

Comprehensive mapping of tissue cell architecture via integrated single cell and spatial transcriptomics (cell2location model)
https://cell2location.readthedocs.io/en/latest/
Apache License 2.0

Cell type proportion per spot overrepresenting transcriptionally active cell types? #197

Open mgrantpeters opened 2 years ago

mgrantpeters commented 2 years ago

Please use the template below to post a question to https://discourse.scverse.org/c/ecosytem/cell2location/.

Problem

I would like to estimate the proportion of cell types in each spot. I have the generated cell abundance metric; however, certain cell types that are more transcriptionally active appear to be overrepresented in my proportions. In the context of brain, this comes across as more glial cells (astrocytes, microglia) and very few neurons. Your recommendation in the discussion was to do this _"by taking cell abundance of all cell types (as in the tutorial plotting section), computing total cell abundance per location, and dividing values of individual cell types"_. I wanted to check I understood this correctly: for a given spot, I summed the confidence values of all cell types, then took the ratio of the confidence of one cell type, e.g. B plasma, to this sum (confidence B plasma / confidence total cell types). Is this the correct approach? If so, does this not guarantee that certain cell types will be overrepresented, given the significantly distinct scale bar for each cell type (as made evident in your tutorial)?
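For concreteness, the calculation described above can be sketched as follows. This is a minimal illustration on made-up numbers, not cell2location output: the `abundance` table, the spot names, and the helper `per_spot_proportions` are all hypothetical, and it assumes every spot has a non-zero total.

```python
# Hypothetical toy values: estimated cell abundance per spot and cell type.
abundance = {
    "spot_1": {"Astrocyte": 4.0, "Microglia": 2.0, "Neuron": 2.0},
    "spot_2": {"Astrocyte": 1.0, "Microglia": 1.0, "Neuron": 2.0},
}


def per_spot_proportions(abundance):
    """Divide each cell type's abundance by the spot's total abundance.

    Assumes every spot has a total abundance greater than zero.
    """
    proportions = {}
    for spot, by_type in abundance.items():
        total = sum(by_type.values())  # total cell abundance in this spot
        proportions[spot] = {ct: value / total for ct, value in by_type.items()}
    return proportions


proportions = per_spot_proportions(abundance)
for spot, by_type in proportions.items():
    print(spot, by_type)
```

If the abundances are held in a pandas DataFrame with spots as rows and cell types as columns, the same operation is `abundance.div(abundance.sum(axis=1), axis=0)`.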

Description of the data input and hyperparameters

No issues running c2l, just a question relating to interpretation.

post mortem brain samples

Single cell reference data: number of cells, number of cell types, number of genes

snRNAseq (10x 3 prime, approx 16 cell types)

Single cell reference data: technology type (e.g. mix of 10X 3' and 5')

10X 3'

Spatial data: number of locations numbers, technology type (e.g. Visium, ISS, Nanostring WTA)

Visium

vitkl commented 2 years ago

Hi @mgrantpeters

While transcriptional activity indeed affects how the method works, it would be a fair assumption that transcriptionally active cells lead to more RNA in both snRNA-seq and Visium. In our experience, neurones always had more RNA than glial cells (both human and mouse), so your result is surprising. You are not looking at the proportion but at the absolute estimate of cell abundance. Normalising by the total per spot doesn't change the ratios between cell types, so it does not address this in any way. You can always consider the spatial distribution of every cell type independently, without comparing which cell types are more or less abundant. There could be technical issues affecting the mapping, such as low quality of the Visium data, tissue attachment-induced artefacts, and insufficient granularity of cell annotations (we generally recommend going as granular as possible, and 16 clusters sounds very broad for postmortem brains).
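The point about normalisation can be checked on a toy example. The sketch below uses made-up abundance values for a single hypothetical spot (not output from cell2location) and compares the astrocyte-to-neuron ratio before and after dividing by the spot total.

```python
import math

# Hypothetical abundance estimates for one spot (illustrative values only).
spot = {"Astrocyte": 6.0, "Microglia": 3.0, "Neuron": 1.5}

# Normalise by the total abundance in the spot to obtain proportions.
total = sum(spot.values())
proportion = {cell_type: value / total for cell_type, value in spot.items()}

# Ratio between two cell types, before and after normalisation.
ratio_before = spot["Astrocyte"] / spot["Neuron"]
ratio_after = proportion["Astrocyte"] / proportion["Neuron"]

# Both values share the same denominator, so it cancels in the ratio.
print(ratio_before, ratio_after, math.isclose(ratio_before, ratio_after))
```

Because every cell type in a spot is divided by the same total, the total cancels in any within-spot ratio, so whichever cell type dominates the absolute estimates also dominates the proportions.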