Open mgrantpeters opened 2 years ago
Hi @mgrantpeters
While transcriptional activity does affect how the method works, it is fair to assume that transcriptionally active cells contribute more RNA in both the snRNA-seq reference and Visium. In our experience, neurones always had more RNA than glial cells (in both human and mouse), so your result is surprising. Note that you are looking at the absolute estimate of cell abundance, not at a proportion. Normalising by the total per spot doesn't change the ratios between cell types within a spot, so it does not address this in any way. You can always consider the spatial distribution of every cell type independently, without comparing which cell types are more or less abundant. There could also be technical issues affecting the mapping, such as low quality of the Visium data, tissue attachment-induced artefacts, and insufficient granularity of cell annotations (we generally recommend going as granular as possible, and 16 clusters sounds very broad for postmortem brain).
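To illustrate the two points about normalisation and per-cell-type inspection, here is a minimal NumPy sketch. The abundance matrix and the cell type names are made-up values for illustration only, not output from cell2location. It checks that dividing by the per-spot total leaves within-spot ratios unchanged, and shows one possible way (an assumption, not a prescribed method) to look at each cell type on its own scale.

```python
import numpy as np

# Hypothetical abundance estimates, shape (n_spots, n_cell_types).
# Values and cell type names are illustrative only.
cell_types = ["Astrocytes", "Microglia", "Neurons"]
abundance = np.array([
    [4.0, 1.0, 2.0],
    [1.0, 1.0, 4.0],
    [0.5, 0.5, 1.0],
])

# Normalise by the total abundance in each spot.
proportions = abundance / abundance.sum(axis=1, keepdims=True)

# The astrocyte-to-neuron ratio within each spot is identical before and
# after normalisation, because both values are divided by the same total.
ratio_before = abundance[:, 0] / abundance[:, 2]
ratio_after = proportions[:, 0] / proportions[:, 2]
print(np.allclose(ratio_before, ratio_after))

# Per-cell-type view: rescale each column by its own maximum across spots,
# so each cell type is compared only with itself across locations.
relative = abundance / abundance.max(axis=0, keepdims=True)
```

The `relative` matrix is only meant for inspecting where each cell type is enriched; it says nothing about which cell type is more abundant than another.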
Please use the template below to post a question to https://discourse.scverse.org/c/ecosytem/cell2location/.
Problem
I would like to estimate the proportion of cell types in each spot. I have generated the cell abundance metric; however, certain cell types that are more transcriptionally active appear to be overrepresented in my proportions - in the context of brain, this comes across as more glial cells (astrocytes, microglia) and very few neurons. Your recommendation in the discussion was to _"by taking cell abundance of all cell types (as in the tutorial plotting section), computing total cell abundance per location, and dividing values of individual cell types"_ - I wanted to check I understood this correctly: for a given spot, I summed the confidence values of all cell types in that spot, then took the ratio of the confidence of one cell type, e.g. B plasma, to this sum (confidence B plasma / confidence of all cell types). Is this the correct approach? If so, does this not guarantee that certain cell types will be overrepresented, given the significantly distinct scale bar for each cell type (as made evident in your tutorial)?
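For concreteness, the calculation described above can be sketched as follows. This is a minimal example under stated assumptions: the numbers and cell type names are invented, and in a real analysis the matrix would come from the exported abundance estimates (in the tutorial these appear to live in `adata_vis.obsm`, e.g. under a key such as `q05_cell_abundance_w_sf`; treat that key as an assumption and check your own object).

```python
import numpy as np

# Hypothetical per-spot abundance estimates: rows are spots, columns are
# cell types. The values are invented for illustration.
cell_types = ["B plasma", "Astrocytes", "Neurons"]
abundance = np.array([
    [4.0, 1.0, 0.0],
    [1.0, 1.0, 2.0],
    [0.5, 0.5, 1.0],
])

# Step 1: total estimated abundance per spot (sum over all cell types).
total_per_spot = abundance.sum(axis=1, keepdims=True)

# Step 2: divide each cell type's value by the total of its spot.
proportions = abundance / total_per_spot

# Each row of `proportions` now sums to 1.
print(proportions.sum(axis=1))
```

Spots whose total is zero would produce a division by zero here, so they would need to be filtered or handled separately.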
N_cells_per_location and detection_alpha; batch_key for reference NB regression.
Description of the data input and hyperparameters
No issues running c2l, just a question relating to interpretation.
post mortem brain samples
Single cell reference data: number of cells, number of cell types, number of genes
snRNAseq (10x 3 prime, approx 16 cell types)
Single cell reference data: technology type (e.g. mix of 10X 3' and 5')
10X 3'
Spatial data: number of locations, technology type (e.g. Visium, ISS, Nanostring WTA)
Visium