Interpretation of the deconvolution results

Zhuang-Bio commented 2 years ago

[x] I followed the instructions from the cell2location tutorial (based on scvi-tools).
[x] I have adjusted the required hyperparameters (N_cells_per_location and detection_alpha) to my dataset and tissue.
[x] I have provided 10X reaction/inlet as batch_key for reference NB regression.
[x] I have checked scverse Discourse and old Cell2location Community Forum, and did not find a solution.

Description of the data input and hyperparameters

Single cell reference data: number of cells, number of cell types, number of genes

~20,000 cells, 26 clusters, ~5,000 marker genes ...

Single cell reference data: technology type (e.g. mix of 10X 3' and 5')

10X 3' v3 ...

Spatial data: number of locations, technology type (e.g. Visium, ISS, Nanostring WTA)

Visium

Question

Hi, thank you very much for this awesome tool. I followed the pipeline and everything went well. For interpreting the results, I understand it is best to use 'q05_cell_abundance_w_sf' as the deconvolution result to project onto the H&E images. But what do the confident cell abundance values for each cell type mean? I summed the values in each row, and the totals differ considerably between spots, which confuses me. More specifically, I have 26 clusters, as shown below. Is it reasonable to calculate a relative cell type proportion for each spot (e.g. relative proportion of Basal-I in spot 1 = 0.0788311/5.16249044)? If the values in each spot summed to 1, the results might be easier to understand, and we could compare different cell types directly.

cell2location results

Spatial_SpotID | Basal-I | Basal-II | Basal-III | DC | DC-LAMP | Dermal-DC | FB-I | FB-II | FB-III | FB-IV | Granular-I | Granular-II | LC | LE | MEL | Mast-cell | Mono-Mac | NK-cell | PC-vSMC | Plasma-B-cell | Schwann | Spinous-I | Spinous-II | Spinous-III | Th | VE | Total
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
AAACAAGTATCTCCCA-1 | 0.0788311 | 0.01183488 | 0.0013773 | 0.0066458 | 0.00330211 | 0.00185754 | 0.82047006 | 0.06403197 | 3.31677738 | 0.00127522 | 0.00188847 | 0.02291755 | 0.10469464 | 0.12440987 | 0.02615497 | 0.05246331 | 0.04863389 | 0.01683048 | 0.03649302 | 0.00685662 | 0.00687596 | 0.2401821 | 0.10762781 | 0.0053093 | 0.03440755 | 0.02034153 | 5.16249044
AAACATTTCCCGGATT-1 | 0.00802813 | 0.00075149 | 0.0022309 | 0.00468953 | 0.00162133 | 0.00118663 | 0.31470423 | 0.02221218 | 1.92233016 | 0.00103692 | 0.00336535 | 0.00105805 | 0.0251716 | 0.01604621 | 0.02824159 | 0.06488496 | 0.02648281 | 0.00788521 | 0.07540033 | 0.00343071 | 0.02093353 | 0.00931369 | 0.00423436 | 0.00128129 | 0.01348818 | 0.00388846 | 2.58389782
AAACCTAAGCAGCCGG-1 | 0.11836652 | 0.01075403 | 0.02144423 | 0.10995266 | 0.09720989 | 0.05960494 | 0.00045867 | 0.43426276 | 0.01384732 | 0.02475756 | 0.00795721 | 0.00277988 | 0.31790655 | 0.21364005 | 1.84851403 | 0.09111738 | 0.12857315 | 0.14189067 | 1.42056299 | 0.1412995 | 0.54924091 | 0.13600109 | 0.12423333 | 5.46599512 | 0.66547016 | 0.68631223 | 12.8321528
AAACGAGACGGTTGAT-1 | 0.04039945 | 0.00090433 | 0.00193628 | 0.00361877 | 0.00120299 | 0.00109853 | 0.18939618 | 0.03465241 | 3.35178006 | 0.00105741 | 0.00070539 | 0.05037835 | 0.04513825 | 0.00694101 | 0.02586368 | 0.0677819 | 0.02174506 | 0.01301383 | 0.00370227 | 0.00310089 | 0.00597889 | 0.06166198 | 0.01398668 | 0.0010241 | 0.01857916 | 0.00215477 | 3.96780262

That would be similar to the output of Stereoscope, where the values for each spot sum to 1.

Stereoscope results

Spot_ID | B.cell | Basal.I | Basal.II | Basal.III | DC | FB.I | FB.II | FB.III | FB.IV | Granular.I | Granular.II | LC | LE | Mac | Mast.cell | MEL | Mono.DC | Mono.Mac | NK.cell | PC.vSMC | Schwann | Spinous.I | Spinous.II | Spinous.III | Th | VE | Total
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
AAACAAGTATCTCCCA-1 | 0.000000752 | 0.000000838 | 0.000001113 | 0.000001819 | 0.000000642 | 0.510354700 | 0.000000610 | 0.237905950 | 0.000002362 | 0.062628990 | 0.007137010 | 0.000000480 | 0.017056220 | 0.059652492 | 0.000000186 | 0.105165580 | 0.000081452 | 0.000001618 | 0.000000542 | 0.000000234 | 0.000001415 | 0.000000557 | 0.000000925 | 0.000001599 | 0.000000716 | 0.000001369 | 1.00
AAACATTTCCCGGATT-1 | 0.000000579 | 0.010616324 | 0.000000520 | 0.000002530 | 0.023446321 | 0.352238830 | 0.000001269 | 0.426972480 | 0.000000769 | 0.000000404 | 0.000000312 | 0.000000608 | 0.000000417 | 0.000000848 | 0.011878287 | 0.134177450 | 0.000000730 | 0.000001426 | 0.000000382 | 0.040651760 | 0.000001584 | 0.000000236 | 0.000000308 | 0.000000356 | 0.000000377 | 0.000004968 | 1.00
AAACCTAAGCAGCCGG-1 | 0.000000463 | 0.000000383 | 0.000000227 | 0.000001531 | 0.000000135 | 0.000000080 | 0.044569973 | 0.000000239 | 0.000000166 | 0.000000201 | 0.000000698 | 0.000000181 | 0.000000206 | 0.000000749 | 0.006253038 | 0.398368750 | 0.000000274 | 0.000000175 | 0.000005720 | 0.083124240 | 0.000021626 | 0.000000115 | 0.000000109 | 0.000000290 | 0.000004656 | 0.467645820 | 1.00

Thank you in advance!

Cheers, Zhuang

yxiaobme commented 2 years ago

I have the same question! Can the abundance values of each cell type be compared directly? What exactly do those values represent?

vitkl commented 2 years ago

Hi @Zhuang-Bio @yxiaobme

The 'q05_cell_abundance_w_sf' deconvolution result can be interpreted as cell abundance, or cell density, per location: the expected number of cells, including fractional values for incompletely captured cells.

In contrast to "cell proportions" per location, cell abundance doesn't assume that the total number of cells is identical across locations. Normalising to the total per spot doesn't simplify comparisons of cell abundance between cell types in any way. A caveat is that both of these measures are approximate: the spatial distribution of a cell type is more robust than differences between cell types in either cell abundance or cell proportion. We generally see that highly abundant cell types do have higher 'q05_cell_abundance_w_sf' (for example, naive CD4 T cells in the tutorial are more abundant than FDC cells). However, any differences in abundance between cell types need to be validated with other technologies.
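
To make the difference between abundance and proportion concrete, here is a minimal sketch using the Basal-I values and row totals of two spots from the cell2location table in the question. It assumes only NumPy; the variable names are illustrative and are not part of the cell2location API.

```python
import numpy as np

# Values copied from the cell2location table posted in the question.
spots = ["AAACAAGTATCTCCCA-1", "AAACCTAAGCAGCCGG-1"]
basal_i = np.array([0.0788311, 0.11836652])   # Basal-I abundance per spot
total = np.array([5.16249044, 12.8321528])    # sum over all 26 cell types per spot

# Per-spot proportion: abundance divided by the spot total.
proportion = basal_i / total

# Which spot ranks highest depends on the measure used.
top_by_abundance = spots[int(np.argmax(basal_i))]
top_by_proportion = spots[int(np.argmax(proportion))]

print("Highest Basal-I abundance: ", top_by_abundance)
print("Highest Basal-I proportion:", top_by_proportion)
```

In this example the spot with the higher Basal-I abundance is not the spot with the higher Basal-I proportion, because its total abundance is roughly 2.5 times larger. Normalising each spot to 1 discards that difference in total cell density between locations.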

Zhuang-Bio commented 2 years ago

@vitkl Thanks for your clear explanation.

pumpkin-hat commented 1 year ago

@vitkl Hi, thanks for your explanation. Is there a cutoff for 'q05_cell_abundance_w_sf' that can be used to filter the table and keep only high-quality results? I got the annotation, but the abundance of many cell types is very low (less than 0.04). Looking forward to your reply.
