Compare spot-deconvolution vs IF cell counts

lcolladotor commented 2 years ago

This will be done only with the n = 4 Visium IF samples. It will depend on results from several other issues https://github.com/LieberInstitute/spatialDLPFC/issues?q=is%3Aissue+is%3Aopen+label%3Aspot-deconvolution

We still need to decide exactly how this comparison will be done and what the resulting main figure will look like (it will be a panel in one of the main figures of the paper), plus potentially some supplementary figures.

lcolladotor commented 2 years ago

Louise and I could help with ideas, but now @Nick-Eagles will be doing this.

lcolladotor commented 2 years ago

One option is to make scatterplots with the proportion observed in the IF data on the x-axis and the proportion estimated from the spot deconvolution results (from #128) on the y-axis. That would be like https://speakerdeck.com/lcolladotor/psychgenomics-2022?slide=36 or the slide after it. However, with only 4 points (4 Visium IF samples) per cell type for a particular deconvolution method, there isn't much data to judge whether things are "closer to the diagonal" or not.

We could compute an RMSE (root mean squared error) between the observed values (spot deconvolution results) and the expected values (proportions from the IF data). That RMSE, though, would again be based on only the 4 points from the scatterplot above.
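
As a minimal sketch of that RMSE, assuming the sample-level proportions for one cell type and one deconvolution method are already available as plain dictionaries. The sample IDs and numbers below are made up for illustration and are not results:

```python
import math

def rmse(observed, expected):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(expected) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - e) ** 2 for o, e in zip(observed, expected)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical proportions of one cell type in each of the 4 Visium IF samples.
if_prop = {"s1": 0.20, "s2": 0.25, "s3": 0.30, "s4": 0.22}       # expected (IF)
deconv_prop = {"s1": 0.18, "s2": 0.27, "s3": 0.33, "s4": 0.22}   # observed (deconvolution)

samples = sorted(if_prop)
value = rmse([deconv_prop[s] for s in samples], [if_prop[s] for s in samples])
print(f"RMSE across {len(samples)} samples: {value:.4f}")
```

With only 4 samples the value rests on 4 squared differences, so it is better read as a rough ranking aid between methods than as a precise error estimate.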

At the spot level, we can check whether the proportion from the broad cell type resolution matches the proportion from the layer-level resolution (after combining the different Excit_Lxx results). That would be scatterplots paneled by sample (since we have 4) with lots of points (since we have up to about 4k spots per sample). That evaluates only the consistency of the results and would be similar to the comparison at https://speakerdeck.com/lcolladotor/psychgenomics-2022?slide=36 when we changed the number of marker genes.
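
The combining step could look roughly like this, assuming each spot's layer-level results are a dictionary of proportions keyed by cell type. The specific names (`Excit_L2_3`, `Inhib`, `Oligo`) and values are hypothetical placeholders, and the `Excit_L` prefix is an assumption about how the layer-level excitatory types are labeled:

```python
def collapse_layers(spot, prefix="Excit_L", broad_name="Excit"):
    """Sum all layer-level entries starting with `prefix` into one broad entry."""
    collapsed = {}
    for cell_type, value in spot.items():
        key = broad_name if cell_type.startswith(prefix) else cell_type
        collapsed[key] = collapsed.get(key, 0.0) + value
    return collapsed

# Hypothetical proportions for a single spot at both resolutions.
layer_level = {"Excit_L2_3": 0.10, "Excit_L5": 0.25, "Excit_L6": 0.05,
               "Inhib": 0.20, "Oligo": 0.40}
broad_level = {"Excit": 0.38, "Inhib": 0.22, "Oligo": 0.40}

collapsed = collapse_layers(layer_level)
for cell_type in sorted(broad_level):
    # Each (broad, collapsed) pair would be one point in the per-sample scatterplot.
    print(cell_type, broad_level[cell_type], round(collapsed[cell_type], 3))
```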

Spatially, we could tell from the H&E staining where the GM vs WM boundary is, as well as the orientation of L1 through L6. So for each method we could plot the number of cells (one cell type at a time) spatially and see whether the spatial pattern is "better" for one deconvolution method than for another. We can do that with http://research.libd.org/spatialLIBD/reference/vis_grid_gene.html.

We likely need to think more about this.

Summary:

[x] scatterplots for each cell type and each deconvolution method between the IF proportions and the deconvolution proportions
[x] scatterplots for each cell type and each deconvolution method at the spot level (paneled by sample) between the broad cell type and layer-level spot deconvolution results.
[x] vis_grid_gene plots of the deconvolution results for each cell type and for each deconvolution method.

lcolladotor commented 2 years ago

We talked about how for cell2location and Tangram we want to compare the total counts (or total abundance) vs the IF total counts, doing a scatterplot at the spot level and paneling by the 4 Visium-IF samples.

Then we'll also make scatterplots for each cell type, paneled by sample, for the count and the proportion.
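
A rough sketch of the data preparation for these plots, assuming per-spot counts are stored as dictionaries keyed by (sample, barcode). All sample IDs, barcodes, cell types, and numbers here are invented for illustration:

```python
from collections import defaultdict

def total_and_proportions(counts):
    """Return the total count and per-cell-type proportions for one spot."""
    total = sum(counts.values())
    if total == 0:
        # Proportions are undefined for a spot with no cells.
        return total, {cell_type: None for cell_type in counts}
    return total, {cell_type: n / total for cell_type, n in counts.items()}

# Hypothetical per-spot counts keyed by (sample, spot barcode).
if_counts = {
    ("s1", "AAAC"): {"Astro": 1, "Neuron": 3},
    ("s1", "AAAG"): {"Astro": 0, "Neuron": 0},
    ("s2", "AAAC"): {"Astro": 2, "Neuron": 2},
}
deconv_counts = {
    ("s1", "AAAC"): {"Astro": 1.5, "Neuron": 2.5},
    ("s1", "AAAG"): {"Astro": 0.2, "Neuron": 0.3},
    ("s2", "AAAC"): {"Astro": 1.0, "Neuron": 4.0},
}

# One (IF total, deconvolution total) point per spot, grouped by sample for paneling.
points_by_sample = defaultdict(list)
for (sample, barcode), counts in if_counts.items():
    if_total, _ = total_and_proportions(counts)
    deconv_total, _ = total_and_proportions(deconv_counts[(sample, barcode)])
    points_by_sample[sample].append((if_total, deconv_total))

for sample, points in sorted(points_by_sample.items()):
    print(sample, points)
```

The same helper gives the per-cell-type proportions for the second set of scatterplots. Spots with zero IF cells return `None` proportions and would need to be dropped or handled explicitly before plotting.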

lcolladotor commented 2 years ago

Per cell type, make a scatterplot of the mean/median expression of the up to 25 mean ratio marker genes vs the proportion or count for that cell type, paneled by sample. Annotate with the correlation.

We would expect to see a positive association between the 2 variables.
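
To illustrate the annotation, here is a small sketch that averages marker expression per spot and computes a Pearson correlation against the estimated proportion. Pearson is an assumption (the thread does not say which correlation to use), and the gene names and values are placeholders:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical marker genes for one cell type (the real list has up to 25).
marker_genes = ["geneA", "geneB"]

# Hypothetical spots from one sample: marker expression and estimated proportion.
spots = [
    {"expr": {"geneA": 0.0, "geneB": 1.0}, "prop": 0.05},
    {"expr": {"geneA": 2.0, "geneB": 2.0}, "prop": 0.30},
    {"expr": {"geneA": 4.0, "geneB": 3.0}, "prop": 0.45},
    {"expr": {"geneA": 5.0, "geneB": 5.0}, "prop": 0.70},
]

mean_expr = [sum(s["expr"][g] for g in marker_genes) / len(marker_genes) for s in spots]
props = [s["prop"] for s in spots]
r = pearson(mean_expr, props)
print(f"Pearson r = {r:.3f}")
```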

We could color each spot by the RMSE vs IF.

lcolladotor commented 2 years ago

vis_grid_gene() plot of the RMSEs

Nick-Eagles commented 1 year ago

I've produced these plots and others that we've discussed here and here. Closing because although we may continue to tweak plots slightly, all the fundamental IF plots should be complete now.

Compare spot-deconvolution vs IF cell counts #99