LieberInstitute / spatialDLPFC

spatialDLPFC project involving Visium (n = 30), Visium SPG (n = 4) and snRNA-seq (n = 19) samples
http://research.libd.org/spatialDLPFC/

Explore deconvolution marker genes (mean ratio) #127

Closed lcolladotor closed 1 year ago

lcolladotor commented 2 years ago

After identifying the mean ratio marker genes at https://github.com/LieberInstitute/DLPFC_snRNAseq/issues/7#issuecomment-1232064775 for both the broad cell type and layer level resolutions, we want to explore them. This exploration will help us decide whether we can use the top 25 mean ratio genes for each cluster/group or whether, for some clusters/groups, we have to use fewer of them.

In the past, we decided some of this by looking at plots like https://speakerdeck.com/lcolladotor/psychgenomics-2022?slide=24 that show the mean ratio on the x-axis and the log fold change from the Tran et al. 1 vs. all strategy on the y-axis. With that in mind, once https://github.com/LieberInstitute/DLPFC_snRNAseq/issues/7 is done, we can make those same scatterplots again. @lahuuki can either do this or point you to the code for this part. It might be at https://github.com/LieberInstitute/deconvolution_bsp2, but Louise can verify where it's located.

Another set of plots is a PDF with 1 page per marker gene, containing violin plots like the ones at https://speakerdeck.com/lcolladotor/psychgenomics-2022?slide=22. Louise has updated code that makes plots like https://github.com/LieberInstitute/DLPFC_snRNAseq/blob/main/plots/05_explore_sce/06_explore_azimuth_annotations/Azimuth_basic_mathys_markers.pdf. Unlike the ones you see on the slide deck, her code doesn't show a point for every nucleus, so the plots are faster to make and open. These plots would help us visually check how well each marker gene is doing.

A third plot is to explore the mean ratio metric by itself, that is, the x-axis from the first set of scatterplots. We could make boxplots with the mean ratio on the y-axis and the different clusters/groups on the x-axis. This will help us see if the mean ratio is lower for a particular cluster/group, and it could narrow down which of the violin plots we want to focus on more.
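As a rough illustration of that third idea, here is a minimal Python sketch (not the project's actual code, which is in R) that keeps the top N genes per cluster by mean ratio and reports the min, median and max that a boxplot would display. The function names, the `(cluster, gene, mean_ratio)` record layout, the cluster and gene labels, and all numeric values are hypothetical and only serve the example.

```python
from collections import defaultdict
from statistics import median


def top_n_by_cluster(records, n=25):
    """Group (cluster, gene, mean_ratio) records by cluster and keep
    the n genes with the highest mean ratio in each cluster."""
    by_cluster = defaultdict(list)
    for cluster, gene, ratio in records:
        by_cluster[cluster].append((gene, ratio))
    return {
        cluster: sorted(genes, key=lambda g: g[1], reverse=True)[:n]
        for cluster, genes in by_cluster.items()
    }


def summarize_mean_ratio(top_genes):
    """Return the min, median and max mean ratio per cluster,
    i.e. a numeric stand-in for the proposed boxplots."""
    summary = {}
    for cluster, genes in top_genes.items():
        ratios = [ratio for _, ratio in genes]
        summary[cluster] = {
            "n": len(ratios),
            "min": min(ratios),
            "median": median(ratios),
            "max": max(ratios),
        }
    return summary


# Made-up values for illustration only; real input would come from
# the mean ratio marker gene tables referenced above.
records = [
    ("Astro", "geneA", 12.0), ("Astro", "geneB", 8.0), ("Astro", "geneC", 3.0),
    ("Excit", "geneD", 1.8), ("Excit", "geneE", 1.2), ("Excit", "geneF", 1.1),
]

top = top_n_by_cluster(records, n=2)
summary = summarize_mean_ratio(top)

# List clusters from lowest to highest median mean ratio, so the
# clusters that may need fewer than the top 25 genes appear first.
for cluster, stats in sorted(summary.items(), key=lambda kv: kv[1]["median"]):
    print(cluster, stats)
```

Sorting by the median puts the clusters/groups with the weakest markers first, which are the ones whose violin plots would deserve a closer look.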


Summary:

For both the broad cell type and layer level resolutions make: