jaclyn-taroni closed 3 years ago
I don't think we have a high enough N for this, as we don't have a high enough N even for the broad histology groups. I wish! :) But that is why I stuck with the plot of all cancers together: we do see expected co-occurrences in that plot that we can describe.
Was thinking about Low-grade glioma astrocytoma, Medulloblastoma, High-grade glioma astrocytoma, and maybe Ependymoma and Diffuse midline glioma (based on numbers here: https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/1174#issuecomment-915372913), but you're right, the other plots in analyses/interaction-plots/plots/ do look bleak re: N. (I neglected to take a look at those before filing.)
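The "is N high enough?" question above can be checked quickly before any plotting. Below is a minimal Python sketch, not the project's code: the sample table is invented, and the threshold `MIN_N` and the helper name `groups_with_enough_samples` are hypothetical. Only the `cancer_group` column name comes from this thread.

```python
from collections import Counter

# Hypothetical (sample_id, cancer_group) pairs. These values are invented for
# illustration and are NOT real OpenPBTA counts.
samples = [
    ("S1", "Medulloblastoma"),
    ("S2", "Medulloblastoma"),
    ("S3", "Medulloblastoma"),
    ("S4", "Ependymoma"),
    ("S5", "Ependymoma"),
    ("S6", "Diffuse midline glioma"),
]

# Assumed cutoff; a real analysis would need a much larger, justified value.
MIN_N = 3


def groups_with_enough_samples(samples, min_n):
    """Return {cancer_group: n_samples} for groups with at least min_n samples."""
    counts = Counter(group for _, group in samples)
    return {group: n for group, n in counts.items() if n >= min_n}


eligible = groups_with_enough_samples(samples, MIN_N)
print(eligible)
```

Groups that fall below the cutoff would simply be left out of any per-group panel, which is the situation described above for most of the existing plots.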
Okay, happy to close this. I am intrigued about ggpattern and the possibility of using it to add on molecular_subtype info somewhere.
Context & idea
Right now the output of interaction-plots looks like this: https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/4240cc64ffcce2607717be63ab32acb81cbd299b/analyses/interaction-plots/plots/combined_top50.png
The most important point I want to make about that plot is that all samples are combined in this panel and it uses the top 50 genes (with some FLAGS filtering, if I recall correctly). There was originally an idea to use a different (more curated) gene list but plot all samples together (#1001). My idea with this issue is to go in another direction entirely: split up the interaction plots by cancer_group.
We ended up including cancer_group in part because of #917. I'll quote from the initial post on that issue:
One concern about using gene lists with the interaction plots is that we'd end up replicating a lot of the same information that's in the oncoprints (https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/1001#issuecomment-819547278).
Splitting up by cancer_group will allow us to add information over and above the oncoprints we expect to include in the main text – namely, we can include molecular_subtype (or harmonized_diagnosis if deemed more appropriate) as I'll show in my sketch below – regardless of whether we use a "top n" or gene list approach.
Then we could assemble the bar plot-tile plot pairs for individual cancer_group values into a multipanel figure, which will itself likely end up as a single panel in a larger multipanel figure.
Sketch of idea
Big thing to note is the use of cross-hatching to indicate whatever narrower category we'd like to use (e.g., molecular_subtype). I think we can do this with ggpattern.
Next steps
Before I invest time in this, I thought I'd get input from others.
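To make the proposal easier to discuss, here is a rough Python sketch of what "split by cancer_group" means for the co-occurrence counts behind each tile. It is an illustration under stated assumptions, not the interaction-plots implementation: the records, gene names (`GENE_A`, `GENE_B`), and the function `cooccurrence_tables` are all hypothetical, and it stops at the 2x2 counts without running any statistical test.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (sample_id, cancer_group, gene) mutation records, invented for
# illustration only.
mutations = [
    ("S1", "Medulloblastoma", "GENE_A"),
    ("S1", "Medulloblastoma", "GENE_B"),
    ("S2", "Medulloblastoma", "GENE_A"),
    ("S3", "Ependymoma", "GENE_A"),
    ("S3", "Ependymoma", "GENE_B"),
    ("S4", "Ependymoma", "GENE_B"),
]

# All samples per group, including samples with no mutation in these genes
# (S5 here); they are needed for the "neither" cell.
samples_by_group = {
    "Medulloblastoma": {"S1", "S2", "S5"},
    "Ependymoma": {"S3", "S4"},
}


def cooccurrence_tables(mutations, samples_by_group):
    """Per cancer_group and gene pair, count (both, only_a, only_b, neither)."""
    mutated = defaultdict(lambda: defaultdict(set))  # group -> gene -> samples
    for sample, group, gene in mutations:
        mutated[group][gene].add(sample)

    tables = {}
    for group, all_samples in samples_by_group.items():
        for gene_a, gene_b in combinations(sorted(mutated[group]), 2):
            with_a, with_b = mutated[group][gene_a], mutated[group][gene_b]
            tables[(group, gene_a, gene_b)] = (
                len(with_a & with_b),                # both genes mutated
                len(with_a - with_b),                # only gene_a mutated
                len(with_b - with_a),                # only gene_b mutated
                len(all_samples - with_a - with_b),  # neither mutated
            )
    return tables


for key, counts in cooccurrence_tables(mutations, samples_by_group).items():
    print(key, counts)
```

Each 2x2 table is computed within one cancer_group only, so the cells shrink quickly as groups get smaller. That is the practical cost of splitting, and it is worth weighing before investing in the figure.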