AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project

Create panels for Figure S2 #1254

Closed jharenza closed 2 years ago

jharenza commented 2 years ago

This figure is not yet represented here and panels should be added per the slide deck here.

Figures used for panels can be found:

Additional question: do we want to update the plot theme for these supplemental figures to match the main figures? If so, that should also be done here.

Who will complete this? @sjspielman ?

Update: added one more panel here for comparing Lancet WGS/WXS

sjspielman commented 2 years ago

Got it!

sjspielman commented 2 years ago

Question about setting this up for @jaclyn-taroni - These plots use sql dbs which have to be created in scratch/ and this takes a decent amount of time.

Two options here -

Is there a general preference for one of these options, given the scratch/ needs and runtime considerations?

jaclyn-taroni commented 2 years ago

I'd probably do the second option just to make local development (ever so slightly) easier. If you end up running the bash scripts every time you want to make tweaks to the figure, that seems less than ideal. So you'd run the bash scripts in the overall figure generation script as you suggest, but you could then develop whatever you have in figures/scripts with the assumption that those have been run. This is mostly a conceptual difference, because you wouldn't necessarily run the bash scripts to make every tweak, but doing so when submitting a PR on that module does seem like a good approach to development.
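As a rough illustration of that approach (not code from the repository), an overall figure generation script could guard the slow database setup so it only runs when its output is missing. The function name `run_if_missing` and the paths in the usage comment are hypothetical placeholders, and the check assumes the setup step writes a single file whose presence means it finished.

```shell
#!/bin/sh
# Sketch only: run_if_missing and the paths in the usage comment are
# hypothetical, not names from the OpenPBTA repository.
set -eu

# Run a slow setup command only when its expected output file is absent.
# Usage: run_if_missing <output_file> <command> [args...]
run_if_missing() {
  rim_output=$1
  shift
  if [ -e "$rim_output" ]; then
    echo "Found $rim_output; skipping setup."
  else
    echo "Missing $rim_output; running: $*"
    "$@"
  fi
}

# Example call from an overall figure generation script (hypothetical paths):
#   run_if_missing scratch/example.sqlite bash path/to/setup-script.sh
```

With a guard like this, scripts in figures/scripts can still be developed on the assumption that the setup has already been run, and a full rebuild only needs the output file to be deleted first. An existence check cannot tell whether the file is stale or was left half-written by an interrupted run, so it is a convenience for local iteration rather than a correctness guarantee.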

sjspielman commented 2 years ago

Noting that I will need to get on AWS to do this, so it will take a little time (~2 weeks is the plan...) before a PR is opened.

sjspielman commented 2 years ago

This is almost ready for a PR, but quick questions for everyone @jharenza @jaclyn-taroni because the current notebook producing the CDF plots (https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/master/analyses/tmb-compare/plots/tmb-cdf-pbta-tcga.png) is very out of date.

IMG_4587.pdf

sjspielman commented 2 years ago

It turns out when rendered as PDF, this figure https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/master/analyses/snv-callers/plots/pbta-caller-comparison/pbta-vaf_cor_matrix.png is 857 MB. Given what it's plotting this size is not actually too surprising, but no matter what it's certainly too big to proceed with.

jharenza commented 2 years ago

It turns out when rendered as PDF, this figure https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/master/analyses/snv-callers/plots/pbta-caller-comparison/pbta-vaf_cor_matrix.png is 857 MB. Given what it's plotting this size is not actually too surprising, but no matter what it's certainly too big to proceed with.

Is there a way to do compression before export from R, or can we use a tiff (if this will be put together in illustrator anyway?)

sjspielman commented 2 years ago

or can we use a tiff

TIFF works!

jaclyn-taroni commented 2 years ago

  • It uses snv-callers/results/consensus/pbta-snv-mutation-tmb-coding.tsv and similar for TCGA. These files no longer exist. As I understand it from meandering through PBTA history, these files are now in the data release and named about the same. I am using pbta-snv-consensus-mutation-tmb-coding.tsv and tcga-snv-mutation-tmb-coding.tsv. Look right?

Yes, also consistent with https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/3b5fb8d71daadc9f6c67c62f3b5eef82d2cb466e/doc/data-files-description.md#current-release-release-v21-20210820

  • The current versions of the figures use sea green and lavender -ish colors. Do we want to keep those for the paper, or was that more for exploration and blog post?

Definitely not attached to these colors. If we use cancer_group, we could use the cancer_group colors and then put the adult cancers in a darker grey.

  • The code which produces the figures must have changed since this notebook was run, because it has a different style from the figures! This is specifically in how the strip labels are shown. The figures have vertical labels for short histologies (FYI, short histologies in this plot!), but the code renders horizontal facets. See attached hopefully-clear-enough sketch. I don't think this matters too much especially for an SI figure, but the horizontal labels do mean the figures generally have to be much wider with small font. Getting the code back to vertical might be the way to go because of size?

I think going back to vertical and using cancer_group instead is a good idea.

All of that means maybe we go the figures/scripts route instead of updating the module. But we should probably update the module to use the current code!

sjspielman commented 2 years ago

Closed with #1279