Closed jharenza closed 2 years ago
Got it!
Question about setting this up for @jaclyn-taroni - These plots use sql dbs which have to be created in scratch/ and this takes a decent amount of time.
Two options here -
bash run_caller_consensus_analysis-pbta.sh and bash run_caller_consensus_analysis-tcga.sh, and write updated plotting code in a script in figures/
Is there a general preference for one of the options, given the scratch/ needs and runtime considerations?
I'd probably do the second option just to make local development (ever so slightly) easier. If you end up running the bash scripts every time you want to make tweaks to the figure, that seems less than ideal. So you'd run the bash scripts in the overall figure generation script as you suggest, but you could then develop whatever you have in figures/scripts with the assumption that those have been run. This is mostly a conceptual difference, because you wouldn't necessarily run the bash scripts to make every tweak, but doing so when submitting a PR on that module does seem like a good approach to development.
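One way to get the "run the bash scripts once, then iterate on the plotting code" behavior is a small guard in the figure generation script. This is only a sketch: the helper name run_if_missing and the sentinel file paths in the commented calls are assumptions, not names from the repository.

```shell
# Run a command only when its expected output file is missing.
# Usage: run_if_missing <sentinel_file> <command> [args...]
run_if_missing() {
  local sentinel="$1"
  shift
  if [ -e "$sentinel" ]; then
    echo "Found $sentinel, skipping: $*"
  else
    echo "Missing $sentinel, running: $*"
    "$@"
  fi
}

# Example calls. The database file names are hypothetical placeholders
# for whatever the consensus scripts actually write to scratch/:
# run_if_missing scratch/pbta-db.sqlite bash run_caller_consensus_analysis-pbta.sh
# run_if_missing scratch/tcga-db.sqlite bash run_caller_consensus_analysis-tcga.sh
```

With a guard like this, rerunning the figure script after a plotting tweak skips the slow database creation, and deleting the file in scratch/ forces a full rerun.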
Noting that I will need to get on AWS to do this, so it will take a little time (~2 weeks is the plan...) before a PR is opened.
This is almost ready for a PR, but quick questions for everyone @jharenza @jaclyn-taroni because the current notebook producing the CDF plots (https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/master/analyses/tmb-compare/plots/tmb-cdf-pbta-tcga.png) is very out of date.
- It uses snv-callers/results/consensus/pbta-snv-mutation-tmb-coding.tsv and similar for TCGA. These files no longer exist. As I understand it from meandering through PBTA history, these files are now in the data release and named about the same. I am using pbta-snv-consensus-mutation-tmb-coding.tsv and tcga-snv-mutation-tmb-coding.tsv. Look right?
It turns out when rendered as PDF, this figure https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/master/analyses/snv-callers/plots/pbta-caller-comparison/pbta-vaf_cor_matrix.png is 857 MB. Given what it's plotting this size is not actually too surprising, but no matter what it's certainly too big to proceed with.
Is there a way to compress the figure before exporting from R, or can we use a TIFF (if this will be put together in Illustrator anyway)?
or can we use a tiff
TIFF works!
- It uses snv-callers/results/consensus/pbta-snv-mutation-tmb-coding.tsv and similar for TCGA. These files no longer exist. As I understand it meandering through PBTA history, these files are now in the data release and named about the same. I am using pbta-snv-consensus-mutation-tmb-coding.tsv and tcga-snv-mutation-tmb-coding.tsv. Look right?
Yes, also consistent with https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/3b5fb8d71daadc9f6c67c62f3b5eef82d2cb466e/doc/data-files-description.md#current-release-release-v21-20210820
- The current versions of the figures use sea green and lavender-ish colors. Do we want to keep those for the paper, or was that more for exploration and the blog post?
Definitely not attached to these colors. If we use cancer_group, we could use the cancer_group colors and then put the adult cancers in a darker grey.
- The code which produces the figures must have changed since this notebook was run, because it has a different style from the figures! This is specifically in how the strip labels are shown. The figures have vertical labels for short histologies (FYI, short histologies in this plot!), but the code renders horizontal facets. See attached hopefully-clear-enough sketch. I don't think this matters too much especially for an SI figure, but the horizontal labels do mean the figures generally have to be much wider with small font. Getting the code back to vertical might be the way to go because of size?
I think going back to vertical and using cancer_group instead is a good idea.
All of that means maybe we go the figures/scripts route instead of updating the module. But we should probably update the module to use the current code!
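If the plotting does move to figures/scripts, a check that the renamed data release files are actually present could fail fast before any plotting runs. A minimal sketch, assuming the data release sits in a directory passed as the first argument; the helper name check_inputs and the data/ location in the commented call are assumptions, while the two file names are the ones confirmed above.

```shell
# Report any expected input file that is missing from a data directory.
# Returns non-zero if at least one file is missing.
check_inputs() {
  local data_dir="$1"
  shift
  local missing=0
  local f
  for f in "$@"; do
    if [ ! -f "$data_dir/$f" ]; then
      echo "Missing input: $data_dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call (data/ is an assumed location for the data release):
# check_inputs data \
#   pbta-snv-consensus-mutation-tmb-coding.tsv \
#   tcga-snv-mutation-tmb-coding.tsv
```

Listing every missing file before returning, rather than stopping at the first one, makes a renamed or relocated release easier to diagnose in one pass.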
Closed with #1279
This figure is not yet represented here and panels should be added per the slide deck here.
Figures used for panels can be found:
Additional question? Do we want to update the plot theme for these supplemental figures to match the main figures? If yes, that should also be done here.
Who will complete this? @sjspielman ?
Update: added one more panel here for comparing Lancet WGS/WXS