Closed sjspielman closed 2 years ago
@jaclyn-taroni this is ready for a look, if not a full review.
modified: chromothripsis/results/shatterseek_results_per_chromosome.txt
modified: analyses/tp53_nf1_score/input/consensus_seg_with_status.tsv
modified: analyses/tp53_nf1_score/results/pbta-gene-expression-rsem-fpkm-collapsed.polya_classifier_scores.tsv
modified: analyses/tp53_nf1_score/results/pbta-gene-expression-rsem-fpkm-collapsed.stranded_classifier_scores.tsv
modified: analyses/tp53_nf1_score/results/polya_TP53_roc_threshold_results.tsv
modified: analyses/tp53_nf1_score/results/polya_TP53_roc_threshold_results_shuffled.tsv
modified: analyses/tp53_nf1_score/results/stranded_TP53_roc_threshold_results.tsv
modified: analyses/tp53_nf1_score/results/stranded_TP53_roc_threshold_results_shuffled.tsv
modified: analyses/tp53_nf1_score/results/tp53_altered_status.tsv
modified: analyses/tp53_nf1_score/results/tp53_scores_vs_molecular_subtype_Diffuse_astrocytic_and_oligodendroglial_tumor.tsv
modified: analyses/tp53_nf1_score/results/tp53_scores_vs_molecular_subtype_Embryonal_tumor.tsv
modified: analyses/tp53_nf1_score/results/tp53_scores_vs_molecular_subtype_Ependymal_tumor.tsv
modified: analyses/tp53_nf1_score/results/tp53_scores_vs_molecular_subtype_Low-grade_astrocytic_tumor.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_polyA_log_pca_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_polyA_log_tsne_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_polyA_log_umap_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_polyA_none_pca_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_polyA_none_tsne_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_polyA_none_umap_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_stranded_log_pca_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_stranded_log_tsne_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_stranded_log_umap_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_stranded_none_pca_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_stranded_none_tsne_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/kallisto_stranded_none_umap_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_polyA_log_pca_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_polyA_log_tsne_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_polyA_log_umap_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_polyA_none_pca_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_polyA_none_tsne_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_polyA_none_umap_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_stranded_log_pca_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_stranded_log_tsne_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_stranded_log_umap_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_stranded_none_pca_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_stranded_none_tsne_scores_aligned.tsv
modified: analyses/transcriptomic-dimension-reduction/results/rsem_stranded_none_umap_scores_aligned.tsv
The tp53_nf1_score diffs are mostly at the level of numerical tolerance, e.g. from analyses/tp53_nf1_score/input/consensus_seg_with_status.tsv:
# Old line
BS_6GV08HTE chr7 138855546 140791770 NA 0.396711 3 2 gain
# New line
BS_6GV08HTE chr7 138855546 140791770 NA 0.39671100000000004 3 2 gain
But some are not, e.g. from analyses/tp53_nf1_score/results/pbta-gene-expression-rsem-fpkm-collapsed.polya_classifier_scores.tsv:
# Old line
BS_0VXZCRJS 0.7193414470702985 0.2390159763881984 0.7694147776801358 0.7145130845110855 0.5351872429322704 0.41300529552370535
# New line
BS_0VXZCRJS 0.7193414470702985 0.2390159763881984 0.7694147776801357 0.4702415413857321 0.6709645060773562 0.43131321436036846
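The distinction drawn above, between diffs that are only floating-point noise and diffs that reflect a real change, can be checked mechanically. The following is a minimal sketch, not part of this PR: the function name `classify_row_diff` and the `rel_tol` threshold are assumptions chosen for illustration, and the rows are the ones quoted in this comment.

```python
import math

def classify_row_diff(old_line, new_line, rel_tol=1e-9):
    """Compare two whitespace-delimited rows field by field.

    Returns "identical", "tolerance" (only float noise differs),
    or "changed" (at least one field differs substantively).
    The rel_tol default is an assumed threshold, not a project setting.
    """
    old_fields = old_line.split()
    new_fields = new_line.split()
    if old_fields == new_fields:
        return "identical"
    if len(old_fields) != len(new_fields):
        return "changed"
    for old, new in zip(old_fields, new_fields):
        if old == new:
            continue
        try:
            if not math.isclose(float(old), float(new), rel_tol=rel_tol):
                return "changed"
        except ValueError:
            # Non-numeric fields that differ count as a real change.
            return "changed"
    return "tolerance"

# Rows quoted in this comment.
seg_old = "BS_6GV08HTE chr7 138855546 140791770 NA 0.396711 3 2 gain"
seg_new = "BS_6GV08HTE chr7 138855546 140791770 NA 0.39671100000000004 3 2 gain"
score_old = ("BS_0VXZCRJS 0.7193414470702985 0.2390159763881984 0.7694147776801358 "
             "0.7145130845110855 0.5351872429322704 0.41300529552370535")
score_new = ("BS_0VXZCRJS 0.7193414470702985 0.2390159763881984 0.7694147776801357 "
             "0.4702415413857321 0.6709645060773562 0.43131321436036846")

print(classify_row_diff(seg_old, seg_new))      # tolerance
print(classify_row_diff(score_old, score_new))  # changed
```

Running a check like this over every modified file would separate the files that only need a rounding fix from the ones whose results actually moved.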
For the transcriptomic reduction, the diffs are kind of all over the place and suggest the input data has changed. For example, in analyses/transcriptomic-dimension-reduction/results/kallisto_polyA_log_pca_scores_aligned.tsv, several brand new columns were added to the TSV that weren't there before, but the PCs look the same. Overall this suggests to me that this module in general needs to be re-run.
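A quick way to confirm which columns appeared or disappeared in a results TSV is to compare the header lines of the old and new versions. This is a minimal sketch under stated assumptions: `compare_headers` is a hypothetical helper, and the header strings below are made up for illustration, since the actual column names are not shown in this thread.

```python
def compare_headers(old_header, new_header, sep="\t"):
    """Return (added, removed) column names between two TSV header lines."""
    old_cols = old_header.rstrip("\r\n").split(sep)
    new_cols = new_header.rstrip("\r\n").split(sep)
    added = [col for col in new_cols if col not in old_cols]
    removed = [col for col in old_cols if col not in new_cols]
    return added, removed

# Hypothetical header lines; the real column names are not given here.
old_header = "PC1\tPC2\tsample_id\n"
new_header = "PC1\tPC2\tsample_id\textra_metadata\n"

added, removed = compare_headers(old_header, new_header)
print("added:", added)
print("removed:", removed)
```

If only metadata columns were added and the PC columns are unchanged, that would be consistent with updated input metadata rather than a change in the reduction itself.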
Notably, while this script catches most of the MS modules, a lot of modules relevant to the paper (e.g. molecular subtyping!) aren't part of the figure generation. I wonder if we might want to have two separate scripts for a Big Run: one to first run all analysis modules in the paper, and then one to just generate the figures. This would move all module runs out of generate-figures.sh into a new script (analyses/run-analyses.sh? figures/prepare-analyses.sh?).
Going more carefully through some of these diffs now, I am concerned about what I'm seeing with the tp53_nf1_score module. The ROC curves themselves have changed, as well as some tp53 expression:
https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/e635974914a0de892cc537b9bd5be79e0b464191/analyses/tp53_nf1_score/plots/stranded_TP53_roc.png
https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/e635974914a0de892cc537b9bd5be79e0b464191/analyses/tp53_nf1_score/plots/tp53_expression_by_altered_status_stranded.png
https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/e635974914a0de892cc537b9bd5be79e0b464191/analyses/tp53_nf1_score/plots/tp53_scores_by_altered_status.png
@sjspielman I found an issue with the TP53 module here, so this may be related.
I'm breaking out rerunning transcriptomic-dimension-reduction into its own pull request in the interest of examining that, but I'll think on this point:
I wonder if we might want to have two separate scripts for a Big Run - one to first run all analysis modules in the paper, and then one to just generate the figures. This would move out all module runs from generate-figures.sh into a new script (analyses/run-analyses.sh? figures/prepare-analyses.sh?)
Closing in favor of #1454
This PR addresses Issue #1261 and re-organizes figures/generate_figures.sh in the order of the figures as they will appear in the manuscript. Deprecated items were removed from the generation script, and figures missing from the script were added in.