Open komalsrathi opened 3 months ago
I had to manually install this library txdbmaker
That's interesting. I think I ran all the modules on Docker and didn't face any issues. Let me check.
I also reran this with the data.table::fread's tmpdir
fix, luckily doesn't change any output files.
Output created: 03-tp53-cnv-loss-domain.nb.html
03 tp53
processing file: 04-tp53-sv-loss.Rmd
|............ | 24% [unnamed-chunk-2]Error in `load_package_gracefully()`:
! Could not load package txdbmaker. Is it installed?
Note that starting with BioC 3.19, calling makeTxDbFromGFF() requires
the txdbmaker package. Please install it with:
BiocManager::install("txdbmaker")
Backtrace:
1. GenomicFeatures::makeTxDbFromGFF(...)
2. GenomicFeatures:::call_fun_in_txdbmaker("makeTxDbFromGFF", ...)
3. GenomicFeatures:::load_package_gracefully(...)
Quitting from lines 42-74 [unnamed-chunk-2] (04-tp53-sv-loss.Rmd)
Execution halted
I did get the same thing as @naqvia - I think we do need to add this new package to the docker image since we did the 4.4 update.
Will update once #118 is merged.
@komalsrathi this now runs to completion with the new docker image, however I did notice that the sv_overlap_tp53.tsv
file is now empty. This should come from the manta sv file, can you check the script pulling SVs?
Are you talking about this file: https://github.com/d3b-center/hope-cohort-analysis/blob/update-nf1-score/analyses/tp53_nf1_score/results/sv_overlap_tp53.tsv ?
Are you talking about this file: https://github.com/d3b-center/hope-cohort-analysis/blob/update-nf1-score/analyses/tp53_nf1_score/results/sv_overlap_tp53.tsv ?
yes, seems it is empty and affecting some losses
Could be the same data.table issue, checking now.
I deleted my v3 folder -> merged the master branch -> ran download script to get v3 files. As expected, only 1 file fails:
Checking MD5 hashes...
Hope-and-CPTAC-GBM-gene-expression-rsem-tpm-collapsed.rds: OK
Hope-and-CPTAC-GBM.splice-events-rmats.tsv.gz: OK
Hope-GBM-histologies.tsv: FAILED
Hope-methyl-beta-values.rds: OK
Hope-methyl-m-values.rds: OK
release-notes.md: OK
Hope-sv-manta.tsv.gz: OK
Hope-GBM-histologies-base.tsv: OK
Hope-cnv-controlfreec-tumor-only.rds: OK
Hope-cnv-controlfreec.rds: OK
Hope-fusion-putative-oncogenic.rds: OK
Hope-gene-counts-rsem-expected_count-collapsed.rds: OK
Hope-gene-counts-rsem-expected_count.rds: OK
Hope-gene-expression-rsem-tpm-collapsed.rds: OK
Hope-gene-expression-rsem-tpm.rds: OK
Hope-snv-consensus-plus-hotspots.maf.tsv.gz: OK
Hope-tumor-only-snv-mutect2.maf.tsv.gz: OK
md5sum: WARNING: 1 of 17 computed checksums did NOT match
Next, I manually soft-linked v3 files to data folder. Here is my data folder:
(base) ➜ d3b-hope-cohort-analysis git:(update-nf1-score) ls -lt data
total 356736
lrwxr-xr-x 1 rathik 1768498755 30 Aug 21 15:52 Hope-methyl-beta-values.rds -> v3/Hope-methyl-beta-values.rds
lrwxr-xr-x 1 rathik 1768498755 28 Aug 21 15:52 Hope-cnv-controlfreec.rds -> v3/Hope-cnv-controlfreec.rds
lrwxr-xr-x 1 rathik 1768498755 36 Aug 21 15:52 Hope-gene-expression-rsem-tpm.rds -> v3/Hope-gene-expression-rsem-tpm.rds
lrwxr-xr-x 1 rathik 1768498755 39 Aug 21 15:52 Hope-cnv-controlfreec-tumor-only.rds -> v3/Hope-cnv-controlfreec-tumor-only.rds
lrwxr-xr-x 1 rathik 1768498755 41 Aug 21 15:52 Hope-tumor-only-snv-mutect2.maf.tsv.gz -> v3/Hope-tumor-only-snv-mutect2.maf.tsv.gz
lrwxr-xr-x 1 rathik 1768498755 46 Aug 21 15:52 Hope-gene-expression-rsem-tpm-collapsed.rds -> v3/Hope-gene-expression-rsem-tpm-collapsed.rds
lrwxr-xr-x 1 rathik 1768498755 48 Aug 21 15:52 Hope-and-CPTAC-GBM.splice-events-rmats.tsv.gz -> v3/Hope-and-CPTAC-GBM.splice-events-rmats.tsv.gz
lrwxr-xr-x 1 rathik 1768498755 23 Aug 21 15:52 Hope-sv-manta.tsv.gz -> v3/Hope-sv-manta.tsv.gz
lrwxr-xr-x 1 rathik 1768498755 43 Aug 21 15:52 Hope-gene-counts-rsem-expected_count.rds -> v3/Hope-gene-counts-rsem-expected_count.rds
lrwxr-xr-x 1 rathik 1768498755 60 Aug 21 15:52 Hope-and-CPTAC-GBM-gene-expression-rsem-tpm-collapsed.rds -> v3/Hope-and-CPTAC-GBM-gene-expression-rsem-tpm-collapsed.rds
lrwxr-xr-x 1 rathik 1768498755 46 Aug 21 15:52 Hope-snv-consensus-plus-hotspots.maf.tsv.gz -> v3/Hope-snv-consensus-plus-hotspots.maf.tsv.gz
lrwxr-xr-x 1 rathik 1768498755 53 Aug 21 15:52 Hope-gene-counts-rsem-expected_count-collapsed.rds -> v3/Hope-gene-counts-rsem-expected_count-collapsed.rds
lrwxr-xr-x 1 rathik 1768498755 27 Aug 21 15:52 Hope-GBM-histologies.tsv -> v3/Hope-GBM-histologies.tsv
lrwxr-xr-x 1 rathik 1768498755 27 Aug 21 15:52 Hope-methyl-m-values.rds -> v3/Hope-methyl-m-values.rds
lrwxr-xr-x 1 rathik 1768498755 37 Aug 21 15:52 Hope-fusion-putative-oncogenic.rds -> v3/Hope-fusion-putative-oncogenic.rds
lrwxr-xr-x 1 rathik 1768498755 32 Aug 21 15:52 Hope-GBM-histologies-base.tsv -> v3/Hope-GBM-histologies-base.tsv
drwxr-xr-x 20 rathik 1768498755 640 Aug 21 15:47 v3
-rw-r--r-- 1 rathik 1768498755 131420160 Aug 20 19:36 txdb_from_gencode.v39.gtf.db
drwxr-xr-x 21 rathik 1768498755 672 Aug 17 08:40 tmp
drwxr-xr-x 20 rathik 1768498755 640 Aug 16 16:21 v2
-rw-r--r-- 1 rathik 1768498755 47572349 Sep 15 2023 gencode.v39.primary_assembly.annotation.gtf.gz
drwxr-xr-x 20 rathik 1768498755 640 Sep 15 2023 v1
Finally, deleted all results and plots from this module -> reran classifier -> a couple files changed but I don't see the specific file you mentioned changing. Also deleted obsolete results and plots.
Created this branch off of update-merge branch.
Data Preparation
The
tp53_nf1_score
module uses input files fromdata
folder so I manually copied files that I generated bymerge-files
module todata/v3
and then soft-linked those files underdata/
. So, these files point tov3
:And, these point to v2:
Debugging scripts
When I was re-running the module using docker, I got a
ValueError
for theroc_auc_score
function within the06-evaluate-classifier.py
script as follows:So I added a try-catch block and assigned
0
where this error is encountered. This fixed the error but could you check the outputs if it makes sense?Additionally, there is no file
results/gene-expression-rsem-tpm-collapsed-poly-A_classifier_scores.tsv
so I had to remove the command that calls06-evaluate-classifier.py
on this file in therun_classifier.sh
script. I replaced it with the command to run onresults/gene-expression-rsem-tpm-collapsed-poly-A-stranded_classifier_scores.tsv
because that file is present.Output files
There were a lot of files not updated when running the bash script, and I am unsure why:
Similarly, a lot of plots did not get updated as well: