AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project

add germline tools to KR table/update table 2 #1577

Closed jharenza closed 2 years ago

jharenza commented 2 years ago

Purpose/implementation Section

What scientific question is your analysis addressing?

Adding germline annotation tools to the KR table

What was your approach?

What GitHub issue does your pull request address?

1564

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

Is there anything that you want to discuss further?

Note: for these, since they were run by an external lab and on an internal HPC, the workflows/scripts are not going to be available, so I linked out to the actual code's exact versions. For ANNOVAR, you have to register to download the software, so I could only link to the release notes page and add the date of release.

I also saw duplicate VarDict versions, so I removed one here.

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes

Results

What types of results are included (e.g., table, figure)?

What is your summary of the results?

Reproducibility Checklist

Documentation Checklist

jharenza commented 2 years ago

This PR will also close https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/1586

jharenza commented 2 years ago

Also closes #1591

jharenza commented 2 years ago

Per Sharon's request, here I also add the actual germline variants to the table.

jharenza commented 2 years ago

FYI many of the comments I left are just to double check and extra-confirm what I suspect is right about the workflow versions.

I also took a moment to look for any "latest" tags in `dockerPull` for the workflows, and I'm seeing a couple of items pulled as "latest" that are listed with specific versions in this table. Let's also use this PR to confirm that the versions we are recording match the versions that were actually used when the workflows ran. It also might be a good idea on your end to make sure the workflows README matches what is in the code.


(base) sjs-ccdl :: pbta/OpenPBTA-workflows/cwl ‹master› » grep "dockerPull"  * | grep "latest" | uniq
annotsv.cwl:  dockerPull: gaonkark/annotsv:latest
annovar_20190319.cwl:  dockerPull: kfdrc/annovar:latest
clinvar_pathogenic_filter.cwl:      dockerPull: pgc-images.sbgenomics.com/d3b-bixu/bvcftools:latest
fusion_annotator.cwl:    dockerPull: 'pgc-images.sbgenomics.com/d3b-bixu/fusionanno:latest'
kfdrc-alignment-cram-only-wf.cwl:      dockerPull: kfdrc/picard-r:latest-dev
kfdrc-alignment-fq-input.cwl:      dockerPull: kfdrc/picard-r:latest-dev
kfdrc-alignment-fqi-nput-cram-only-wf.cwl:          dockerPull: 'kfdrc/picard-r:latest-dev'
kfdrc-alignment-wf.cwl:      dockerPull: kfdrc/picard-r:latest-dev
kfdrc-mutect2_strelka2-wf.cwl:          dockerPull: 'kfdrc/bvcftools:latest'
kfdrc_RNAseq_workflow.cwl:          dockerPull: 'kfdrc/cutadapt:latest'
kfdrc_RNAseq_workflow.cwl:          dockerPull: 'gcr.io/broad-cga-aarong-gtex/rnaseqc:latest'
kfdrc_RNAseq_workflow.cwl:          dockerPull: 'kfdrc/star:latest'
kfdrc_annot_vcf_sub_wf.cwl:      dockerPull: pgc-images.sbgenomics.com/d3b-bixu/vcfutils:latest
kfdrc_annot_vcf_sub_wf.cwl:      dockerPull: pgc-images.sbgenomics.com/d3b-bixu/bvcftools:latest
kfdrc_combined_somatic_wgs_cnv_wf.cwl:          dockerPull: 'kfdrc/bvcftools:latest'
kfdrc_strelka2_mutect2_manta_workflow.cwl:          dockerPull: 'kfdrc/bvcftools:latest'

@migbro can you check these versions for us please? Also a good fyi for updating your cwls
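
The search above could also be sketched in Python, for instance to rerun it as an automated check. This is only a minimal sketch and not part of the OpenPBTA workflows: the function name `find_floating_tags` is hypothetical, treating a missing tag or any tag starting with `latest` as "floating" is an assumption, and `dockerPull` lines are matched with a regular expression rather than by parsing the CWL as YAML.

```python
import re
from pathlib import Path

# Matches lines such as:  dockerPull: 'kfdrc/star:latest'
# Surrounding quotes are optional and are not captured.
DOCKER_PULL = re.compile(r"""dockerPull:\s*['"]?([^'"\s]+)['"]?""")


def find_floating_tags(cwl_dir):
    """Return sorted (filename, image) pairs whose tag is missing or starts with 'latest'.

    Hypothetical helper: only files ending in .cwl directly inside cwl_dir are scanned.
    """
    hits = set()  # a set drops repeated lines, like `uniq` in the shell version
    for path in Path(cwl_dir).glob("*.cwl"):
        for line in path.read_text().splitlines():
            match = DOCKER_PULL.search(line)
            if match is None:
                continue
            image = match.group(1)
            # Look for the tag only in the last path component, so a registry
            # host with a port (host:5000/name) is not mistaken for a tag.
            name = image.rsplit("/", 1)[-1]
            tag = name.split(":", 1)[1] if ":" in name else ""
            # Docker falls back to "latest" when no tag is given.
            if tag == "" or tag.startswith("latest"):
                hits.add((path.name, image))
    return sorted(hits)
```

Calling `find_floating_tags("cwl")` from the workflows checkout would return one `(file, image)` pair per unpinned image, which can then be compared against the versions recorded in the table.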

zhangb1 commented 2 years ago

| CWL file | dockerPull image | Version found in image |
| --- | --- | --- |
| annotsv.cwl | gaonkark/annotsv:latest | AnnotSV_2.1 |
| annovar_20190319.cwl | kfdrc/annovar:latest | annovar: 2018-04-16 00:47:49 -0400 (Mon, 16 Apr 2018) |
| clinvar_pathogenic_filter.cwl | pgc-images.sbgenomics.com/d3b-bixu/bvcftools:latest | bcftools 1.7 |
| fusion_annotator.cwl | pgc-images.sbgenomics.com/d3b-bixu/fusionanno:latest | no versioning |
| kfdrc-alignment-cram-only-wf.cwl | kfdrc/picard-r:latest-dev | picard 2.18.2-SNAPSHOT |
| kfdrc-alignment-fq-input.cwl | kfdrc/picard-r:latest-dev | picard 2.18.2-SNAPSHOT |
| kfdrc-alignment-fqi-nput-cram-only-wf.cwl | kfdrc/picard-r:latest-dev | picard 2.18.2-SNAPSHOT |
| kfdrc-alignment-wf.cwl | kfdrc/picard-r:latest-dev | picard 2.18.2-SNAPSHOT |
| kfdrc-mutect2_strelka2-wf.cwl | kfdrc/bvcftools:latest | bcftools 1.7 |
| kfdrc_RNAseq_workflow.cwl | kfdrc/cutadapt:latest | cutadapt 2.5 |
| kfdrc_RNAseq_workflow.cwl | gcr.io/broad-cga-aarong-gtex/rnaseqc:latest | RNASeQC 2.3.2 |
| kfdrc_RNAseq_workflow.cwl | kfdrc/star:latest | STAR_2.6.1d |
| kfdrc_annot_vcf_sub_wf.cwl | pgc-images.sbgenomics.com/d3b-bixu/vcfutils:latest | VCFtools (0.1.15) |
| kfdrc_annot_vcf_sub_wf.cwl | pgc-images.sbgenomics.com/d3b-bixu/bvcftools:latest | bcftools 1.7 |
| kfdrc_combined_somatic_wgs_cnv_wf.cwl | kfdrc/bvcftools:latest | bcftools 1.7 |
| kfdrc_strelka2_mutect2_manta_workflow.cwl | kfdrc/bvcftools:latest | bcftools 1.7 |

@jharenza Miguel is OOO starting today and through next week, so I pulled the information from the Docker images and listed the versions above.
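
Because several workflows pull the same image, one possible sanity check on the list above is that each image is reported with a single tool version. The following is a minimal sketch under the assumption that the list has been transcribed by hand into `(cwl_file, image, version)` tuples; the function names `versions_by_image` and `conflicting_images` are hypothetical.

```python
from collections import defaultdict


def versions_by_image(records):
    """Map each image to the set of version strings reported for it.

    `records` is an iterable of (cwl_file, image, version) tuples.
    """
    found = defaultdict(set)
    for _cwl_file, image, version in records:
        found[image].add(version)
    return dict(found)


def conflicting_images(records):
    """Return, sorted, the images reported with more than one version string."""
    by_image = versions_by_image(records)
    return sorted(image for image, versions in by_image.items() if len(versions) > 1)
```

An empty result from `conflicting_images` only shows that the report is internally consistent. It does not show that the image tagged `latest` today is the one that ran when the data were processed.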

jharenza commented 2 years ago

Thank you so much for the super quick response @zhangb1 !

jharenza commented 2 years ago

@sjspielman this can be re-reviewed. I added VCFtools, cutadapt, and updated RNA-SeQC

sjspielman commented 2 years ago

I added VCFtools, cutadapt, and updated RNA-SeQC

I see these versions added in, but I don't know how to cross-reference them to review and confirm the versions, since I only have access to the workflows, which still say "latest". Edit: never mind, it's here! https://github.com/AlexsLemonade/OpenPBTA-analysis/pull/1577#issuecomment-1206606483

jharenza commented 2 years ago

@sjspielman I am going to go back to RNA-SeQC and update per message from Bp:

my mistake, I checked the pipeline we used for PBTA, should be
RNA-SeQC v2.3.4 Generate metrics such as gene and transcript counts, sense/antisense mapping, mapping rates, etc

not 2.4.2 ....
2.4.2 is the latest in docker
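
The correction above is the kind of discrepancy a recorded-versus-confirmed comparison would surface. A minimal sketch, assuming both sets of versions are available as plain dictionaries keyed by tool name; the function name `version_mismatches` and the dictionary layout are hypothetical and not taken from the repository.

```python
def version_mismatches(recorded, confirmed):
    """Return {tool: (recorded_version, confirmed_version)} for tools that differ.

    Tools absent from `confirmed` are reported with None so that they are
    not silently treated as verified.
    """
    mismatches = {}
    for tool, recorded_version in recorded.items():
        confirmed_version = confirmed.get(tool)
        if confirmed_version != recorded_version:
            mismatches[tool] = (recorded_version, confirmed_version)
    return mismatches
```

For example, if the table listed RNA-SeQC as 2.4.2 while the pipeline used for PBTA ran 2.3.4, the function would return that tool with both values so the table entry can be corrected.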