mskcc / tempo

CCS research pipeline to process WES and WGS TN pairs
https://cmotempo.netlify.com/

Implement GATK4/Picard CollectHSMetrics for WES only #438

Closed evanbiederstedt closed 4 years ago

evanbiederstedt commented 5 years ago

I think we decided today to implement CollectHsMetrics from GATK4/Picard (they're the same tool nowadays).

https://github.com/mskcc/vaporware/issues/389

gatk CollectHsMetrics \
    I=input_reads.bam \
    O=output_hs_metrics.txt \
    R=reference.fasta \
    BAIT_INTERVALS=bait.interval_list \
    TARGET_INTERVALS=target.interval_list

https://software.broadinstitute.org/gatk/documentation/tooldocs/4.0.5.1/picard_analysis_directed_CollectHsMetrics.php

We only do this for WES, not WGS. Here are the necessary bait and target files, which I got from @kpjonsson:

https://github.com/mskcc/vaporware/blob/feature/GATKCollectMetrics/conf/references.config#L51-L58

idtTargets = "${params.reference_base}/mskcc-igenomes/grch37/targets/IDT_Exome_v1_FP_b37_targets.plus5bp.bed.gz"
idtTargetsIndex = "${idtTargets}.tbi"
idtBaits = "${params.reference_base}/mskcc-igenomes/grch37/baits/IDT_Exome_v1_FP_b37_baits.bed.gz"
idtBaitsIndex = "${idtBaits}.tbi"
agilentTargets = "${params.reference_base}/mskcc-igenomes/grch37/targets/AgilentExon_51MB_b37_v3_targets.plus5bp.bed.gz"
agilentTargetsIndex = "${agilentTargets}.tbi"
agilentBaits = "${params.reference_base}/mskcc-igenomes/grch37/baits/AgilentExon_51MB_b37_v3_baits.bed.gz"
agilentBaitsIndex = "${agilentBaits}.tbi"
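
One detail worth spelling out: CollectHsMetrics takes its BAIT_INTERVALS and TARGET_INTERVALS in Picard interval_list format, while the files above are compressed BED files, so a conversion step (for example with Picard's BedToIntervalList) would likely be needed first. The following is only a rough dry-run sketch of how that could look. It prints the commands rather than running them. The assay labels `idt` and `agilent`, the `REF_BASE` variable (standing in for `params.reference_base`), the function names, and all output file names are assumptions made for illustration, not anything defined in the pipeline. Decompressing the BED before conversion is also an assumption; whether BedToIntervalList reads gzipped input directly has not been checked here.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands rather than executing them.
# REF_BASE stands in for params.reference_base; the default is a placeholder.
REF_BASE="${REF_BASE:-/path/to/reference_base}"
GRCH37="${REF_BASE}/mskcc-igenomes/grch37"

# Map a hypothetical assay label to the file stem used in references.config.
assay_stem() {
  case "$1" in
    idt)     echo "IDT_Exome_v1_FP_b37" ;;
    agilent) echo "AgilentExon_51MB_b37_v3" ;;
    *)       echo "unknown assay: $1" >&2; return 1 ;;
  esac
}

# Print the conversion and metrics commands for one WES BAM.
# Args: assay label, BAM path, reference FASTA, sequence dictionary.
hs_metrics_cmds() {
  local assay="$1" bam="$2" ref="$3" dict="$4" stem kind bed suffix
  stem="$(assay_stem "$assay")" || return 1
  for kind in baits targets; do
    # Targets carry a ".plus5bp" suffix in the config; baits do not.
    suffix=""
    [ "$kind" = "targets" ] && suffix=".plus5bp"
    bed="${GRCH37}/${kind}/${stem}_${kind}${suffix}.bed"
    # Assumption: decompress first; local output names are illustrative.
    echo "gunzip -c ${bed}.gz > ${stem}_${kind}.bed"
    echo "gatk BedToIntervalList I=${stem}_${kind}.bed O=${stem}_${kind}.interval_list SD=${dict}"
  done
  echo "gatk CollectHsMetrics I=${bam} O=${bam%.bam}.hs_metrics.txt R=${ref} BAIT_INTERVALS=${stem}_baits.interval_list TARGET_INTERVALS=${stem}_targets.interval_list"
}

# Example: print the commands for an IDT exome BAM.
hs_metrics_cmds idt tumor.bam reference.fasta reference.dict
```

An unknown label (for instance a WGS sample) makes the function return non-zero without printing anything, which matches the WES-only scope of this issue.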

evanbiederstedt commented 5 years ago

https://github.com/mskcc/vaporware/pull/453