bespin-workflows / exomeseq-gatk4

Whole Exome Sequencing in CWL using GATK4
MIT License

Concern over DP annotation used in VariantRecalibration #11

Status: Closed (johnbradley closed this issue 4 years ago)

johnbradley commented 5 years ago

While working on a fix for bespin-workflows/exomeseq-gatk3, I came across documentation saying that the DP annotation should not be used with exome datasets. I think this documentation still applies to GATK4: https://gatkforums.broadinstitute.org/gatk/discussion/1259/which-training-sets-arguments-should-i-use-for-running-vqsr It states:

Depth of coverage (the DP annotation invoked by Coverage) should not be used when working with exome datasets since there is extreme variation in the depth to which targets are captured! In whole genome experiments this variation is indicative of error but that is not the case in capture experiments.

Code where we are currently specifying DP: https://github.com/bespin-workflows/exomeseq-gatk4/blob/d7900c7cbccbbae0a1bd7d44005855170b1eb1ea/subworkflows/exomeseq-gatk4-02-variantdiscovery.cwl#L135-L142

https://github.com/bespin-workflows/exomeseq-gatk4/blob/d7900c7cbccbbae0a1bd7d44005855170b1eb1ea/subworkflows/exomeseq-gatk4-02-variantdiscovery.cwl#L143-L150
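For illustration, the change amounts to dropping DP from the list of annotations passed to VariantRecalibrator with `-an`. The sketch below is not taken from the workflow: the helper name, the `exome` switch, and the example annotation list are all assumptions made for the example, and the real annotation lists live in the CWL file linked above.

```python
# Hypothetical helper: names and the sample annotation list are illustrative,
# not copied from exomeseq-gatk4-02-variantdiscovery.cwl.

# Annotations to skip for exome data; capture depth varies widely per target.
EXOME_EXCLUDED = {"DP"}


def recalibration_annotation_args(annotations, exome=True):
    """Build '-an NAME' argument pairs, dropping DP when exome is True."""
    kept = [a for a in annotations if not (exome and a in EXOME_EXCLUDED)]
    args = []
    for name in kept:
        args.extend(["-an", name])
    return args


# Assumed example list that includes DP, as a whole-genome run might.
snp_annotations = ["QD", "DP", "FS", "SOR", "MQ", "MQRankSum", "ReadPosRankSum"]
print(" ".join(recalibration_annotation_args(snp_annotations)))
# -> -an QD -an FS -an SOR -an MQ -an MQRankSum -an ReadPosRankSum
```

Passing `exome=False` leaves the list unchanged, so the same helper could serve a whole-genome run where DP is still informative.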

dleehr commented 5 years ago

I discussed this with Joey today, and he recommended not using the DP annotation here.