Gleeson-Lab / wxs_pipeline

Starting with BAMs and FASTQs, follow GATK 4.0 Best Practices up to generating a joint-genotyped VCF

Change BED File for WES #7

Open brcopeland opened 2 years ago

brcopeland commented 2 years ago

The BED file is used for getting coverage information and is currently set to just the CCDS regions, so all additional capture targets are ignored.

shishenyxx commented 2 years ago

Alternatively, based on the depth QC, we could generate a separate BED file of regions with, say, average depth >5, and use it in the follow-up variant calling, annotation, and filtering. Not sure whether this is much faster than just running on the whole genome and filtering out low-depth regions afterwards, though.
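A minimal sketch of that depth-based BED idea, not something the pipeline currently does. It assumes the depth QC produced a tab-separated per-region file (for example what mosdepth writes as `<prefix>.regions.bed.gz` when run with `--by`), with the mean depth in the last column. The threshold of 5 comes from the suggestion above; `filter_regions`, `write_filtered_bed`, and the file paths are hypothetical names for illustration.

```python
import gzip
from typing import Iterable, Iterator, Tuple

# Threshold taken from the ">5" suggestion above; an assumption, tune as needed.
MIN_MEAN_DEPTH = 5.0


def filter_regions(
    lines: Iterable[str], min_depth: float = MIN_MEAN_DEPTH
) -> Iterator[Tuple[str, int, int]]:
    """Yield (chrom, start, end) for regions whose mean depth exceeds min_depth.

    Assumes tab-separated BED-like lines with the mean depth in the LAST
    column, so an optional name column in between is tolerated.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if float(fields[-1]) > min_depth:
            yield chrom, start, end


def write_filtered_bed(
    regions_path: str, out_path: str, min_depth: float = MIN_MEAN_DEPTH
) -> int:
    """Write the passing regions of a gzipped per-region depth file as a
    plain 3-column BED and return how many regions were kept."""
    kept = 0
    with gzip.open(regions_path, "rt") as src, open(out_path, "w") as dst:
        for chrom, start, end in filter_regions(src, min_depth):
            dst.write(f"{chrom}\t{start}\t{end}\n")
            kept += 1
    return kept
```

Something like `write_filtered_bed("sample.regions.bed.gz", "sample.depth_gt5.bed")` (both paths made up) would then give a BED that could be passed to the downstream steps. Whether that actually saves time compared with filtering afterwards is the open question here.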

brcopeland commented 2 years ago

Do you think this refinement would improve the quality of the data produced, and if so, how? mosdepth and similar tools I've used previously (e.g. gatk DepthOfCoverage) do not by default report per-base coverage. That would necessitate substantially larger files.

shishenyxx commented 2 years ago

That's true. For DeepMosaic input, just a rough estimate for the region should be enough. Strelka2 only needs the coverage BED, if I remember correctly, and that just makes it a little bit faster, so maybe just use the whole-genome BED. It will only cause issues when the BED regions have very different depth performance (for WES/AmpliSeq, for example, the capture/amplification efficiency differs from region to region). I haven't really looked into this, though. In MosaicHunter we just made a model for everything.