nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License

Target Coverage in BAMQC #96

Closed maxulysse closed 4 years ago

maxulysse commented 4 years ago

Issue by @apeltzer, moved from SciLifeLab#670

Is your feature request related to a problem? Please describe.

The target coverage is computed in BAMQC based on the entire genome. For exome data, even with a specified BED file (and therefore defined target regions), Sarek doesn't report the coverage on the targeted sites but instead the overall coverage across the entire genome.
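To make the size of the discrepancy concrete, here is a rough sketch of the arithmetic. All numbers are hypothetical round values chosen for illustration, and it assumes for simplicity that every aligned base falls inside the target, which real capture data will not satisfy exactly.

```python
def mean_coverage(total_aligned_bases: int, region_size: int) -> float:
    """Mean depth = aligned bases divided by the size of the region considered."""
    return total_aligned_bases / region_size


# Hypothetical round numbers, for illustration only.
GENOME_SIZE = 3_000_000_000      # ~3 Gb reference genome
TARGET_SIZE = 60_000_000         # ~60 Mb capture target
ON_TARGET_BASES = 6_000_000_000  # aligned bases, assumed to be all on target

genome_wide = mean_coverage(ON_TARGET_BASES, GENOME_SIZE)
on_target = mean_coverage(ON_TARGET_BASES, TARGET_SIZE)

print(f"genome-wide mean coverage: {genome_wide:.1f}x")
print(f"on-target mean coverage:   {on_target:.1f}x")
```

With these toy values the same data looks like 2x when averaged over the genome but 100x when restricted to the target, which is why the genome-wide figure is not very informative for exome runs.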

Describe the solution you'd like

Use the -gff (--feature-file) switch in QualiMap2 to run with the specified BED file. That gives a more accurate coverage estimate for target capture data.

Comments: @szilvajuhos

Cool, other groups were also requesting something similar. I already made a prototype, and we can also add the QualiMap2 part. Will work on that.

@apeltzer

Just make sure to use a more recent QualiMap2 version. The possibility to use BED-3 instead of BED-6 format was introduced only fairly recently (upon my request... https://bitbucket.org/kokonech/qualimap/commits/all ). I don't know which version of QualiMap2 is shipped with the current Sarek container(s), so we should make sure that this works :-)
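If a container turns out to ship a QualiMap2 version that still expects six columns, one possible workaround is to pad the BED-3 file before passing it on. The following is only a sketch: the function name `bed3_to_bed6` is hypothetical, and the placeholder name, score and strand values are assumptions that should be checked against the QualiMap2 version actually in use.

```python
def bed3_to_bed6(lines):
    """Pad 3-column BED records to 6 columns with placeholder name, score and strand."""
    out = []
    for i, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) >= 6:
            # Already BED-6 (or wider): keep the first six columns unchanged.
            out.append("\t".join(fields[:6]))
            continue
        chrom, start, end = fields[:3]
        # Placeholder values: name, score and strand are assumptions, not real annotations.
        out.append("\t".join([chrom, start, end, f"region_{i}", "0", "+"]))
    return out


if __name__ == "__main__":
    example = ["chr1\t100\t200\n", "chr2\t500\t900\n"]
    for record in bed3_to_bed6(example):
        print(record)
```

Note that a made-up strand is only harmless if the downstream analysis is not strand-specific.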

skrakau commented 4 years ago

Isn't this solved already?

maxulysse commented 4 years ago

Unfortunately not yet, I'll add that to my more urgent pile

skrakau commented 4 years ago

Is the information not already specified with use_bed = params.targetBED ? "-gff ${targetBED}" in the BamQC process?
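For readers who do not have the pipeline code at hand, the conditional quoted above amounts to adding the feature-file flag only when a target BED is supplied. A minimal Python sketch of that logic follows; it is not the actual Sarek process, and the helper name `bamqc_args` and the file names are hypothetical.

```python
from typing import List, Optional


def bamqc_args(bam: str, outdir: str, target_bed: Optional[str] = None) -> List[str]:
    """Build a `qualimap bamqc` argument list, adding -gff only if a BED file is given."""
    args = ["qualimap", "bamqc", "-bam", bam, "-outdir", outdir]
    if target_bed:
        # Restrict coverage statistics to the regions in the feature file.
        args += ["-gff", target_bed]
    return args


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(bamqc_args("sample.bam", "qc_out")))
    print(" ".join(bamqc_args("sample.bam", "qc_out", "targets.bed")))
```

Without a BED file the command is unchanged, so whole-genome runs keep reporting genome-wide coverage.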

maxulysse commented 4 years ago

Oooh yes, you're right, I must have done that within the SciLifeLab repo and forgotten to close the issue... I must have copied the issue over without checking whether it was already done... :man_facepalming: