nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License
397 stars 405 forks source link

Create a sentieon specific subworkflow for DSL2 #405

Open maxulysse opened 3 years ago

maxulysse commented 3 years ago

EDIT (from @cjfields): Modules checklist:

cjfields commented 2 years ago

Checklist (some are already in place)

Modules

FriederikeHanssen commented 2 years ago

BQSR would be done also with a Sentieon subtool or GATK?

maxulysse commented 2 years ago

Sentieon has that if I remember well

cjfields commented 2 years ago

Sentieon has that if I remember well

Correct; it also supports VQSR though I'm not sure if that's on the roadmap here, I've been running it separately.

FriederikeHanssen commented 2 years ago

carefully toying around with it, is the correct answer I would say.

nicorap commented 2 years ago

we have a few modules ready from the danish national genome center, I'll just a add them and make the missing ones also with a sub workflow.

cjfields commented 2 years ago

Small note: in the 2.7.2 workflow we found a bug which skips Sentieon DNAscope and DNAseq when using manually recalibrated BAMs, which seems to be due to a difference in how GATK and Sentieon steps are implemented. GATK HaplotypeCaller, Manta, etc) appear to use a recalibrated BAM file with index here.

However, the next line of code indicates Sentieon steps are using the deduped BAM + index + recalibration table, and the related processes require that table. So, when given a recalibrated BAM + index, this effectively skips Sentieon calling altogether. The input channel for those steps also isn't generated when given a table like this.

The reason I point this out: we can add a simple fix for this and test it, but it's worth noting that all others steps apart from Sentieon use an already recalibrated BAM file and the Sentieon BQSR generates this BAM file anyway, so I'm not sure of the benefit of having this oddly unique step just for Sentieon. Maybe something to consider for the DSL2 implementation?

nicorap commented 2 years ago

That makes sense, and also explain why I never got to have sentieon running using 2.7.2 from bam. 😐. I think it’s important to be able to skip BQSR, as for production setups the data should always be the same, so it’s not needed.

maxulysse commented 8 months ago

@asp8200 I feel we're getting close to finalize this issue

asp8200 commented 8 months ago

TNseq and TNscope still not implemented :-/

maxulysse commented 8 months ago

yeah I know, but we're getting closer and closer to completion

cjfields commented 8 months ago

@maxulysse you are awesome for taking this one, we're about to get this running for some of our projects.

EDIT: just to add, we're not using TNseq/TNscope for this one (it's all germline) but we can certainly try it out.

maxulysse commented 8 months ago

@asp8200 did all the work, he should get the praise. You should be able to test it, it's in the latest release already