Open poddarharsh15 opened 4 months ago
Hey! The regions option is already implemented and can be managed by providing a BED file to --intervals. This is then used for all relevant steps in the pipeline: https://github.com/nf-core/sarek/blob/b5b766d3b4ac89864f2fa07441cdc8844e70a79e/modules/nf-core/deepvariant/main.nf#L31
The haploid contigs, we can add. In the meantime, you could provide those via a custom config, see docs
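As a rough sketch of the --intervals route mentioned above, the BED file can also be set in a custom config rather than on the command line. The file names custom.config and regions.bed are placeholders and do not come from this thread:

```groovy
// custom.config (hypothetical file name)
params {
    // Same effect as passing --intervals regions.bed on the command line.
    // regions.bed is a placeholder path to your own BED file.
    intervals = 'regions.bed'
}
```

The file would then be passed to the run with Nextflow's -c option, e.g. -c custom.config.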
Hi @FriederikeHanssen, does something like this make sense?
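process {
    // Fully qualified process names contain colons, so the selector must be quoted.
    withName: 'NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_GERMLINE_ALL:BAM_VARIANT_CALLING_DEEPVARIANT' {
        // Single quotes outside so the inner double quotes reach the command line intact.
        ext.args = '--haploid_contigs="chrX,chrY"'
    }
}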
process {
    withName: 'DEEPVARIANT' {
        ext.args = '--haploid_contigs="chrX,chrY"'
    }
}
Inspired by: https://github.com/google/deepvariant/blob/r1.6.1/docs/deepvariant-haploid-support.md
Yes, sorry, I missed your answer:
process {
    withName: 'NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_GERMLINE_ALL:BAM_VARIANT_CALLING_DEEPVARIANT' {
        ext.args = '--haploid_contigs="chrX,chrY"'
    }
}
looks right.
Just a note: ext.args is not additive. So if there are other arguments you want to carry over from conf/deepvariant.config, you will need to add those in as well.
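Because ext.args replaces the pipeline default rather than extending it, a custom config has to restate any default arguments it still needs. A minimal sketch, assuming the default for this process only selects the model type from params.wes (this is an assumption, so check conf/deepvariant.config in the sarek version you are running):

```groovy
process {
    withName: 'NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_GERMLINE_ALL:BAM_VARIANT_CALLING_DEEPVARIANT' {
        // A closure is evaluated when the task runs, so params.wes is resolved then.
        ext.args = { [
            // Assumed pipeline default, restated here because ext.args is not additive.
            params.wes ? '--model_type WES' : '--model_type WGS',
            // The extra argument discussed in this thread.
            '--haploid_contigs="chrX,chrY"'
        ].join(' ') }
    }
}
```

Joining a list keeps each argument on its own line, which makes it easier to compare against the pipeline's defaults after an update.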
Hi @FriederikeHanssen
Thank you for your response. Unfortunately, these parameters won’t work because the DeepVariant version that Sarek is using does not recognize the --haploid_contigs parameter. I tried updating the module to version 1.6.0 and running it locally on my cluster, and while I was able to detect some variants in chrX, I could not detect any in chrY. I will add the link to the Slack discussion where another developer was assisting me for your reference.
Here
Hi @asp8200, thanks for adding the link. I missed it because I was on my phone.
Description of feature
Description: I am writing to request the addition of parameters for specifying haploid contigs and regions when detecting SNPs and Indels using DeepVariant in the nf-core/sarek pipeline. These parameters are essential for our benchmarking and analysis using GIAB data.
Proposed Parameters:
Use Case: Including these parameters will allow users to define specific regions and haploid contigs for their analysis, improving the flexibility and accuracy of the SNP and Indel detection process.
Example Usage:
This example demonstrates how users can specify the haploid contigs and regions in the params.json file.
Benefits:
Enhanced control over the genomic regions being analyzed.
Improved accuracy for SNP and Indel detection, especially in specialized cases like haploid genomes.
Thank you for considering this request. Your assistance in improving the nf-core/sarek pipeline is greatly appreciated. @maxulysse
Docs for help 1.
2.
Best regards, Harsh Poddar