Open niyati1211 opened 2 months ago
hi, I suspect the difference is in how your pipeline is calling mosdepth. Can you show the actual arguments sent to mosdepth for each run?
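One way to recover the actual arguments, assuming the Nextflow `work` directories from both runs are still on disk: Nextflow writes the rendered command for each task to a `.command.sh` file in that task's work directory. The sketch below is only an illustration (the function name `find_mosdepth_commands` is made up here, not part of Nextflow or sarek); it walks a work directory and prints every mosdepth command line it finds, joining backslash-continued lines.

```python
from pathlib import Path


def find_mosdepth_commands(work_dir):
    """Return (script_path, command) pairs for each mosdepth call under work_dir.

    Hypothetical helper: assumes the standard Nextflow layout in which every
    task directory contains a rendered `.command.sh` script.
    """
    hits = []
    for script in sorted(Path(work_dir).rglob(".command.sh")):
        # Join backslash-continued lines so a multi-line command becomes one line.
        text = script.read_text(errors="replace").replace("\\\n", " ")
        for line in text.splitlines():
            command = " ".join(line.split())  # collapse runs of whitespace
            if command.startswith("mosdepth"):
                hits.append((script, command))
    return hits


if __name__ == "__main__":
    import sys

    # Default to ./work, the usual Nextflow work directory name.
    work_dir = sys.argv[1] if len(sys.argv) > 1 else "work"
    for script, command in find_mosdepth_commands(work_dir):
        print(f"{script.parent}\n  {command}")
```

Running this once against each run's work directory and comparing the printed commands side by side should show whether the flags differ.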
Thank you for the quick response, brent!
I believe this is how the pipeline calls mosdepth (https://github.com/nf-core/sarek/blob/f034b737630972e90aeae851e236f9d4292b9a4f/conf/modules/modules.config#L54). I am not sure whether the arguments sent to mosdepth differed between the two runs, since I was using different versions of the pipeline (the older version uses mosdepth 0.3.3 and the newer version uses 0.3.8). I will have to check with the nf-core/sarek support team.
withName: 'MOSDEPTH' {
    ext.args = { !params.wes ? "-n --fast-mode --by 500" : "" }
    ext.prefix = {
        if (params.tools && params.tools.split(',').contains('sentieon_dedup')) {
            "${meta.id}.dedup"
        } else if (params.skip_tools && params.skip_tools.split(',').contains('markduplicates')) {
            "${meta.id}.sorted"
        } else {
            "${meta.id}.md"
        }
    }
    ext.when = { !(params.skip_tools && params.skip_tools.split(',').contains('mosdepth')) }
    publishDir = [
        mode: params.publish_dir_mode,
        path: { "${params.outdir}/reports/mosdepth/${meta.id}" },
        saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
    ]
}

if ((params.step == 'mapping' || params.step == 'markduplicates' || params.step == 'prepare_recalibration' || params.step == 'recalibrate') && (!(params.skip_tools && params.skip_tools.split(',').contains('baserecalibrator')))) {
    withName: 'NFCORE_SAREK:SAREK:CRAM_QC_RECAL:MOSDEPTH' {
        ext.prefix = { "${meta.id}.recal" }
    }
    withName: 'NFCORE_SAREK:SAREK:CRAM_QC_RECAL:SAMTOOLS_STATS' {
        ext.prefix = { "${meta.id}.recal.cram" }
        ext.when = { !(params.skip_tools && params.skip_tools.split(',').contains('samtools')) }
        publishDir = [
            mode: params.publish_dir_mode,
            path: { "${params.outdir}/reports/samtools/${meta.id}" },
            saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
        ]
    }
}
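To make the `ext.args` line above easier to reason about, here is a minimal Python sketch of the same ternary. It only mirrors the config fragment at the pinned revision shown above; the function name `mosdepth_ext_args` is made up for illustration, and the older pipeline version may have set different arguments.

```python
def mosdepth_ext_args(wes: bool) -> str:
    """Mirror of: ext.args = { !params.wes ? "-n --fast-mode --by 500" : "" }

    Hypothetical helper for illustration only; `wes` stands in for params.wes.
    """
    # With --wes set, this config adds no extra mosdepth flags.
    # Without it, mosdepth gets -n (no per-base output), --fast-mode,
    # and --by 500 (500 bp windows).
    return "" if wes else "-n --fast-mode --by 500"


for wes in (True, False):
    print(f"params.wes={wes!r}: extra args = {mosdepth_ext_args(wes)!r}")
```

So, at least for this revision, whether `--wes` was passed to the pipeline changes the flags mosdepth receives, which is one thing worth comparing between the two runs.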
I ran mosdepth twice on the same WES samples and got different coverage values. I used the same exome intervals BED file and reference genome build (hg38). The version of mosdepth differed between the two runs: 0.3.3 vs 0.3.8. Mosdepth was run within the nf-core/sarek pipeline.
For example, in one of the samples, the median coverage is 69x (0.3.3) vs 100x (0.3.8). I am confused as to why there is such a large difference. Is this expected? Any insight will be appreciated!