lored322 closed this 5 months ago
Thank you for reporting this issue! We will include one of the solutions in the next pipeline version.
If you can't wait until then, you can fix this yourself by removing the temp() flag from the following lines of code:
This line turns into:
index="results/{dataset}/mapping/" + REF_NAME + "/{sample}.merged.rmdup.merged.realn.bam.bai",
And this line turns into:
index="results/{dataset}/mapping/" + REF_NAME + "/{sample}.merged.rmdup.merged.{processed}.mapped_q30.subs_dp{DP}.bam.bai",
Rule 7_mlRho requests the .bam.bai files as input even though it only uses the .bam files. However, in both the 3.1 and 3.3 bam processing steps, these .bam.bai files are marked as temporary and are deleted at the end of the pipeline run. Thus, if any files are missing from the expected output of 3.1 (i.e. sorted bams), the pipeline will remap all affected samples from the beginning.
This differs from the 4_genotyping rule, which depends only on the final bam file from 3.1/3.2/3.3, so the absence of the .bam.bai file does not trigger a remapping of any samples.
One option would be to make the .bam.bai files non-temporary; another is to remove the code that requests them in 7_mlRho.smk (which works).
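
To see which samples are exposed to this before starting a run, it may help to list the bam files that have no index next to them. Below is a minimal sketch in plain Python, assuming the index is named by appending `.bai` to the bam file name (as in the paths quoted in this thread). The function name `bams_missing_index` and its directory argument are hypothetical and not part of the pipeline.

```python
from pathlib import Path


def bams_missing_index(mapping_dir):
    """Return the *.bam files under mapping_dir that have no *.bam.bai beside them."""
    missing = []
    # rglob("*.bam") only matches names ending in ".bam", so ".bam.bai" files are skipped.
    for bam in sorted(Path(mapping_dir).rglob("*.bam")):
        index = bam.with_name(bam.name + ".bai")  # sample.bam -> sample.bam.bai
        if not index.exists():
            missing.append(bam)
    return missing
```

Calling `bams_missing_index(...)` on the mapping directory for a given dataset and reference returns the bam files whose index was removed. This only reports the state of the files on disk; it does not predict exactly which jobs Snakemake will schedule.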