Closed by max-hence 3 months ago
Hi Max, yes, this is possible using Snakemake's temp() function to mark output files as temporary. This instructs Snakemake to delete the marked file(s) as soon as no downstream rule in the workflow still requires them.
For example, you can mark the deduplicated BAMs as temporary like so:
# snpArcher/workflow/rules/fastq2bam.smk
rule dedup:
    input:
        unpack(dedup_input)
    output:
        # dedupBam = "results/{refGenome}/bams/{sample}_final.bam",
        # dedupBai = "results/{refGenome}/bams/{sample}_final.bam.bai",
        dedupBam = temp("results/{refGenome}/bams/{sample}_final.bam"),  # marked temp
        dedupBai = temp("results/{refGenome}/bams/{sample}_final.bam.bai"),  # marked temp
    conda:
        "../envs/sambamba.yml"
    resources:
        threads = resources['dedup']['threads'],
        mem_mb = lambda wildcards, attempt: attempt * resources['dedup']['mem']
    log:
        "logs/{refGenome}/sambamba_dedup/{sample}.txt"
    benchmark:
        "benchmarks/{refGenome}/sambamba_dedup/{sample}.txt"
    shell:
        "sambamba markdup -t {threads} {input.bam} {output.dedupBam} 2> {log}"
You can apply the same pattern to any other intermediate output files from the workflow that you don't want to keep.
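For instance, to also drop intermediate fastqs once they have been consumed, you could wrap them in temp() in the rule that produces them. Here is a minimal, hypothetical sketch (the rule name, paths, and fastp command below are illustrative, not snpArcher's actual code):

```python
# Hypothetical example -- rule and path names are illustrative only.
rule trim_reads:
    input:
        r1 = "results/fastq/{sample}_R1.fastq.gz",
        r2 = "results/fastq/{sample}_R2.fastq.gz",
    output:
        # Once every rule that consumes these trimmed fastqs has run,
        # Snakemake deletes them automatically.
        r1 = temp("results/trimmed/{sample}_R1.fastq.gz"),
        r2 = temp("results/trimmed/{sample}_R2.fastq.gz"),
    shell:
        "fastp -i {input.r1} -I {input.r2} -o {output.r1} -O {output.r2}"
```

Note that temp() can only be applied to rule outputs; input files you supply yourself (e.g. fastqs that no rule generates) are never deleted by Snakemake.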
Hi Cade, I'll try that. Thanks for the quick answer!
Dear snpArcher developers,
I have a question about your useful software. I'm working on a small server where I can't store all the fastqs at the same time. Since snpArcher seems to run mapping and SNP calling on each sample independently, I was wondering whether I could add some lines to delete the fastq and BAM files after SNP calling, before all samples are merged together. Or would that break everything?
Thank you for your answer.
Max Brault