Clinical-Genomics / BALSAMIC

Bioinformatic Analysis pipeLine for SomAtic Mutations In Cancer
https://balsamic.readthedocs.io/
MIT License

[Bug] Fix tmp-dirs in some leaky rules #1446

Open mathiasbio opened 3 weeks ago

mathiasbio commented 3 weeks ago

Description

At the moment, the use of tmp dirs is inconsistent across the rules in our Snakemake workflows. Some rules have this in params: tmpdir = tempfile.mkdtemp(prefix=tmp_dir) and this in the command:

mkdir -p {params.tmpdir};
export TMPDIR={params.tmpdir};
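For reference, a minimal sketch of what that params expression does, assuming tmp_dir is a plain path string. tempfile.mkdtemp treats prefix as a string prefix rather than a parent directory, so whether the new directory ends up inside tmp_dir depends on a trailing separator. The call also runs when the Snakefile is parsed, not when the job's shell command runs, which is presumably why the command repeats mkdir -p. The names make_rule_tmpdir, base and parent are hypothetical and only used for illustration.

```python
import os
import tempfile


def make_rule_tmpdir(tmp_dir: str) -> str:
    # Same call as in the rule params: prefix is prepended to a random name.
    return tempfile.mkdtemp(prefix=tmp_dir)


# Hypothetical base directory standing in for the pipeline's tmp location.
base = os.path.realpath(tempfile.mkdtemp())
parent = os.path.join(base, "tmp")
os.makedirs(parent)

# With a trailing separator the directory is created inside parent.
inside = make_rule_tmpdir(parent + os.sep)

# Without it the directory is created next to parent, with a name starting "tmp".
sibling = make_rule_tmpdir(parent)

print(os.path.dirname(inside) == parent)
print(os.path.dirname(sibling) == base)
```

If tmp_dir is defined without a trailing separator, the per-rule directories would be scattered next to it instead of inside it, which is worth checking when making the rules consistent.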

Other rules don't set a tmpdir at all, such as:

rule cadd_annotate_somaticINDEL_research:
  input:
    vcf_indel_research = vcf_dir + "SNV.somatic.{case_name}.{var_caller}.indel.research.vcf.gz",
  output:
    cadd_indel_research = vep_dir + "SNV.somatic.{case_name}.{var_caller}.cadd_indel.research.tsv.gz",
  benchmark:
    Path(benchmark_dir, "vep_somatic_research_snv.{case_name}.{var_caller}.tsv").as_posix()
  singularity:
    Path(singularity_image, config["bioinfo_tools"].get("cadd") + ".sif").as_posix()
  params:
    message_text = "SNV.somatic.{case_name}.{var_caller}.research.vcf.gz",
  threads:
    get_threads(cluster_config, "cadd_annotate_somaticINDEL_research")
  message:
    "Running cadd annotation for INDELs on {params.message_text}"
  shell:
        """
CADD.sh -g GRCh37 -o {output.cadd_indel_research} {input.vcf_indel_research}
        """

This rule failed in production because tmp was too full on the worker node compute-0-27 (it was close to full because a leaky rule in my development branch had saved a bunch of in-progress BAM file chunks there).

We should find the correct way to assign tmpdirs and make it consistent across all of our rules.
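One possible direction, sketched under assumptions: keep the mkdir/export lines in a single helper so that every rule gets the same setup. The function names tmpdir_preamble and with_tmpdir are hypothetical and not part of BALSAMIC, and the paths are placeholders. Whether a given tool (CADD.sh, for instance) actually honours TMPDIR would still need to be checked per tool.

```python
import shlex


def tmpdir_preamble(tmpdir: str) -> str:
    """Return the shell lines that create tmpdir and point TMPDIR at it."""
    quoted = shlex.quote(tmpdir)  # protects paths containing spaces etc.
    return f"mkdir -p {quoted};\nexport TMPDIR={quoted};\n"


def with_tmpdir(tmpdir: str, command: str) -> str:
    """Prefix a rule's shell command with the shared tmpdir setup."""
    return tmpdir_preamble(tmpdir) + command


# Placeholder path and file names, for illustration only.
print(with_tmpdir("/path/to/tmp/abc", "CADD.sh -g GRCh37 -o out.tsv.gz in.vcf.gz"))
```

Having one helper means a rule can no longer silently omit the setup, and a later change (for example adding cleanup) only has to be made in one place.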

How to reproduce

No response

Expected behaviour

No response

Anything else?

No response

Pipeline version

15.0.0

mathiasbio commented 3 weeks ago

From Eva 2024-06-14: Another leaky rule seems to be picard_umiaware:

OpenJDK 64-Bit Server VM warning: Insufficient space for shared memory file:
   75256
Try using the -Djava.io.tmpdir= option to select an alternate temp location.
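A minimal sketch of the option the warning suggests, assuming the tool is started through a plain java -jar call. How picard_umiaware actually invokes Picard is not shown here, so java_command, the jar name and the tool arguments below are hypothetical placeholders. The one fixed point is that JVM options such as -Djava.io.tmpdir must come before -jar.

```python
from typing import List


def java_command(tmpdir: str, jar: str, tool_args: List[str]) -> List[str]:
    """Build a java invocation that uses tmpdir as java.io.tmpdir."""
    # JVM options go before -jar; everything after the jar is passed to the tool.
    return ["java", f"-Djava.io.tmpdir={tmpdir}", "-jar", jar, *tool_args]


# Placeholder values, for illustration only.
cmd = java_command("/path/to/tmp/abc", "picard.jar", ["SomeTool", "--help"])
print(" ".join(cmd))
```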

Alright, last one. Most failed jobs are due to either picard_umiaware or cadd_annotate. But there is also one failed job (also on comp 27), BALSAMIC.bettercrappie.cnvkit_segment_CNV_research.142.sh_6710424.err, with similar errors, so this might be another rule to look at:

RuntimeError: Subprocess command failed:
$ Rscript --no-restore --no-environ /var/tmp/tmpdqpd9oxy

b"Fatal error: cannot create 'R_TempDir'\n"
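The path /var/tmp/tmpdqpd9oxy looks like a name generated by Python's tempfile module, which is an assumption about how the script was written. If so, it fits the picture: tempfile tries TMPDIR, TEMP and TMP first, then /tmp, and only then /var/tmp. R likewise consults TMPDIR, TMP and TEMP for its session temp directory, so exporting TMPDIR in the rule would probably redirect both. A small sketch showing that Python picks up TMPDIR (the helper name tempdir_seen_by_python is hypothetical):

```python
import os
import tempfile


def tempdir_seen_by_python(tmpdir: str) -> str:
    """Return the directory tempfile would use when TMPDIR is set to tmpdir."""
    old_env = os.environ.get("TMPDIR")
    old_cache = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = tmpdir
        tempfile.tempdir = None  # drop the cached value so TMPDIR is re-read
        return tempfile.gettempdir()
    finally:
        # Restore the previous environment and cache.
        if old_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = old_env
        tempfile.tempdir = old_cache


# A fresh, writable directory standing in for a per-job tmpdir.
job_tmp = os.path.abspath(tempfile.mkdtemp())
print(tempdir_seen_by_python(job_tmp) == job_tmp)
```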