NBISweden / GenErode

GitHub repository for GenErode, a Snakemake pipeline for the analysis of whole-genome sequencing data from historical and modern samples to study patterns of genome erosion.
GNU General Public License v3.0

Implement shadow rules for some rules (e.g. repeatmodeler) #3

Open verku opened 2 years ago

verku commented 2 years ago

With shadow rules, each execution of a rule runs in its own isolated temporary directory. This “shadow” directory contains symlinks to files and directories in the current workdir. This is useful for running programs that generate lots of unused files which you don’t want to clean up manually in your Snakemake workflow. It can also be useful if you want to keep your workdir clean while the program executes, or to simplify your workflow by not having to worry about unique filenames for all outputs of all rules. shadow: "minimal" symlinks only the inputs to the rule. Once the rule executes successfully, the output file is moved, if necessary, to the real path indicated by output. Shadow directories are stored one per rule execution in .snakemake/shadow/ and are cleared on successful execution. Consider running with the --cleanup-shadow argument every now and then to remove any shadow directories left over from aborted jobs. The base shadow directory can be changed with the --shadow-prefix command line argument.
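To make the mechanism described above concrete, here is a minimal Python sketch of what a "minimal" shadow run does conceptually: symlink the declared inputs into a scratch directory, run the job there, then move only the declared outputs back. This is an illustration, not Snakemake's actual implementation. The names `run_in_shadow`, `demo_job` and `.shadow_sketch` are hypothetical, and unlike the behaviour described above, this sketch always removes the scratch directory, even when the job fails.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_shadow(workdir, inputs, outputs, job):
    """Run job(shadow_dir) in a scratch directory and keep only declared outputs.

    Hypothetical helper for illustration; paths in `inputs` and `outputs`
    are relative to `workdir`.
    """
    workdir = Path(workdir).resolve()
    # Hypothetical location; Snakemake itself uses .snakemake/shadow/
    shadow_root = workdir / ".shadow_sketch"
    shadow_root.mkdir(exist_ok=True)
    shadow = Path(tempfile.mkdtemp(dir=shadow_root))
    try:
        # "minimal" behaviour: symlink only the declared inputs
        for rel in inputs:
            link = shadow / rel
            link.parent.mkdir(parents=True, exist_ok=True)
            link.symlink_to(workdir / rel)
        # Make sure the job can write its declared outputs
        for rel in outputs:
            (shadow / rel).parent.mkdir(parents=True, exist_ok=True)
        job(shadow)
        # Move only the declared outputs back to their real paths
        for rel in outputs:
            dest = workdir / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(shadow / rel), str(dest))
    finally:
        # Everything else the job wrote is discarded with the scratch directory
        shutil.rmtree(shadow, ignore_errors=True)


def demo_job(shadow):
    """Stand-in for a tool that leaves junk files next to its real output."""
    data = (shadow / "genome.fa").read_text()
    (shadow / "scratch.tmp").write_text("intermediate junk")
    (shadow / "results" / "summary.txt").write_text(str(len(data)))
```

Calling `run_in_shadow(workdir, ["genome.fa"], ["results/summary.txt"], demo_job)` leaves `results/summary.txt` in the workdir while `scratch.tmp` never appears there. In an actual Snakefile none of this is needed: adding the `shadow: "minimal"` directive to the rule is enough.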

verku commented 1 year ago

Check whether this behaviour is desired on an HPC cluster.

verku commented 9 months ago

The mapping rule(s) in which BAM files are sorted create temporary files in the main Snakemake workflow directory, and these files remain if the job fails. This could also be solved by using shadow rules.