franciscozorrilla / metaGEM

:gem: An easy-to-use workflow for generating context specific genome-scale metabolic models and predicting metabolic interactions within microbial communities directly from metagenomic data
https://franciscozorrilla.github.io/metaGEM/
MIT License

feat: Check if files/folders exist in tmpdir before copying them within Snakefile rules #54

Closed franciscozorrilla closed 2 years ago

franciscozorrilla commented 3 years ago

This would make it easier to continue or restart failed or incomplete jobs.

To check if a file exists:

FILE=/path/to/file.txt
if test -f "$FILE"; then
    echo "$FILE exists."
fi

To check if a folder exists:

DIR=/path/to/folder
if [ -d "$DIR" ]; then
    echo "$DIR is a directory."
fi

For example, a large binReassemble job timed out after hitting the 24-hour limit on my cluster. To continue the job without re-recruiting reads and re-reassembling some genomes, I had to manually silence the cp command in the Snakefile rule. This could be handled automatically by a conditional statement like the ones shown above, e.g. if tmp/$job/$sample exists, then do not copy any new files into it.
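
A minimal sketch of what such a guard might look like inside a rule's shell block. The function name `copy_if_missing` and both paths are hypothetical placeholders, not taken from the actual Snakefile; the real rules may lay out their tmp folders differently.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of metaGEM): copy inputs into the job's
# tmp folder only when the destination does not exist yet.
copy_if_missing() {
    local src="$1"    # source file or folder (placeholder)
    local dest="$2"   # destination in tmpdir, e.g. tmp/$job/$sample (assumed layout)

    if [ -e "$dest" ]; then
        # Destination is left over from a previous run: leave it untouched.
        echo "Skipping copy: $dest already exists."
    else
        # First run: create the parent folder and copy the inputs.
        mkdir -p "$(dirname "$dest")"
        cp -r "$src" "$dest"
        echo "Copied $src to $dest."
    fi
}
```

It would be called in place of the plain cp, e.g. `copy_if_missing "$input_dir" "tmp/$job/$sample"`. One caveat with this approach: a folder that exists is not necessarily complete, so if the earlier job was killed in the middle of the copy itself, the check would skip files that never arrived. Checking for specific expected files with `-f` would be stricter.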