beelabhmc / flower_map

Automated flower species classification for generating honey-bee foraging maps
MIT License

Allow Snakemake to delete the stitched.files/ directory if the stitching step fails #12

Closed aryarm closed 4 years ago

aryarm commented 4 years ago

When a rule fails, Snakemake automatically deletes any residual output of that rule so that downstream steps don't accidentally use incomplete output. I believe Snakemake was doing this correctly for the Metashape project file (i.e., the stitched.psx file).

Unfortunately, Metashape also creates a directory called stitched.files in the same folder as stitched.psx. Presumably this directory contains information relevant to the project file. But because I hadn't listed the stitched.files directory as an output of the stitching rule, Snakemake was unaware of its existence and wouldn't have deleted it automatically.

Consider the following scenario:

  1. We run Snakemake on a sample and it fails midway through the stitch.py script. As a result, the stitched.psx file gets deleted, but the stitched.files directory persists.
  2. We rerun Snakemake on the same sample to reproduce the error. The stitch.py script recognizes the existence of the stitched.files directory and tries to start back up where it left off.
  3. Something goes wrong while it is trying to start back up where it left off, so the run fails again, but for a different reason than before.

Of course, I don't know whether step 3 above would actually happen, but it's certainly plausible. What we want is for the stitch.py script to run everything from beginning to end automatically, just as it did in step 1 of the scenario. That's why this PR adds the stitched.files directory as a listed output of the stitching rule.
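
To make the scenario concrete, here is a minimal Python sketch of the cleanup behavior described above. It is not Snakemake's actual implementation and it does not call Metashape. The helper names and the chunk.dat file are hypothetical. The only assumption is that cleanup after a failure removes exactly the outputs a rule declares.

```python
import shutil
import tempfile
from pathlib import Path


def remove_declared_outputs(outputs):
    """Delete every declared output, whether it is a file or a directory."""
    for path in outputs:
        if path.is_dir():
            shutil.rmtree(path)
        elif path.exists():
            path.unlink()


def simulate_failed_stitch(workdir, declared_outputs):
    """Mimic a stitching run that writes partial results and then fails.

    Returns the names of whatever is left in workdir after cleanup.
    """
    project = workdir / "stitched.psx"
    sidecar = workdir / "stitched.files"
    project.write_text("partial project")
    sidecar.mkdir()
    # Hypothetical file standing in for whatever Metashape stores here.
    (sidecar / "chunk.dat").write_text("partial data")
    # The run "fails" at this point, so only declared outputs are cleaned up.
    remove_declared_outputs([workdir / name for name in declared_outputs])
    return sorted(p.name for p in workdir.iterdir())


# Before this PR: only the project file is a declared output.
with tempfile.TemporaryDirectory() as tmp:
    before = simulate_failed_stitch(Path(tmp), ["stitched.psx"])

# After this PR: the directory is declared as an output too.
with tempfile.TemporaryDirectory() as tmp:
    after = simulate_failed_stitch(Path(tmp), ["stitched.psx", "stitched.files"])

print(before)  # ['stitched.files']
print(after)   # []
```

In the first case the stale stitched.files directory survives the failure, which is what would let a rerun try to resume. In the second case nothing is left over, so a rerun starts from scratch. In an actual Snakefile, an output that is a directory is declared by wrapping its path in the directory() flag.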