reanahub / reana-workflow-engine-snakemake

REANA Workflow Engine Snakemake
MIT License

improper progress indication for cached steps #62

Closed giuseppe-steduto closed 10 months ago

giuseppe-steduto commented 10 months ago

Snakemake has extensive out-of-the-box in-workflow caching capabilities: it saves the results of intermediate steps in a cache directory, so that the same workflow can reuse them if the inputs, code, and parameters have not changed.

In REANA, Snakemake's caching capabilities are natively supported if the workflow is re-run in the same workspace (e.g. with the reana-client restart command). If Snakemake can take the results of one or more steps from its cache, REANA won't execute those steps. This is a nice and desirable feature that can save researchers a lot of time when polishing their workflows.

However, there is a small problem when reporting the progress of a workflow whose results were partially or fully restored from the cache: REANA does not count the cached jobs as executed, even though the workflow is correctly marked as finished. This leads to inconsistencies in how the workflow progress is displayed, as seen in two screenshots (not preserved here) and in the client output, where the finished run reports its progress as -/-:

❯ reana-client list --include-progress
NAME               RUN_NUMBER   CREATED               STARTED               ENDED                 STATUS     PROGRESS
roofit-snakemake   2.1          2023-08-29T12:55:39   2023-08-29T12:55:50   2023-08-29T12:56:01   finished   -/-

We should fix this either by marking the cached steps in the same way as the finished ones, or by somehow reporting that some or all of the steps were restored from the cache.
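
To make the two options concrete, here is a minimal Python sketch of how a progress summary could treat cached steps. It is not taken from the REANA code base: `StepState`, `Progress` and `summarize` are hypothetical names invented for illustration, and the `(N cached)` suffix is only one possible way of reporting restored steps.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class StepState(Enum):
    """Hypothetical per-step states; CACHED means 'restored from cache'."""
    PENDING = "pending"
    FINISHED = "finished"
    CACHED = "cached"
    FAILED = "failed"


@dataclass
class Progress:
    total: int
    finished: int
    cached: int

    def render(self) -> str:
        # Option 1: count cached steps together with the finished ones.
        done = self.finished + self.cached
        text = f"{done}/{self.total}"
        # Option 2: additionally report how many steps came from the cache.
        if self.cached:
            text += f" ({self.cached} cached)"
        return text


def summarize(step_states: List[StepState]) -> Progress:
    """Count finished and cached steps out of all known steps."""
    return Progress(
        total=len(step_states),
        finished=sum(s is StepState.FINISHED for s in step_states),
        cached=sum(s is StepState.CACHED for s in step_states),
    )
```

With this sketch, a run whose three steps were all restored from the cache would render as `3/3 (3 cached)` instead of `-/-`.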