vatlab / sos

SoS workflow system for daily data analysis
http://vatlab.github.io/sos-docs
BSD 3-Clause "New" or "Revised" License

Removing (corrupted) output files before the execution of steps. #1498

Open BoPeng opened 2 years ago

BoPeng commented 2 years ago

This looks like a reasonable feature, but there are a few problems.

First, as described in #1496, nested workflows can cause problems:

[a]
output: something
sos_run('b')

[b]
output: something

When this happens, step b is actually executed before a, so removing something before executing a would delete the output that b has just generated.
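The ordering problem can be reproduced outside SoS with a small simulation. This is only a sketch: `run_nested_demo` and `demo` are hypothetical names, not part of SoS, and the file name `something` is taken from the example above.

```python
import os
import tempfile


def run_nested_demo(workdir):
    """Simulate the ordering issue; return True if the shared output survives."""
    target = os.path.join(workdir, "something")

    # Step b (the nested workflow) runs first and writes the shared output.
    with open(target, "w") as f:
        f.write("written by b\n")

    # Naive cleanup: remove a's declared output before a's statements run.
    if os.path.exists(target):
        os.remove(target)

    # Step a writes nothing itself; it relies on the file produced by b.
    return os.path.exists(target)


def demo():
    # Run the simulation in a throwaway directory.
    with tempfile.TemporaryDirectory() as workdir:
        return run_nested_demo(workdir)
```

Calling `demo()` shows that the file written by b no longer exists once the cleanup for a has run.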

Second, it is unclear when to remove the files, because existing output files may be valid and must first be checked against their signatures. We would need to remove output files right before the statements are executed (after signature validation), and this would have to be handled in the step executor, the substep executor, and the task executor. This is doable but certainly not an easy task.
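The intended order (validate first, remove only if validation fails) might look roughly like the sketch below. It does not use SoS internals: `file_digest`, `prepare_outputs`, and the `recorded` dictionary are hypothetical stand-ins for the real signature mechanism, which is assumed here to be a plain SHA-256 of the file content.

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """Return the SHA-256 hex digest of a file's content."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def prepare_outputs(outputs, recorded):
    """Return True if the step can be skipped; otherwise remove stale outputs."""
    # Outputs are valid only if every file exists and matches its recorded digest.
    valid = bool(outputs) and all(
        os.path.exists(p) and recorded.get(p) == file_digest(p) for p in outputs
    )
    if valid:
        return True
    # Validation failed: remove leftovers right before the statements would run.
    for p in outputs:
        if os.path.exists(p):
            os.remove(p)
    return False


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "something")
        with open(path, "w") as f:
            f.write("good output\n")
        recorded = {path: file_digest(path)}

        # First check: the file matches its signature and is kept.
        skipped_first = prepare_outputs([path], recorded)
        kept = os.path.exists(path)

        # Corrupt the file, then check again: it should be removed.
        with open(path, "w") as f:
            f.write("corrupted\n")
        skipped_second = prepare_outputs([path], recorded)
        removed = not os.path.exists(path)
        return skipped_first, kept, skipped_second, removed
```

In this sketch a valid file is left alone and the step is skipped, while a corrupted one is deleted just before re-execution. The same check would have to be called from each executor mentioned above.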