wustl-oncology / cloud-workflows

Infrastructure and tooling required to get genomic workflows running in the cloud

cloudize-workflow.py - Script should start from last-upload if it fails and re-runs #3

Open johnmaruska opened 3 years ago

johnmaruska commented 3 years ago

If the script fails, the next run starts from the beginning and re-uploads every file that had already been uploaded. The extra resource consumption is probably not a concern, but the time lost when a lengthy upload fails partway through may be.

The script should have some way of handling restarts: either track progress and skip files that were already uploaded, or skip files that already exist in the bucket. Either approach could take a flag that disables the skipping and forces a full upload.
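
A rough sketch of the progress-tracking option, not tied to how cloudize-workflow.py is actually structured. All names here (`load_done`, `pending_uploads`, `upload_all`, the JSON manifest file, the `force` flag) are hypothetical, and `upload` stands in for whatever function really sends a file to the bucket.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List, Set


def load_done(manifest: Path) -> Set[str]:
    """Return the names recorded as uploaded by earlier runs (hypothetical manifest)."""
    if manifest.exists():
        return set(json.loads(manifest.read_text()))
    return set()


def pending_uploads(names: Iterable[str], done: Set[str], force: bool = False) -> List[str]:
    """Names that still need uploading; force=True skips nothing."""
    return [name for name in names if force or name not in done]


def upload_all(
    names: Iterable[str],
    upload: Callable[[str], None],  # assumption: raises on failure
    manifest: Path,
    force: bool = False,
) -> List[str]:
    """Upload pending files, persisting progress after each success."""
    done = set() if force else load_done(manifest)
    sent = []
    for name in pending_uploads(names, done, force):
        upload(name)  # if this raises, earlier progress is already on disk
        done.add(name)
        manifest.write_text(json.dumps(sorted(done)))
        sent.append(name)
    return sent
```

For the second option, the `done` set could presumably be built from a listing of the bucket instead of a local manifest, with `pending_uploads` left unchanged. The trade-off is that a manifest only knows about uploads made from this machine, while a bucket check costs extra requests but reflects what is actually stored.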