microsoft / ga4gh-tes

C# implementation of the GA4GH TES API; provides distributed batch task execution on Microsoft Azure
MIT License

Add TES idempotency feature #707

Open MattMcL4475 opened 3 months ago

MattMcL4475 commented 3 months ago
  1. Has the exact same set of Inputs (same Urls)
  2. Has the same exact values for Executors
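
One way to test the two conditions above is to derive a single key from a task's input URLs and executor values, then compare keys between tasks. The following is only a minimal sketch under stated assumptions: `SketchInput`, `SketchExecutor`, and `IdempotencyKey` are hypothetical stand-ins rather than the repository's actual types, hashing is an illustrative choice that the issue does not prescribe, and .NET 6 or later is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Hypothetical, simplified stand-ins for the TES task model (not the repository's types).
public sealed record SketchInput(string Url);
public sealed record SketchExecutor(string Image, IReadOnlyList<string> Command);

public static class IdempotencyKey
{
    // Returns a hex-encoded SHA-256 key over the input URLs and the executor values.
    public static string Compute(IEnumerable<SketchInput> inputs, IEnumerable<SketchExecutor> executors)
    {
        // Inputs are compared as a set: de-duplicate and sort so that ordering does not matter.
        var urls = inputs
            .Select(i => i.Url)
            .Distinct(StringComparer.Ordinal)
            .OrderBy(u => u, StringComparer.Ordinal)
            .ToArray();

        // Executors run in sequence, so their order is preserved.
        var execs = executors
            .Select(e => new { e.Image, Command = e.Command.ToArray() })
            .ToArray();

        var json = JsonSerializer.Serialize(new { urls, execs });
        return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(json)));
    }
}
```

Two tasks whose keys match would then be treated as the same task. A real implementation would need to cover every executor field that affects the result, not just the two shown here.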

If EnabledWithOutputCopying, then TES shall use Azure server-side blob copy to copy the previous task's outputs to the current task's specified output location(s). This work item shall be added to an in-memory queue, and the task state shall be set to RUNNING before the copy starts. The copying must not block the main task status checking loop, so as not to slow down overall task throughput (tasks can have thousands of files to copy, and even though the copy itself is done server side, calling that API 1000 times will take a while). There shall be two separate long-running C# HostedServices (created in startup.cs). One shall check whether a blob copy of each file is already in progress and, if not, start the copy. The other shall periodically check whether all of a task's copies are complete and, if so, set the task state to COMPLETE. If TES crashes, it should be able to pick up where it left off by looping through all RUNNING tasks and resuming each one whose copy is still in progress.
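
The queue and the two background loops could be sketched roughly as follows. This is an illustration under assumptions, not the proposed implementation: `IBlobCopier`, `SketchCopyStatus`, `CopyItem`, and `OutputCopyCoordinator` are hypothetical names, and the Azure calls are hidden behind the interface so that the sketch depends only on the .NET base class library (.NET 6 or later assumed). In a real service, the two `Run...Async` loops would presumably each be hosted in a `BackgroundService`, and the interface would wrap the Azure Storage SDK.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public enum SketchCopyStatus { NotStarted, Pending, Success, Failed }
public sealed record CopyItem(Uri Source, Uri Destination);

// Hypothetical abstraction over Azure server-side blob copy.
public interface IBlobCopier
{
    Task<SketchCopyStatus> GetStatusAsync(Uri destination, CancellationToken ct);
    Task StartCopyAsync(Uri source, Uri destination, CancellationToken ct);
}

public sealed class OutputCopyCoordinator
{
    private readonly Channel<CopyItem> queue = Channel.CreateUnbounded<CopyItem>();
    private readonly ConcurrentDictionary<string, IReadOnlyList<Uri>> pending = new();
    private readonly IBlobCopier copier;
    private readonly Action<string, string> setTaskState; // (taskId, state)

    public OutputCopyCoordinator(IBlobCopier copier, Action<string, string> setTaskState)
    {
        this.copier = copier;
        this.setTaskState = setTaskState;
    }

    // Called from the main status loop. Only queues work, so it returns immediately.
    public void Enqueue(string taskId, IEnumerable<CopyItem> items)
    {
        var list = items.ToList();
        setTaskState(taskId, "RUNNING");
        // Register the full destination list before queuing any copy.
        pending[taskId] = list.Select(i => i.Destination).ToList();
        foreach (var item in list)
        {
            queue.Writer.TryWrite(item); // always succeeds on an unbounded channel
        }
    }

    // Loop 1: start each queued copy unless one is already in progress or done.
    public async Task RunStarterAsync(CancellationToken ct)
    {
        await foreach (var item in queue.Reader.ReadAllAsync(ct))
        {
            if (await copier.GetStatusAsync(item.Destination, ct) == SketchCopyStatus.NotStarted)
            {
                await copier.StartCopyAsync(item.Source, item.Destination, ct);
            }
        }
    }

    // Loop 2: periodically mark tasks COMPLETE once all their copies have succeeded.
    public async Task RunCompletionCheckerAsync(TimeSpan interval, CancellationToken ct)
    {
        while (!ct.IsCancellationRequested)
        {
            await CheckCompletionOnceAsync(ct);
            await Task.Delay(interval, ct);
        }
    }

    // One pass of the completion check. Handling of failed copies is omitted here.
    public async Task CheckCompletionOnceAsync(CancellationToken ct)
    {
        foreach (var (taskId, destinations) in pending)
        {
            var allDone = true;
            foreach (var destination in destinations)
            {
                if (await copier.GetStatusAsync(destination, ct) != SketchCopyStatus.Success)
                {
                    allDone = false;
                    break;
                }
            }
            if (allDone && pending.TryRemove(taskId, out _))
            {
                setTaskState(taskId, "COMPLETE");
            }
        }
    }
}
```

Under this design, crash recovery amounts to calling `Enqueue` again for every task found in the RUNNING state at startup: because the starter loop checks the copy status first, copies that are already pending or finished are not restarted.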