chanzuckerberg / miniwdl

Workflow Description Language developer tools & local runner
MIT License

Keep going with independent tasks #570

Open nh13 opened 2 years ago

nh13 commented 2 years ago

When a WDL task fails (run via AGC), I've observed that all other tasks are killed. I'd like miniwdl to execute as many tasks as possible until there are no more tasks that are independent of the failure(s). See snakemake's --keep-going option and Nextflow's errorStrategy. This would allow me to complete as many tasks as possible, so that the next time I re-run the workflow while caching tasks, I'll be closer both to completion as well as having the task that failed start sooner (to see if it works this time).

mlin commented 2 years ago

There's a config option [scheduler] fail_fast = false / env MINIWDL__SCHEDULER__FAIL_FAST=false that should do this. I don't think we have a test case for it with AWS specifically, but that scheduler logic is "above" the container backend. Let me know if it doesn't work.
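For reference, the two forms mentioned above can be put side by side in a minimal Python sketch. The helper names `env_var_name` and `cfg_fragment` are hypothetical, and the `MINIWDL__<SECTION>__<KEY>` pattern is an assumption generalized from the single example in the comment above, not a documented rule quoted here.

```python
import configparser
import io


def env_var_name(section: str, key: str) -> str:
    # Assumed naming pattern, generalized from MINIWDL__SCHEDULER__FAIL_FAST.
    return f"MINIWDL__{section.upper()}__{key.upper()}"


def cfg_fragment(section: str, key: str, value: str) -> str:
    # Render a minimal INI-style fragment suitable for a custom .cfg file.
    parser = configparser.ConfigParser(interpolation=None)
    parser[section] = {key: value}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


# Environment-variable form of the override.
print(env_var_name("scheduler", "fail_fast") + "=false")
# Config-file form of the same override.
print(cfg_fragment("scheduler", "fail_fast", "false"))
```

Either form should express the same setting; which one is easier to apply depends on how the runner is launched.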

nh13 commented 2 years ago

Where’s a good place to contribute documentation for my future self about these config options?

mlin commented 2 years ago

That default.cfg is commented extensively, but the docs on configuration do a really mediocre job of linking out to it -- that's probably the low-hanging fruit.

mlin commented 2 years ago

Another likely problem is that AGC doesn't yet make it super convenient to set the more-advanced config options (those that don't have dedicated command-line arguments). https://github.com/aws/amazon-genomics-cli/pull/420 would help with that.

Until that's available I think you'd have to

  1. copy the AWS-specific cfg file which gets baked into the docker image AGC uses for miniwdl
  2. add desired overrides to your copy of the file
  3. include it in the workflow source directory (so that it gets bundled up into the zip file that AGC sends into the context)
  4. set {"engineOptions": "--cfg path/to/custom.cfg"} in the MANIFEST.json
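Steps 2 to 4 above could be scripted roughly as follows. This is only a sketch under several assumptions: the function name `add_cfg_override`, the file name `custom.cfg`, and the demo paths are hypothetical; the cfg file is assumed to be parseable by Python's `configparser`; and how AGC resolves the `--cfg` path relative to the bundled zip is not confirmed here. Note that rewriting the file through `configparser` drops any comments in the copied cfg, and that an existing `engineOptions` value is overwritten.

```python
import configparser
import json
import tempfile
from pathlib import Path


def add_cfg_override(base_cfg: Path, workflow_dir: Path, cfg_name: str = "custom.cfg") -> Path:
    # Step 2: load the copied cfg and add the desired override.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(base_cfg.read_text())
    if not parser.has_section("scheduler"):
        parser.add_section("scheduler")
    parser.set("scheduler", "fail_fast", "false")

    # Step 3: write the modified copy into the workflow source directory.
    out = workflow_dir / cfg_name
    with out.open("w") as fh:
        parser.write(fh)

    # Step 4: point engineOptions at it (replaces any existing value).
    manifest_path = workflow_dir / "MANIFEST.json"
    manifest = json.loads(manifest_path.read_text())
    manifest["engineOptions"] = f"--cfg {cfg_name}"
    manifest_path.write_text(json.dumps(manifest, indent=2) + "\n")
    return out


# Demo in a throwaway directory with stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    base = tmp_path / "base.cfg"  # stand-in for the copied AWS-specific cfg
    base.write_text("[scheduler]\nfail_fast = true\n")
    wf_dir = tmp_path / "workflow"
    wf_dir.mkdir()
    (wf_dir / "MANIFEST.json").write_text("{}")
    written = add_cfg_override(base, wf_dir)
    result_cfg = written.read_text()
    result_manifest = json.loads((wf_dir / "MANIFEST.json").read_text())
```

Step 1 (obtaining the cfg file from the docker image) is left manual, since where that file lives in the image is not stated in this thread.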

nh13 commented 2 years ago

Thanks @mlin this is super helpful and thank you for your patience. I’m really enjoying using miniwdl so hopefully these questions are not seen as a criticism but a desire to understand, and contribute back in some small part.