Closed matthdsm closed 2 months ago
@matthdsm, would this trigger a Nomad-native retry, or is it expecting Nextflow to trigger the retry?
As of now, we have a model of 1 Nextflow task -> 1 Nomad job -> group -> task, so an error at the Nomad job level reflects an error at the Nextflow level, and the overall behaviour aligns with what Nextflow would expect.
How Nomad-native retries would interact with the overall setup still needs to be tested.
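For reference, the 1 task -> 1 job -> group -> task model described above can be sketched roughly as follows. This is only an illustration, not the plugin's actual code: `build_job_payload` and its arguments are hypothetical names, the Docker driver is an assumption, and the dictionary keys follow my understanding of Nomad's JSON job format.

```python
def build_job_payload(task_name, image, command, args=()):
    """Map one Nextflow task to one Nomad job with a single group and task.

    Hypothetical helper for illustration; the plugin itself may build
    its jobs differently.
    """
    task = {
        "Name": task_name,
        "Driver": "docker",  # assumption: the task runs in a container
        "Config": {"image": image, "command": command, "args": list(args)},
    }
    # Exactly one group holding exactly one task, so a failure of the
    # job corresponds to a failure of the single Nextflow task.
    group = {"Name": task_name + "-group", "Count": 1, "Tasks": [task]}
    return {
        "Job": {
            "ID": task_name,
            "Name": task_name,
            "Type": "batch",
            "TaskGroups": [group],
        }
    }


payload = build_job_payload("nf-task-1", "ubuntu:22.04", "bash", ["-c", "echo hello"])
```

Because there is only one group and one task per job, any Nomad-native retry configured at the group level would apply to that single task before the failure surfaces at the job level.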
If I'm not wrong, before we had these constants we were using the default value (3), and Nextflow waited correctly for the completion of a failed job, so it may be a good idea to use 1 as the default.
It seems that 3 is the default for these attempts: https://github.com/nextflow-io/nf-nomad/pull/82#issuecomment-2315371597
@jagedn, do you think it's worth exposing the closure and including other fields like delay, interval, etc.?
Okay, merging this and tagging this release as 0.2.0-edge3 to release a build.
@jagedn, do you think it's worth exposing the closure and including other fields like delay, interval, etc.?
Maybe we can implement all of them when required.
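If the extra fields are exposed later, the resulting policy could look something like the sketch below. It is hypothetical: `restart_policy` and its parameter names are made up for illustration, the default of 1 attempt is the value suggested earlier in this thread rather than a confirmed setting, and the keys and nanosecond durations follow my understanding of the restart policy in Nomad's JSON job format.

```python
NANOS_PER_SECOND = 1_000_000_000


def restart_policy(attempts=1, delay_seconds=None, interval_seconds=None, mode=None):
    """Build a restart-policy dictionary, adding optional fields only when set.

    Hypothetical helper; the default of 1 attempt mirrors the suggestion
    made in this thread and is not a confirmed plugin default.
    """
    if attempts < 0:
        raise ValueError("attempts must be zero or greater")
    policy = {"Attempts": attempts}
    # Assumption: durations are expressed in nanoseconds in the JSON form.
    if delay_seconds is not None:
        policy["Delay"] = int(delay_seconds * NANOS_PER_SECOND)
    if interval_seconds is not None:
        policy["Interval"] = int(interval_seconds * NANOS_PER_SECOND)
    if mode is not None:
        policy["Mode"] = mode
    return policy


minimal = restart_policy()
full = restart_policy(3, delay_seconds=15, interval_seconds=1800, mode="fail")
```

Leaving unset fields out of the dictionary means the scheduler's own defaults apply for them, which matches the idea of only implementing the extra fields when they are actually needed.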
This should fix a concurrency issue with the CSI driver; cf. https://github.com/ceph/ceph-csi/issues/3511 and https://github.com/hashicorp/nomad/issues/15197