Open mio4kon opened 10 months ago
I would also like argo-cli to support a Skip operation, which would allow skipping specific failed nodes and modifying their outputs.
Retrying specific nodes is already possible, see #12005 for an example
Please keep feature requests on-topic to 1 per issue.
The behavior you're asking for would not be supported though, nodes are only considered skipped if they have a conditional that skips them. A "skip" operation also doesn't exist in other DAG orchestrators as far as I know either
There is a slight difference compared to what was mentioned earlier.
In #12005, the retry of specific nodes only supports specifying successful nodes; it is not possible to specify failed nodes. --node-field-selector and --restart-successful must be used together.
@jswxstw The issue can be partially resolved by using [ A || A.failed ], but it is overly automated.
@agilgur5 In some scenarios, after a failure, manual confirmation may be needed before proceeding with the subsequent steps. However, it is currently unclear how to implement this workflow.
You have set continueOn Failed on the B and C nodes, so you do not want to retry these steps when retrying manually. Am I understanding correctly?
No, what I replied to is not the issue I raised. Instead, it's about the suggestion you mentioned to add a button for skipping failures. I believe the suggestion you made is also very useful for me, as we encounter similar scenarios on our end.
#12005 the retry of specific nodes here only supports specifying successful nodes, and it's not possible to specify failed nodes.
The default behavior of retry is to only retry failed nodes.
--node-field-selector and --restart-successful must be used together.
If you need to use them together, you can.
There is a slight difference compared to what was mentioned earlier.
Sorry it's not clear to me what the difference is. Retrying specific nodes is possible (whether succeeded or failed). Is there something else missing that you would like?
@jswxstw The issue can be partially resolved by using [ A || A.failed ], but it is overly automated.
Yes, this would be one way of implementing it.
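For illustration only, a minimal sketch of what the [ A || A.failed ] approach could look like as a DAG. The workflow, template, and file names here are hypothetical, and the capitalized .Failed spelling is what the Argo depends syntax uses as far as I know:

```shell
# Write a hypothetical two-task DAG to a local file (all names are placeholders).
cat > continue-after-failure.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: continue-after-failure-
spec:
  entrypoint: main
  templates:
    - name: main
      dag:
        tasks:
          - name: A
            template: may-fail
          - name: B
            # B is scheduled once A has finished, whether A succeeded or failed.
            depends: "A || A.Failed"
            template: report
    - name: may-fail
      container:
        image: alpine:3.19
        command: [sh, -c, "exit 1"]
    - name: report
      container:
        image: alpine:3.19
        command: [echo, "B runs after A succeeded or failed"]
EOF
# To try it against a cluster: argo submit continue-after-failure.yaml
```

Because the condition lives in the spec, B proceeds automatically, which is the "overly automated" behavior mentioned above.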
about the suggestion you mentioned to add a button for skipping failures
Skipping a step results in a semantically different Workflow.
The currently available operations do not change the intent or conditionals of the Workflow, and they should not. Operations only modify the Workflow's status; they do not modify the Workflow spec itself, and should not do so.
So a "Skip" operation as proposed violates semantic intent and would not be accepted as a feature as such.
@agilgur5 In some scenarios, after a failure, manual confirmation may be needed to proceed with the subsequent steps. However, currently, it is unclear how to implement this workflow.
That sounds like a 3rd, separate question from the other two; I'm not sure how this is related to "skipping". There's really too many topics here...
You can use a suspend template to require manual confirmation, so that too is already possible.
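A minimal sketch of the suspend approach, assuming a simple steps-based Workflow. The workflow, step, and file names are hypothetical:

```shell
# Write a hypothetical Workflow that pauses for manual confirmation (names are placeholders).
cat > manual-confirm.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: manual-confirm-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: wait-for-approval
            template: approve
        - - name: proceed
            template: work
    - name: approve
      # An empty suspend block pauses the Workflow until it is resumed manually.
      suspend: {}
    - name: work
      container:
        image: alpine:3.19
        command: [echo, "continuing after approval"]
EOF
# To try it against a cluster:
#   argo submit manual-confirm.yaml
#   argo resume <workflow-name>   # manual confirmation to continue
```

The Workflow stays suspended at wait-for-approval until someone resumes it, so the confirmation is part of the spec rather than a change to it.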
The pipeline in the image above does not currently allow for selective retries of specific nodes, such as the BB node.
Using --node-field-selector alone does not actually take effect; it still retries all failed nodes.
@agilgur5 Please help me review this PR 🙏 https://github.com/argoproj/argo-workflows/pull/12553
Using --node-field-selector alone does not actually take effect; it still retries all failed nodes.
Yea that sounds like a bug, afaik, --node-field-selector is supposed to be able to be used without --restart-successful. Strange that it hasn't been noticed earlier though, I wonder if there was a regression 🤔
This issue was filed as a feature request though, a reproducible Workflow and set of commands / instructions would be helpful to test with.
@agilgur5 Please help me review this PR 🙏 #12553
I'll take a look, thanks for checking the code
Hi, is there still a problem with the PR corresponding to this issue? Can you reopen this issue? I think it's still a bit of a hassle to not be able to retry a single failed node 😭
Hi, is there still a problem with the PR corresponding to this issue?
Yes. You can use the "request a review" function on GitHub when you've made iterations.
Also, please do not expect immediate responses from open source maintainers, who are largely volunteers.
Can you reopen this issue? I think it's still a bit of a hassle to not be able to retry a single failed node😭
Sure, but as I wrote above, the issue is written as a feature request, not a reproducible bug report, which is confusing and missing information as a result.
Hi, is there any follow-up plan for fixing this issue? @agilgur5
The retry rewrite is here: https://github.com/argoproj/argo-workflows/pull/13734
Suggested Enhancement
Allow users to selectively retry specific failed nodes instead of retrying all failed nodes at once.
Use Cases
I'm using Argo Workflow, and at times, I would like the option to retry a specific failed node, instead of retrying all failed nodes (similar to GitLab CI's capability). Even if the overall pipeline still ends up failing, there are specific tasks that I'd prefer to retry without consuming excessive resources retrying other nodes that may inevitably fail. I believe providing users with this level of flexibility is important.
For example, in the following pipeline, I might prefer to only rerun the failed nodes of BB, rather than retrying both B and C nodes.
--- updates by agilgur5 below---
--node-field-selector already exists and can be used to specify nodes
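As a rough sketch of the two usages discussed in this thread: my-wf and BB are hypothetical names, and displayName is assumed here to be an accepted selector field. The commands are only built and printed, so they can be inspected before being run against a cluster.

```shell
# Hypothetical workflow and node names; adjust to your own Workflow.
WORKFLOW="my-wf"
NODE="BB"

# Intended form for retrying one specific failed node (selector used on its own).
RETRY_FAILED_NODE="argo retry ${WORKFLOW} --node-field-selector displayName=${NODE}"

# Form for resetting a specific node that already succeeded.
RETRY_SUCCEEDED_NODE="argo retry ${WORKFLOW} --restart-successful --node-field-selector displayName=${NODE}"

echo "${RETRY_FAILED_NODE}"
echo "${RETRY_SUCCEEDED_NODE}"
```

Note that comments in this thread report the first form still retrying all failed nodes, so its behavior may depend on the Argo version in use.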