Open terrytangyuan opened 4 years ago
Opening this issue to discuss the idea before diving into the implementation. I'd like to take a stab at implementing this if others agree this is a good enhancement.
Interesting suggestion, but I see a couple of problems with the current proposal.
First, the implementation of DAG execution logic is heavily dependent on having clearly-defined dependencies between tasks. This proposal β by its own definition β would not have well-defined dependencies between tasks which would mean that at the very least a large refactor of this core code would be needed, which is something we should approach with caution.
Second, any implementation would have to carefully consider the impotency requirement of the operator. Again, the lack of a clearly defined dependency graph would mean that the execution order of the graph would be determined arbitrarily at runtime. This stochasticity could introduce ambiguity to the DAG execution logic and would require a well-defined decision mechanism to solve. I'm not saying it's not possible, but it is a significant hurdle that needs to be cleared.
Some comments to your points:
There are situations where the tasks are dynamically generated and task names cannot be easily obtained.
This is mostly a non-issue and is largely by design. While it is true that some task names are dynamically generated β mainly expansion tasks when using withParams
, withItems
, etc β we currently do not support users specifying dependency on those tasks, but only to their parent (i.e. generator) tasks. I don't see why we should for the same problem that you tried to highlight and the points I made above: we need to have a well-defined DAG and we cannot have one if we don't know the names of the tasks before execution.
A better solution to this problem would be to introduce a new Argo Variable that keeps track of the number of succeeded tasks in a DAG. It could be then used to make execution control within when
expressions in subsequent tasks or templates.
does this related to trigger rule in the AirFlow, I know the trigger would be easy to used to decide the runtime of DAG.
@merlintang Yes that is part of my original motivation.
@simster7 Thank you for your thoughtful response! I agree that if we take the large refactoring and the impotency requirement into account, this may not be a good approach to pursue.
Controlling this in when
expressions sounds like a good idea. Do you think a separate proposal is required or should we continue our discussion here if we want to take that approach? I am not familiar with that part of the code yet but will look into more in detail. Any pointers would be helpful as well.
@terrytangyuan We can continue our discussion here. Here is how I envision the spec:
- name: echo-thrice
steps:
- - name: expand-task
template: expand-task
arguments:
parameters:
- name: message
value: "{{item}}"
withItems: [1,2,3,4,5,6,7,8]
- - name: may-run
template: may-run
when: "{{steps.expand-task.subTasks.Suceeded}} > 3"
A better name and syntax for {{steps.expand-task.subTasks.Suceeded}}
is needed. Any suggestions?
You can look at https://github.com/argoproj/argo/pull/2911 for an example on how to make this change.
Looking forward to your PR
TBH, I think adding this feature to Argo's core language would overcomplicate it even more and make it more fragile.
Maybe each of your tasks should put record it's completion in some storage/db and the service that watches/manages that storage/db can trigger the second pipeline stage once the dependencies are satisfied.
I think adding this feature to Argo's core language would overcomplicate it even more and make it more fragile.
By "this feature" do you mean the original proposal (i.e. depends: "Succeeded > 3"
) or the Argo Variable approach (i.e. when: "{{steps.expand-task.subTasks.Suceeded}} > 3"
)?
If you mean the Argo Variable approach, could you please share your thoughts on why you think it would overcomplicate it/make it fragile?
Thanks @Ark-kun
@Ark-kun Could you elaborate a bit? IMO this should be something offered and handled by Argo. Putting the logic in DB/storage is not non-trivial in large systems and will likely introduce bugs/overheads.
@terrytangyuan Are you still working on this?
@simster7 Thanks for checking. One concern with your suggested approach is that we could potentially introduce a new set of variables for various phases since we are not just dealing with successes.
I would like to gauge interest in this proposal. I have posted my general question here looking for this. https://github.com/argoproj/argo-workflows/discussions/7722#discussioncomment-2098901
Hello Jas and team, I agree with Jas (#7722). We will be building dags with 700+ nodes completely as dags, where fan-in and fan-out occurs multiple time in the dag. using ref count approach could definitely solve that.
Hi, @terrytangyuan, I'd like to apply for this issue for the gsoc. I have sent you an email and look forward to your reply :)
Hi, @terrytangyuan, I'd like to apply for this issue for the gsoc. I have sent you an email and look forward to your reply :)
Thanks for reaching out but please read this document carefully: https://github.com/argoproj/argo-workflows/blob/master/docs/mentoring.md#how-to-participate-google-summer-of-code
Hi, @terrytangyuan, I'd like to apply for this issue for the gsoc. I have sent you an email and look forward to your reply :)
Thanks for reaching out but please read this document carefully: https://github.com/argoproj/argo-workflows/blob/master/docs/mentoring.md#how-to-participate-google-summer-of-code
Thanks! I will read it.
Summary
Support count-based enhanced depends logic in addition to existing depends logic based on dependent tasks and their statuses as outlined in the Enhanced Depends Logic document that has been introduced after v2.9.
Motivation
Currently, there exists a new called
depends
that allows users to specify dependent tasks, their statuses, as well as any complex boolean logic. The field is astring
field and the syntax is expression-like with operands having the form<task-name>.<task-result>
. Examples includetask-1.Suceeded
,task-2.Failed
, andtask-3.Damenoed
. More details on this existing functionality can be found in the Enhanced Depends Logic document.However, this requires users to specify task names explicitly which may be difficult or unnecessary to obtain, for example:
For example, some users might want to trigger the downstream task when N out of M dependent tasks have succeeded statuses. They may use Argo for use cases where specific statuses of the dependent tasks are not important.
Proposal
Extend the current
depends
field to be able to take complex comparison logic with operands having the form<task-result>
without task names.For example, the following
depends
string indicates that we only execute downstream tasks when at least 3 tasks succeeded and no task failed:Another example is to have a built-in
N
in the implementation that represents the total number of dependent tasks so users do not need to know the total number of dependent tasks beforehand.A simple case is only executing downstream tasks when all dependent tasks succeeded:
Users should be able to do simple calculations like the following:
The full list of available task results and is as follows:
Succeeded
Failed
Errored
Skipped
Completed
Daemoned
Full boolean logic is available:
&&
||
!
The full list of available math operators that we plan to support initially include:
+
-
*
/
Compatibility
Because of the added control in
depends
is based on count.dag.task.continueOn
is not available when using it. It is not possible to use bothdependencies
anddepends
in the same task group. Furthermore, users should only be able to use one of the two types of enhanced depends logic: count-based or task-name-based.Message from the maintainers:
If you wish to see this enhancement implemented please add a π reaction to this issue! We often sort issues this way to know what to prioritize.