argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

Dependency between CronWorkflows #13097

Closed fstaudt closed 3 months ago

fstaudt commented 3 months ago

Summary

In an Argo Workflow DAG, it is possible to define dependencies between tasks of the DAG. It is however not possible to define dependencies on tasks defined in another DAG. In some cases (e.g. when workflows are triggered by cron with different schedules), it is not possible to merge the two DAGs into a single DAG.

This kind of cross-DAG dependency is available in Apache Airflow, and it would be great to have it in Argo Workflows.

We would like to have the ability to define dependencies between CronWorkflows in the Argo CronWorkflow spec. The proposal is to add spec.dependencies to define a dependency on other CronWorkflows:

apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: cron-workflow-1
spec:
  schedule: 0 18 * * *     # scheduled at 6PM every day
  workflowSpec:
    workflowTemplateRef:
      name: workflow-template-1
---
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: cron-workflow-2
spec:
  schedule: 0 20 * * *     # scheduled at 8PM every day
  dependencies:
  - cronWorkflowRef: cron-workflow-1    # only executed if cron-workflow-1 
    executionDeltaSeconds: 7200         # was successfully executed in the last 2 hours
  workflowSpec:
    workflowTemplateRef:
      name: workflow-template-2

Use Cases

When CronWorkflows are configured with different schedules, we want to execute the second CronWorkflow only if the first CronWorkflow was successfully executed in the last X seconds. If the condition is not met, we would like the CronWorkflow to still create a Workflow, and that Workflow should fail immediately.


Message from the maintainers:

Love this feature request? Give it a 👍. We prioritise the proposals with the most 👍.

Joibel commented 3 months ago

Your specific use case is possible if the second workflow has a pod at the beginning which has get,list access to workflows, and can list out the dependent workflow and succeed/fail at that point, avoiding any further work.
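
A minimal sketch of the check such a first pod could run, assuming the Workflow objects have already been listed from the Kubernetes API as plain dicts. The function names `parse_k8s_time` and `dependency_met` are hypothetical; the label `workflows.argoproj.io/cron-workflow` and the `status.phase` / `status.finishedAt` fields are the ones Argo sets on Workflows created by a CronWorkflow, and the timestamp is assumed to be in whole-second UTC form.

```python
from datetime import datetime, timedelta, timezone

# Label that Argo puts on Workflows created by a CronWorkflow.
CRON_LABEL = "workflows.argoproj.io/cron-workflow"


def parse_k8s_time(value):
    """Parse a whole-second UTC timestamp such as '2024-05-27T18:05:00Z'."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def dependency_met(workflows, cron_name, delta_seconds, now=None):
    """Return True if a Workflow of `cron_name` succeeded in the last `delta_seconds`.

    `workflows` is a list of Workflow objects as dicts (hypothetical input:
    for example the `items` of a list call against the Kubernetes API).
    """
    now = now or datetime.now(timezone.utc)
    earliest = now - timedelta(seconds=delta_seconds)
    for wf in workflows:
        labels = (wf.get("metadata") or {}).get("labels") or {}
        status = wf.get("status") or {}
        if labels.get(CRON_LABEL) != cron_name:
            continue  # created by another CronWorkflow, or not by cron at all
        if status.get("phase") != "Succeeded" or not status.get("finishedAt"):
            continue  # still running, failed, or errored
        if earliest <= parse_k8s_time(status["finishedAt"]) <= now:
            return True
    return False
```

In the example from the issue, the gate step of cron-workflow-2 would call `dependency_met(items, "cron-workflow-1", 7200)` and exit with a non-zero code when it returns False, so the rest of the workflow never starts. The pod's service account needs get and list access to workflows for the listing itself.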

The possible permutations of a useful workflow interdependency check are pretty high, and I'm not sure we should have it as a core piece of functionality. We don't even have #12757.

If you'd like to contribute this as a workflow executor plugin, that would be very welcome.

Is there a reason in your case why cron-workflow-2 isn't just a workflow template triggered by cron-workflow-1?

agilgur5 commented 3 months ago

This was previously discussed in #12241 with a few different answers

> Your specific use case is possible if the second workflow has a pod at the beginning which has get,list access to workflows, and can list out the dependent workflow and succeed/fail at that point, avoiding any further work.

This was my answer as well, although I preferred the other answer which actually queries tables and so is more resilient.

fstaudt commented 3 months ago

> Your specific use case is possible if the second workflow has a pod at the beginning which has get,list access to workflows, and can list out the dependent workflow and succeed/fail at that point, avoiding any further work.

We had considered using a dedicated Docker image as the first task in a DAG to check the status of the other CronWorkflow. Before going in that direction, we wanted to know if this feature could be supported by Argo out of the box.

> If you'd like to contribute this as a workflow executor plugin, that would be very welcome.

I'll definitely look into that. It may allow defining a CronWorkflow dependency even if the workflow is not a DAG.

It's still unclear when the executor plugin is called and how it integrates in the lifecycle of a workflow. Documentation is not very clear on this point. I'll have to run some tests with the hello plugin to understand it better.

> Is there a reason in your case why cron-workflow-2 isn't just a workflow template triggered by cron-workflow-1?

In this example, cron-workflow-2 can only be triggered at 8PM and not before. If it were triggered by the success of cron-workflow-1, it could be triggered too soon.

fstaudt commented 3 months ago

@Joibel, @agilgur5,

CronWorkflow dependency has already been discussed in https://github.com/argoproj/argo-workflows/discussions/12241. I'll close this issue now.

Thanks for your answers.

agilgur5 commented 3 months ago

> It's still unclear when the executor plugin is called and how it integrates in the lifecycle of a workflow. Documentation is not very clear on this point.

Yea executor plugin docs could use a lot of improvement. Very broadly speaking, a plugin receives an RPC with whatever you pass in your plugin execution spec. Effectively, it acts like a short-hand for a custom container image and formatting inputs to fit your container image, which allows users to share functionality via plugins.

There are certainly other details but they are more low-level.
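
As a rough illustration of that RPC, here is a minimal sketch of an executor plugin server, loosely following the shape of the hello example in the executor plugin docs. The plugin key `cronDependency`, the `check` callback, and the function names are hypothetical. The endpoint path and port mirror the hello example and should be checked against the Argo version in use.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

PLUGIN_KEY = "cronDependency"  # hypothetical key used under `plugin:` in a template


def handle_execute(args, check=lambda spec: True):
    """Build the reply for one template.execute call.

    `args` is the decoded request body; the template's `plugin` field carries
    whatever was written in the workflow. `check` is a placeholder for the
    real dependency test and always passes by default.
    """
    plugin = (args.get("template") or {}).get("plugin") or {}
    if PLUGIN_KEY not in plugin:
        return {}  # empty reply: this template is not addressed to this plugin
    if check(plugin[PLUGIN_KEY]):
        return {"node": {"phase": "Succeeded", "message": "dependency met"}}
    return {"node": {"phase": "Failed", "message": "dependency not met"}}


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/api/v1/template.execute":
            self.send_response(404)
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        args = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(handle_execute(args)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=4355):
    """Start the plugin server (port assumed from the hello example)."""
    HTTPServer(("", port), Handler).serve_forever()
```

A task would then reference the plugin with a template whose `plugin` field contains the `cronDependency` key, and the phase in the reply becomes the phase of that node.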

Parisha7 commented 2 months ago

Hi @Joibel @agilgur5

As suggested above, we looked into the workflow executor plugin and ran into the following issues:

We reviewed various plugin examples on the Argo Workflows Plugin Directory but didn't find any that fit our use case.

We are able to run the plugin independently, but we can't control the main container of this plugin. Despite trying various approaches, running the plugin and DAG together doesn't seem feasible to us.

As per our understanding, the plugin runs independently and performs tasks in parallel, accepting different inputs. However, it does not interact with other task templates. It can run before or after the DAG, but not alongside it.

We have attached the ConfigMap (created via argo executor-plugin build) and the workflow that we are using: Plugin.zip

Kindly let us know if we are missing something.

We could also create a new ticket if needed.

cc: @fstaudt

Thanks, Parisha

Parisha7 commented 2 months ago

Hi @agilgur5

It would be really helpful if you could provide us with an ETA for this.

Thanks in advance.

agilgur5 commented 2 months ago

You are asking for support with your plugin and I am an unpaid volunteer maintainer, like much of OSS. That comes off as very presumptuous.

If you want an ETA for support, you should pay a vendor for an SLA and ask them for support.

Parisha7 commented 2 months ago

@agilgur5 Thank you for your reply and for your voluntary efforts in maintaining this plugin. I apologize for any misunderstanding. We didn't intend for you to implement our solution; rather, we were seeking suggestions or guidance to ensure if we were on the right track or not.

I understand the constraints of open-source contributions. Apologies if my request came off as presumptuous. Thanks again.

agilgur5 commented 2 months ago

> rather, we were seeking suggestions or guidance to ensure if we were on the right track or not.

That's fine, but asking for an ETA for support is a step too far. That's exactly what paid SLAs are for. No OSS library has any such SLA. Even single vendor OSS has a term for lack of any SLA, "community support", which is exactly what this is.

> Apologies if my request came off as presumptuous.

Your question was fine, your follow-up less than 2 days later (during a US holiday no less) asking for an ETA on a support request, directly from a volunteer maintainer no less (you literally @'d me), was not.