@swiftdiaries - yes, this feature is on the roadmap. Let's collaborate on the design.
/assign @vicaire
Awesome! Looking forward to this :)
I will follow up on this thread as soon as we start tackling this. Thanks.
@swiftdiaries
It's a bit short but I provided an outline of how we plan to support event-driven pipelines here: https://docs.google.com/document/d/1O5n02SzMYmLH0cMkykxHWWWe7eMzaP1vk7Y3fBbLoD8/edit#heading=h.mhe3tnle0c9o
(See event-driven pipelines and data-driven pipelines)
In a nutshell:
- We will have a metadata store storing info about the data generated by a workflow (metadata).
- Events from various sources (webhook, pub/sub, etc.) can also be stored in that metadata store, using a piece of infrastructure decoupled from the rest of the system.
- An event-driven CRD will let users specify a workflow to execute each time new data of a particular type is added to the metadata store.

WDYT?
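For illustration only, here is a minimal Python sketch of the loop such an event-driven controller would effectively run. The metadata-store query helper, the KFP endpoint, and the pipeline package name are all hypothetical, since the actual design is still being drafted:

```python
import time
import kfp

def list_new_artifacts(artifact_type, since):
    """Stand-in for the future metadata-store query API (hypothetical).

    Would return artifacts of `artifact_type` registered after timestamp `since`.
    """
    return []  # replace with a real query once the store exists

# Assumed in-cluster KFP address and pipeline package -- placeholders only.
client = kfp.Client(host="http://ml-pipeline-ui.kubeflow:80")
last_check = time.time()

while True:
    for artifact in list_new_artifacts("DataSet", since=last_check):
        # New data of the watched type -> one workflow run, which is what the
        # proposed event-driven CRD would express declaratively.
        client.create_run_from_pipeline_package(
            "train.yaml", arguments={"input_uri": artifact["uri"]}
        )
    last_check = time.time()
    time.sleep(60)
```

The CRD would replace this user-managed polling with a declarative "new data of type X → run workflow Y" rule.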
Sorry for the late reply.
The overall idea is sound. I found an interesting thread on kubeflow-discuss about how Argo Events is integrated with Argo Workflows at GitHub.
Also, what is the status of this? If there are tasks to be done, I'm happy to work together on this one.
@swiftdiaries,
The metadata store is currently being designed in collaboration with the KF community.
We could start by looking at the best way to integrate Argo Events with KFP for common use cases. Adding the "help wanted" flag. Contributions/proposals are welcome.
Note, resolving this issue should enable support for continuous online learning, as requested in https://github.com/kubeflow/pipelines/issues/1053
Do we need to make it specific to Argo Events? Can it be designed in a generic way to support something like Knative Eventing? @vicaire please include us if there are any design discussions happening behind the scenes.
@jingzhang36 Is this feature being actively worked on?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
Any updates on this feature?
/reopen
Looks like someone cares.
No one is working on this.
I am curious what makes this different from using the KFP SDK triggered by the event.
@Bobgy: Reopened this issue.
+1 on this as an issue. When data lands on a specific volume, an event should be triggered. Should this logic live in KFP?
Secondly, when an event is created, we would need a listener service to trigger the corresponding KFP pipeline. Is this sufficient?
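A minimal sketch of such a listener as it could be built today, assuming the volume is mounted at a path we can poll and that a compiled pipeline package exists (all names are made up):

```python
import os
import time
import kfp

WATCH_DIR = "/mnt/data"      # assumed volume mount
PIPELINE_PKG = "train.yaml"  # assumed compiled pipeline package

# Assumed in-cluster KFP address -- placeholder only.
client = kfp.Client(host="http://ml-pipeline-ui.kubeflow:80")
seen = set(os.listdir(WATCH_DIR))

while True:
    current = set(os.listdir(WATCH_DIR))
    for new_file in current - seen:
        # New data landed on the volume -> trigger the pipeline with its path.
        client.create_run_from_pipeline_package(
            PIPELINE_PKG, arguments={"data_path": os.path.join(WATCH_DIR, new_file)}
        )
    seen = current
    time.sleep(30)
```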
+1 on this, would like to see both the event trigger and data trigger configuration make it to KFP. Is Argo Events the only solution here, or should we use something more generic for Kubeflow?
+1 for this issue.
We would like to be able to trigger pipeline runs from GCP Pub/Sub events.
@imagr-pat for GCP Pub/Sub events, it's possible to add a Cloud Function that listens to the topic and runs a KFP client. Does that work for you?
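A hedged sketch of that workaround, assuming a background Cloud Function subscribed to the topic; the endpoint and pipeline package are placeholders, not an official integration:

```python
import base64
import kfp

def trigger_pipeline(event, context):
    """Background Cloud Function invoked by a Pub/Sub message."""
    # Pub/Sub message payloads arrive base64-encoded in event["data"].
    payload = base64.b64decode(event["data"]).decode("utf-8") if "data" in event else ""
    client = kfp.Client(host="https://<your-kfp-endpoint>")  # fill in your deployment
    client.create_run_from_pipeline_package(
        "train.yaml",                       # assumed compiled pipeline package
        arguments={"message": payload},
    )
```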
plus 1 for me on this issue as well.
Ideally I would like to see native Kafka support for event-based triggering of Kubeflow pipelines, so we don't have to rely on an external tool like NiFi or Airflow to trigger pipelines from an event. This would also give better native support for online learning, which is event-driven: mini-batches of training data constantly flow into the pipelines to re-train and re-deploy a model.
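For what a Kafka trigger could look like today, here is a rough sketch using the kafka-python client; topic, brokers, and pipeline package are illustrative assumptions, not an existing KFP feature:

```python
import json
import kfp
from kafka import KafkaConsumer  # pip install kafka-python

# All names below (topic, servers, pipeline package) are illustrative.
consumer = KafkaConsumer(
    "training-data",                  # topic carrying mini-batch notifications
    bootstrap_servers=["kafka:9092"],
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
client = kfp.Client(host="http://ml-pipeline-ui.kubeflow:80")  # assumed address

for message in consumer:
    # Each mini-batch notification kicks off a re-train/re-deploy run.
    client.create_run_from_pipeline_package(
        "retrain.yaml", arguments={"batch_uri": message.value["uri"]}
    )
```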
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
hold
Looking forward to seeing this feature so we don't need AWS Lambda or a Cloud Function to chain relevant pipelines. A big thank you!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Our team would like to integrate with an SQS queue. The use case is the following: we have a data pipeline on Airflow and an ML pipeline on Kubeflow, and the integration would let us run the ML pipeline once the data pipeline completes.
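A rough sketch of that integration as it could be done today, with a small poller between SQS and KFP; the queue URL and pipeline package are placeholders:

```python
import boto3
import kfp

sqs = boto3.client("sqs")
# Placeholder queue that Airflow publishes to when the data pipeline finishes.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/airflow-done"

client = kfp.Client(host="http://ml-pipeline-ui.kubeflow:80")  # assumed address

while True:
    # Long-poll for "data pipeline finished" messages.
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        client.create_run_from_pipeline_package("ml_pipeline.yaml", arguments={})
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```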
+1 on this issue. This could partially solve the CD approach too: we could have an Argo workflow that does CD for us, and this workflow could be triggered by a GitHub webhook.
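As a sketch of that idea, a tiny webhook receiver that starts a KFP run on pushes to main; the route, endpoint, and package names are assumptions:

```python
import kfp
from flask import Flask, request  # pip install flask

app = Flask(__name__)
client = kfp.Client(host="http://ml-pipeline-ui.kubeflow:80")  # assumed address

@app.route("/webhook", methods=["POST"])
def github_webhook():
    event = request.get_json(silent=True) or {}
    # GitHub push payloads carry "ref" and the new head commit in "after".
    if event.get("ref") == "refs/heads/main":
        client.create_run_from_pipeline_package(
            "cd_pipeline.yaml", arguments={"commit": event.get("after", "")}
        )
    return "", 204

if __name__ == "__main__":
    app.run(port=8080)
```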
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@chensun has this been designed already, or are you open to a community design proposal?
Any news on this?
I am looking for event/data-driven pipelines that get triggered when new data arrives.
+1
+1
+1
Any update on this? We're also quite interested in this! The current workaround would be to use an AWS Lambda function or a Google Cloud Function, as described here: https://amygdala.github.io/gcp_blog/ml/kfp/mlops/tfdv/gcf/2021/02/26/kfp_tfdv_event_triggered.html#event-triggered-pipeline-runs, which simply executes kfp.Client().run_pipeline().
another request for updates on this issue!
+1
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
It'd be great if we could trigger pipelines automatically in response to events. Use Case 1: when a model is uploaded to an object store -> trigger a step (pipeline) to deploy it. Use Case 2: when data arrives at a local volume / external storage -> trigger a pipeline to train.
This is related to https://github.com/kubeflow/pipelines/issues/604.
I'd love to see this feature and would be happy to help with the implementation through some PRs as well (if it's on the roadmap).