argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

Watch events from external database, e.g. Elasticsearch or ClickHouse #12448

Open umialpha opened 7 months ago

umialpha commented 7 months ago

Summary

Watch events from an external database besides k8s, e.g. Elasticsearch or ClickHouse.

Usually, k8s doesn't store events for long. In production, we sometimes export events to an external database for more use cases. There are some popular exporters for this, for example https://github.com/AliyunContainerService/kube-eventer and https://github.com/resmoio/kubernetes-event-exporter

There are at least two cases where we need this feature:

  1. For a failed but not archived workflow, events sometimes expire and it is hard to track the failure reason.
  2. For a failed and archived workflow, it is even harder to track the events.

Use Cases

When would you use this?

To track the failure reason for workflows, especially for archived workflows.


Message from the maintainers:

Love this enhancement proposal? Give it a 👍. We prioritise the proposals with the most 👍.

umialpha commented 7 months ago

Actually, we implemented this feature on our private fork branch, but we don't think it is a general approach. I wonder whether the community would consider this feature.

tooptoop4 commented 7 months ago

@umialpha interested in this, put some notes on https://github.com/argoproj/argo-events/issues/1827

terrytangyuan commented 7 months ago

Why not export events once a workflow has completed?

agilgur5 commented 7 months ago

Watch events from an external database besides k8s, e.g. Elasticsearch or ClickHouse.

This sounds like a feature request for Argo Events, not Workflows?

umialpha commented 7 months ago

Sorry for the confusion. I think it is a feature request for Argo Workflows. The events are the Kubernetes events.


Currently, Argo Workflows list-watches k8s events from the k8s API. But the lifecycle of k8s events is usually short. In production, k8s events are usually sunk into an external database like ClickHouse or Elasticsearch.

So, is there any plan to add support for listing k8s events from other databases?
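
For illustration, here is a rough sketch of the kind of lookup this would enable, querying ClickHouse's HTTP interface for the events of one workflow. The table name, column names, and host below are only placeholders; the real schema depends on how the exporter sinks the events:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

func main() {
	// Hypothetical workflow whose events we want to look up.
	workflow := "my-failed-workflow"

	// Hypothetical schema: the real table and column names depend on how the
	// exporter sinks events into ClickHouse.
	query := fmt.Sprintf(`
		SELECT event_time, reason, message
		FROM kube_events
		WHERE involved_object_name LIKE '%s%%'
		ORDER BY event_time
		FORMAT JSONEachRow`, workflow)

	// ClickHouse exposes a plain HTTP interface (default port 8123) that accepts
	// SQL in the request body; the host here is a placeholder.
	resp, err := http.Post("http://clickhouse:8123/", "text/plain", strings.NewReader(query))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()
	io.Copy(os.Stdout, resp.Body) // one JSON object per event row
}
```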

agilgur5 commented 7 months ago

Oh, you mean in the UI specifically, for historical events. Yes, currently the UI provides a list-watch on k8s events for Workflows to help with debugging etc. As the k8s events you're referring to are historical, for completed Workflows, it would just be a list; no need to watch in that case.
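
For reference, a plain list of a Workflow's events is already a one-call operation against the k8s API. A minimal client-go sketch (the namespace and workflow name are placeholders, and this is not the actual UI/server code):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (sketch only; the argo-server wires
	// up its clients differently).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(config)

	// A plain List (no Watch) of the events attached to one Workflow object.
	events, err := clientset.CoreV1().Events("argo").List(context.Background(), metav1.ListOptions{
		FieldSelector: "involvedObject.kind=Workflow,involvedObject.name=my-failed-workflow",
	})
	if err != nil {
		panic(err)
	}
	for _, e := range events.Items {
		fmt.Printf("%s\t%s\t%s\n", e.LastTimestamp.Time, e.Reason, e.Message)
	}
}
```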

That is currently k8s native (which is ideal), and as far as I know, there is no standard historical k8s events database or standardized API. At my last job, we had historical k8s events in New Relic. That would make it very difficult to integrate as each integration would need its own implementation, connectors, dependencies, etc. I would say that this would potentially be a good use-case for plugins a la #6943. I don't think it would be wise for Argo to support this out-of-the-box as it would be a hefty maintenance burden for a small number of users.
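
To sketch what that plugin boundary could look like (purely hypothetical; none of these names exist in Argo today): the server would only need a small "historical events" interface, and each backend (New Relic, ClickHouse, Elasticsearch, ...) would live behind it in its own plugin with its own dependencies.

```go
// Package eventsplugin sketches a hypothetical "historical events" plugin boundary.
// None of these types exist in Argo Workflows today.
package eventsplugin

import (
	"context"
	"time"
)

// HistoricalEvent is a backend-neutral shape for an archived k8s event.
type HistoricalEvent struct {
	Timestamp time.Time
	Reason    string
	Message   string
	Object    string // e.g. "Workflow/my-failed-workflow" or "Pod/my-failed-workflow-main"
}

// EventsProvider is the interface such a plugin would implement; the server would
// call it for completed/archived Workflows instead of (or after) the live k8s API.
type EventsProvider interface {
	ListEvents(ctx context.Context, namespace, workflowName string) ([]HistoricalEvent, error)
}
```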

We do already recommend using standard k8s tooling for historical observability, such as for logs, so this would fit into that pattern. Although integrating with existing tooling may be a better paradigm than building an Argo-specific implementation like the archiveLogs feature. An archiveEvents feature could provide similar support via an Artifact Repository without too much legwork, but that would also be a not recommended hack and would not provide integration with other events stores as you describe.