argoproj / argo-events

Event-driven Automation Framework for Kubernetes
https://argoproj.github.io/argo-events/
Apache License 2.0

Request To Integrate OpenTelemetry Trace For Published Events #1111

Open vineethgopinadhan opened 3 years ago

vineethgopinadhan commented 3 years ago

Summary

Integrate open-standard instrumentation into Argo Events for emitting metrics and traces, preferably OpenTelemetry, similar to the request for Argo CD (https://github.com/argoproj/argo-cd/issues/4972).

Reason

Being able to see traces of events published from an event source would give better visibility into all the events triggered within the environment. As in the Argo CD request mentioned above, we suggest a generic OpenTelemetry integration in Argo Events so we can leverage Instana's OpenTelemetry support (https://www.instana.com/docs/ecosystem/opentelemetry/) and use Instana as a collector. This would probably also open the door to other APM tools that interoperate with OpenTelemetry.

Request

aweis89 commented 2 years ago

Do event sources have data on event status though?

I was thinking it would be really nice if we had full tracing from event submission until trigger completion as well. But my understanding is that event sources are only aware of successful submission to NATS. Would we extend this to sensors as well, to incorporate and link trigger executions? If we did, having that full trace would be super valuable and would solve one of the common observability issues with async workflows. Because submitters and subscribers are decoupled, it is often hard to observe the link between an event's submission and its handling. Solving that could provide substantial motivation for Argo Events adoption.

I'm not clear how this would work though in the case of the k8s trigger. I don't think the sensor is aware of execution status, merely the successful submission to the k8s API. Maybe we can add a path to the API that would watch a given field for specified values like "completed" or "failed" supplied by the user? Maybe something like:

successCondition:
  path: status.phase
  value: Succeeded
failedCondition:
  ...

This would seemingly require spinning up dynamic informers so the sensor can subscribe to events for whatever k8s resource it's submitting. @whynowy, I'd be curious to hear your thoughts on end-to-end tracing integration. This is potentially something my team would be interested in contributing to.
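
To make the successCondition / failedCondition idea above concrete, here is a minimal sketch of how one observed resource state could be checked against such conditions. It is an illustration only, not how the sensor works today: the function names `resolve_path` and `evaluate` are hypothetical, and because the failedCondition in the proposal is left elided, the `status.phase == Failed` value used below is an assumption made purely for the example.

```python
from typing import Any


def resolve_path(obj: dict, path: str) -> Any:
    """Walk a dotted path such as 'status.phase'; return None if a key is missing."""
    current: Any = obj
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def evaluate(resource: dict, success: dict, failed: dict) -> str:
    """Classify one observed resource state as 'succeeded', 'failed', or 'pending'."""
    # Check the failure condition first so a failed resource is never reported as pending.
    if resolve_path(resource, failed["path"]) == failed["value"]:
        return "failed"
    if resolve_path(resource, success["path"]) == success["value"]:
        return "succeeded"
    return "pending"


# successCondition as written in the proposal above.
success_condition = {"path": "status.phase", "value": "Succeeded"}
# Hypothetical failedCondition; the proposal leaves its contents unspecified.
failed_condition = {"path": "status.phase", "value": "Failed"}

observed = {"kind": "Workflow", "status": {"phase": "Succeeded"}}
print(evaluate(observed, success_condition, failed_condition))
```

In a real sensor this check would presumably run on every update delivered by the informer for the submitted resource, and the span for the trigger would be closed once the result is no longer "pending".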

elovelan commented 5 months ago

Given there is a CloudEvents Distributed Tracing extension, implementing this might not be too challenging, but it would almost certainly require each traceable source and trigger to ensure the context is passed correctly. A custom trigger could be used to publish a span as well.
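
As a rough sketch of what "passing the context correctly" could mean: the CloudEvents Distributed Tracing extension carries a W3C Trace Context `traceparent` attribute (and optionally `tracestate`) on the event. The example below uses only the Python standard library and plain dicts in place of real CloudEvents objects. The helper names `stamp_event` and `child_traceparent` are hypothetical, and where an event source or sensor would actually call them is an assumption, not existing Argo Events behavior.

```python
import re
import secrets

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace: random 16-byte trace id and 8-byte span id."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def child_traceparent(parent: str) -> str:
    """Keep the trace id and flags of the parent, but mint a new span id."""
    match = TRACEPARENT_RE.match(parent)
    if match is None:
        raise ValueError(f"invalid traceparent: {parent!r}")
    trace_id, _parent_span_id, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


def stamp_event(event: dict) -> dict:
    """Event source side (hypothetical): keep an incoming traceparent or start a trace."""
    stamped = dict(event)
    stamped.setdefault("traceparent", new_traceparent())
    return stamped


# A plain dict standing in for a CloudEvent published by an event source.
event = stamp_event(
    {"specversion": "1.0", "id": "1", "source": "example-source", "type": "example.event"}
)
# Sensor/trigger side (hypothetical): continue the same trace with a new span.
trigger_context = child_traceparent(event["traceparent"])
print(event["traceparent"])
print(trigger_context)
```

Both printed values share the same trace id, which is what lets a tracing backend link the event submission to the trigger execution. A real implementation would more likely use the OpenTelemetry SDK's propagators than hand-rolled string handling.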

I'd be down for helping with this but I'm thinking it'll need some additional design first.

AllenZMC commented 4 months ago

Hopefully this feature will be available soon.