jaegertracing / jaeger

CNCF Jaeger, a Distributed Tracing Platform
https://www.jaegertracing.io/
Apache License 2.0

Renovate Streaming Support #5910

Open yurishkuro opened 2 months ago


Summary

Bring streaming analytics support directly into the Jaeger backend, instead of requiring separate Spark/Flink data pipelines.

Background

One of the challenges of distributed tracing is that spans can arrive from all kinds of places in the architecture at different times. If your only job is to store them (which is primarily what the Jaeger collector does), this is not a big problem, since the storage backends take care of partitioning and indexing the spans by trace-id. But the most interesting applications of traces require looking at a whole trace in one place, so that decisions can be based on the overall call graph rather than on individual spans.
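To make the "whole trace in one place" point concrete, here is a minimal Go sketch that groups out-of-order spans by trace ID and then derives service-to-service dependency edges from each assembled trace. The `Span` and `Edge` types and both function names are hypothetical simplifications for illustration, not Jaeger's actual data model or API.

```go
package main

import (
	"fmt"
	"sort"
)

// Span is a simplified, hypothetical span model (not Jaeger's real type).
type Span struct {
	TraceID  string
	SpanID   string
	ParentID string // empty for the root span
	Service  string
}

// Edge is a caller -> callee relationship between two services.
type Edge struct {
	Parent, Child string
}

// GroupByTrace collects spans that may arrive in any order into per-trace slices.
func GroupByTrace(spans []Span) map[string][]Span {
	traces := make(map[string][]Span)
	for _, s := range spans {
		traces[s.TraceID] = append(traces[s.TraceID], s)
	}
	return traces
}

// DependencyEdges counts cross-service calls within one assembled trace.
// Resolving a span's parent needs the other spans of the same trace,
// which is why span-at-a-time processing is not enough.
func DependencyEdges(trace []Span) map[Edge]int {
	byID := make(map[string]Span, len(trace))
	for _, s := range trace {
		byID[s.SpanID] = s
	}
	edges := make(map[Edge]int)
	for _, s := range trace {
		parent, ok := byID[s.ParentID]
		if !ok || parent.Service == s.Service {
			continue // root span, missing parent, or an in-service call
		}
		edges[Edge{Parent: parent.Service, Child: s.Service}]++
	}
	return edges
}

func main() {
	// Spans from two traces, deliberately interleaved and out of order.
	spans := []Span{
		{TraceID: "t1", SpanID: "c", ParentID: "b", Service: "db"},
		{TraceID: "t2", SpanID: "x", ParentID: "", Service: "frontend"},
		{TraceID: "t1", SpanID: "a", ParentID: "", Service: "frontend"},
		{TraceID: "t1", SpanID: "b", ParentID: "a", Service: "cart"},
	}
	total := make(map[Edge]int)
	for _, trace := range GroupByTrace(spans) {
		for e, n := range DependencyEdges(trace) {
			total[e] += n
		}
	}
	keys := make([]Edge, 0, len(total))
	for e := range total {
		keys = append(keys, e)
	}
	sort.Slice(keys, func(i, j int) bool {
		if keys[i].Parent != keys[j].Parent {
			return keys[i].Parent < keys[j].Parent
		}
		return keys[i].Child < keys[j].Child
	})
	for _, e := range keys {
		fmt.Printf("%s -> %s: %d\n", e.Parent, e.Child, total[e])
	}
}
```

In a real stream the grouping step cannot wait for a finite slice; it would need some notion of "this trace is probably complete" (for example an inactivity window), which is an assumption this sketch leaves out.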

Data streaming is well suited to that. Historically, Jaeger supported a couple of Java-based data pipelines (one for the basic dependency graph and one for the transitive dependency graph), which were implemented independently on top of the Spark and Flink frameworks. There were problems with that approach:

Proposal

We should bring streaming capabilities into the main Jaeger repo using Go code. This will address many of the problems mentioned above. The main challenge with data streaming is that it is a stateful activity, which requires checkpointing to avoid data loss and inconsistent results when Jaeger instances are restarted. This is where well-known streaming frameworks like Spark and Flink come in: they provide the needed orchestration and statefulness. In the past we could not use them with Go, but today there are projects like Apache Beam that provide a unified programming model via well-supported SDKs (including one for Go), which allows implementing the pipeline logic in Go and executing it on a number of runtimes.
