jaegertracing / jaeger-analytics-java

Data analytics pipeline and models for tracing data
Apache License 2.0


Jaeger Analytics

Experimental repository with data analytics models and pipelines for Jaeger tracing data.

Table of Contents

* [Jaeger analytics Java](#jaeger-analytics-java)
  + [Metrics](#metrics)
    - [Trace quality metrics](#trace-quality-metrics)
  + [Development](#development)
  + [Configuration](#configuration)
* [Gremlin documentation](#gremlin-documentation)
* [Spark Kafka documentation](#spark-kafka-documentation)
* [Deploy Kafka, Elasticsearch and Jaeger on Kubernetes using operators](#deploy-kafka--elasticsearch-and-jaeger-on-kubernetes-using-operators)
  + [Expose Kafka outside of cluster and get host:port](#expose-kafka-outside-of-cluster-and-get-host-port)
  + [Expose Jaeger collector outside of the cluster](#expose-jaeger-collector-outside-of-the-cluster)
  + [Deploy Hotrod example application](#deploy-hotrod-example-application)
* [Get exposed metrics](#get-exposed-metrics)
* [Run Jupyter as docker](#run-jupyter-as-docker)
  + [Run on Mybinder](#run-on-mybinder)
* [Using Jaeger in JUnit with Testcontainers](#using-jaeger-in-junit-with-testcontainers)

Table of contents generated with markdown-toc

Jaeger analytics Java

Repository contains:

Blog posts, demos and conference talks:

Metrics

The library calculates various metrics from traces. The metrics are currently exposed in Prometheus format.

Currently these metrics are calculated:

network_latency_seconds_bucket{client="frontend",server="driver",le="0.005",} 32.0
network_latency_seconds_bucket{client="frontend",server="driver",le="0.01",} 32.0
network_latency_seconds_bucket{client="frontend",server="driver",le="0.025",} 32.0
service_height_total{quantile="0.7",} 2.0
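
The samples above follow the Prometheus text format: a metric name, an optional set of labels in braces, and a numeric value. As a rough illustration of how such a line can be consumed from Java, here is a minimal sketch using only the JDK. The class `PrometheusSample` is hypothetical and not part of this library; it ignores timestamps, `+Inf` values, and escape sequences inside label values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses one sample line of the Prometheus text format. */
final class PrometheusSample {
    // name, optional {labels}, whitespace, value
    private static final Pattern LINE =
            Pattern.compile("^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\\{(.*)\\})?\\s+(\\S+)$");
    // key="value" pairs; a trailing comma after the last pair is simply skipped
    private static final Pattern LABEL =
            Pattern.compile("([a-zA-Z_][a-zA-Z0-9_]*)=\"((?:[^\"\\\\]|\\\\.)*)\"");

    final String name;
    final Map<String, String> labels;
    final double value;

    private PrometheusSample(String name, Map<String, String> labels, double value) {
        this.name = name;
        this.labels = labels;
        this.value = value;
    }

    static PrometheusSample parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a sample line: " + line);
        }
        Map<String, String> labels = new LinkedHashMap<>();
        if (m.group(2) != null) {
            Matcher l = LABEL.matcher(m.group(2));
            while (l.find()) {
                labels.put(l.group(1), l.group(2));
            }
        }
        return new PrometheusSample(m.group(1), labels, Double.parseDouble(m.group(3)));
    }

    public static void main(String[] args) {
        PrometheusSample s = parse(
                "network_latency_seconds_bucket{client=\"frontend\",server=\"driver\",le=\"0.005\",} 32.0");
        System.out.println(s.name + " " + s.labels + " " + s.value);
    }
}
```

For anything beyond quick inspection, a real Prometheus client or scraper is the safer choice, since the full exposition format has more cases than this sketch covers.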

Trace quality metrics

Trace quality metrics measure the quality of tracing data reported by services. These metrics can indicate that further instrumentation is needed or that the instrumentation quality is not high enough.

These metrics are ported from jaeger-analytics-flink/tracequality. The original design stores results in a separate storage table (Cassandra). The intention here is to export results as metrics and link relevant traces as exemplars (once OSS metrics APIs support that).

trace_quality_server_tag_total{pass="false",service="mysql",} 32.0
trace_quality_server_tag_total{pass="true",service="customer",} 26.0
trace_quality_minimum_client_version_total{pass="false",service="route",version="Go-2.21.1",} 320.0
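
The `trace_quality_minimum_client_version_total` sample above carries a `version` label such as `Go-2.21.1`. As a rough illustration of what a minimum-version check might look like, the sketch below compares the numeric part of such a label against a threshold. The class name, the method names, and the threshold `2.22.0` are hypothetical and not taken from the library; the actual check comes from the ported tracequality code and may differ.

```java
/** Hypothetical sketch of a minimum client version check; not the library's code. */
final class ClientVersionCheck {

    /** Extracts the numeric components, e.g. "Go-2.21.1" becomes {2, 21, 1}. */
    static int[] parseVersion(String clientVersion) {
        // Drop an optional language prefix such as "Go-".
        int dash = clientVersion.lastIndexOf('-');
        String[] parts = clientVersion.substring(dash + 1).split("\\.");
        int[] numbers = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    /** Returns true when clientVersion is at least minimumVersion, compared component by component. */
    static boolean meetsMinimum(String clientVersion, String minimumVersion) {
        int[] actual = parseVersion(clientVersion);
        int[] minimum = parseVersion(minimumVersion);
        int length = Math.max(actual.length, minimum.length);
        for (int i = 0; i < length; i++) {
            int a = i < actual.length ? actual[i] : 0;   // missing components count as 0
            int m = i < minimum.length ? minimum[i] : 0;
            if (a != m) {
                return a > m;
            }
        }
        return true; // versions are equal
    }

    public static void main(String[] args) {
        // "2.22.0" is an assumed threshold, chosen only for illustration.
        System.out.println(meetsMinimum("Go-2.21.1", "2.22.0"));
    }
}
```

Comparing components numerically rather than as strings matters here: as text, `2.9.0` would sort after `2.21.1`, while numerically it is the older version.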

Example Prometheus queries:

(trace_quality_server_tag_total{pass="true",service="customer",} / trace_quality_server_tag_total{service="customer",}) * 100
trace_quality_server_tag_total{pass="true",service="customer",} / ignoring (pass,fail) sum without(pass, fail) (trace_quality_server_tag_total)
# if values are missing
(trace_quality_server_tag_total{pass="true",service="mysql",}  / trace_quality_server_tag_total{service="mysql",} ) * 100 or vector(0)
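
The queries above express the share of passing checks for a service as a percentage. The same arithmetic can be sketched in Java as follows. The class `PassRatio` is hypothetical, and returning 0 when there are no samples is an assumption that loosely mirrors the `or vector(0)` fallback in the last query.

```java
/** Hypothetical helper showing the arithmetic behind the pass-percentage queries. */
final class PassRatio {

    /**
     * Percentage of passing checks, given the pass="true" and pass="false" counter values.
     * Returns 0 when both counters are 0, so callers never divide by zero.
     */
    static double passPercentage(double passCount, double failCount) {
        double total = passCount + failCount;
        if (total == 0.0) {
            return 0.0;
        }
        return passCount / total * 100.0;
    }
}
```

For example, with 26 passing samples and an assumed 6 failing samples, `PassRatio.passPercentage(26, 6)` gives 81.25.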


Development

The IDE must be configured with the following annotation processor, which is used to generate the trace DSL.

org.apache.tinkerpop.gremlin.process.traversal.dsl.GremlinDslProcessor

Build and run

mvn clean compile exec:java

Configuration

Configuration properties for SparkRunner.

Gremlin documentation

Spark Kafka documentation

Deploy Kafka, Elasticsearch and Jaeger on Kubernetes using operators

The following command creates a Jaeger CR, which triggers deployment of Jaeger, Kafka and Elasticsearch. This works only on OpenShift 4.x. Before deploying, make sure the Jaeger, Strimzi (Kafka) and Elasticsearch (from OpenShift cluster logging) operators are running.

oc create -f manifests/jaeger-auto-provisioned.yaml

If you are running on vanilla Kubernetes, you can deploy the jaeger-external-kafka-es.yaml CR and configure the connection strings to Kafka and Elasticsearch.

Expose Kafka outside of cluster and get host:port

Expose Kafka IP address outside of the cluster:

listeners:
  # ...
  external:
    type: loadbalancer
    tls: false

Get external broker address:

oc get kafka simple-streaming -o jsonpath="{.status.listeners[*].addresses}"

Expose Jaeger collector outside of the cluster

oc create route edge --service=simple-streaming-collector --port c-binary-trft --insecure-policy=Allow

Deploy Hotrod example application

oc get routes # get jaeger collector route
docker run --rm -it -e "JAEGER_ENDPOINT=http://host:80/api/traces" -p 8080:8080 jaegertracing/example-hotrod:latest

Get exposed metrics

The streaming job exposes metrics on http://localhost:9001.
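
To inspect these metrics from Java, one option is to fetch the endpoint and keep only the samples of interest. The sketch below uses the JDK 11+ `java.net.http` client. The class `MetricsScrape` and its methods are hypothetical, and it assumes the endpoint serves the Prometheus text format directly at the URL given above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: fetches the metrics page and filters samples by name prefix. */
final class MetricsScrape {

    /** Keeps sample lines whose metric name starts with the prefix; comment lines are skipped. */
    static List<String> samplesFor(String body, String metricPrefix) {
        return body.lines()
                .filter(line -> !line.startsWith("#"))
                .filter(line -> line.startsWith(metricPrefix))
                .collect(Collectors.toList());
    }

    /** Performs a plain HTTP GET and returns the response body as text. */
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            String body = fetch("http://localhost:9001");
            samplesFor(body, "trace_quality_").forEach(System.out::println);
        } catch (IOException e) {
            // The streaming job is probably not running locally.
            System.err.println("Metrics endpoint not reachable: " + e.getMessage());
        }
    }
}
```

Keeping the filtering separate from the HTTP call makes it possible to try the filter on a saved metrics dump without a running job.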

Run Jupyter as docker

The docker image should be published on Docker Hub. If you are modifying the source code of the library, inject it as a volume with -v ${PWD}:/home/jovyan/work or rebuild the image to see the latest changes.

make jupyter-docker
make jupyter-run

Open a browser at http://localhost:8888/lab and copy the token from the command line. Then navigate to the ./work/jupyter/ directory and open a notebook.

Run on Mybinder

Launch IJava binder Launch IJava lab binder

Using Jaeger in JUnit with Testcontainers

The artifact io.jaegertracing:jaeger-testcontainers contains an implementation for using the Jaeger all-in-one Docker container in JUnit tests:

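// Start a Jaeger all-in-one container for the test.
JaegerAllInOne jaeger = new JaegerAllInOne("jaegertracing/all-in-one:latest");
jaeger.start();
// Create an OpenTracing tracer for the service under test.
io.opentracing.Tracer tracer = jaeger.createTracer("my-service");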