openzipkin-attic / sleuth-webmvc-example

See how much time Spring Boot services spend on an http request.
Apache License 2.0

Shows how to enable kafka #8

Closed codefromthecrypt closed 6 years ago

codefromthecrypt commented 7 years ago

It might be the case that the sleuth or zipkin docs aren't explicit enough about how to use kafka. I'm trying to show how here, in case we can do something better, such as explaining more of what's going on. cc @marcingrzejszczak @trisberg @Imflog @shakuzen @openzipkin/devops-tooling

Per sleuth docs, we add the dependency "spring-kafka" and set spring.zipkin.sender.type to kafka to enable kafka. We don't explain things already covered in zipkin, such as that kafka is running and zipkin is connected to it, or how to test that anything mentioned is true. I'm going to repeat some things from the docs here to make sure we are on the same page in case someone asks in the future. Perhaps we can somehow improve the docs.

TL;DR;

To start, you don't need to do anything very special. You put the jar in sleuth's classpath as noted in its README (this is the only change in this example PR). Add the KAFKA_BOOTSTRAP_SERVERS property to a stock zipkin server (as noted in its README). Assuming your versions are recent, kafka setup is ok, etc, you are good to go.

Producer-side: in this case spring-cloud sleuth

How does someone learn how to connect sleuth to Kafka?

https://cloud.spring.io/spring-cloud-sleuth/single/spring-cloud-sleuth.html#_sleuth_with_zipkin_over_rabbitmq_or_kafka says to add the spring-kafka dep and set spring.zipkin.sender.type to kafka.

It doesn't mention that sleuth needs to be at least 1.3, or that the spring-kafka dep has to transitively include at least a 0.10 client. However, if you use sleuth 1.3+ or 2.0, it certainly connects to kafka and sends messages.
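One way to check the client version requirement is to look at what the build actually resolves. The sketch below is not from the sleuth docs: both helper names are hypothetical, and it assumes a Maven build whose `mvn dependency:tree` output contains a line such as `org.apache.kafka:kafka-clients:jar:1.0.1:compile`.

```shell
# Hypothetical helpers, not part of sleuth or zipkin.

# Reads `mvn dependency:tree` output on stdin and prints the first
# kafka-clients version it finds, for example 1.0.1.
kafka_client_version() {
  sed -n 's/.*org\.apache\.kafka:kafka-clients:jar:\([^:]*\):.*/\1/p' | head -n 1
}

# Succeeds when the plain numeric version in $1 is 0.10 or later.
client_is_at_least_0_10() {
  major=${1%%.*}   # text before the first dot
  rest=${1#*.}     # text after the first dot
  minor=${rest%%.*}
  [ "$major" -gt 0 ] || [ "$minor" -ge 10 ]
}

# Usage, run from the project directory (assumes mvn is on the PATH):
#   version=$(mvn dependency:tree -Dincludes=org.apache.kafka:kafka-clients | kafka_client_version)
#   client_is_at_least_0_10 "$version" || echo "kafka-clients $version is too old" >&2
```

The comparison only handles plain numeric versions; anything with a qualifier in the first two fields would need extra parsing.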

How does someone learn if the connection between Sleuth and kafka is working?

Basic health

Basic health isn't documented in sleuth, because the available choices are not known ahead of time (ex whether health endpoints will be configured). One undocumented side-effect is that when the first span is supposed to go to zipkin, a log statement should be emitted showing lazy initialization of the kafka component.

2018-06-19 14:00:21.048 2018-06-19 14:00:21,048 [/] INFO  [AsyncReporter{KafkaSender{bootstrapServers=192.168.99.100:9092, topic=zipkin}}] o.a.k.c.u.AppInfoParser$AppInfo - Kafka version : 1.0.1
 42038 --- [ topic=zipkin}}] o.a.kafka.common.utils.AppInfoParser     : Kafka version : 1.0.1

Health of message production

Sleuth includes an adapter from zipkin to its SpanMetricReporter. If there were problems sending spans, drop metrics would increase (if metrics are set up).

Consumer-side: Zipkin

How does someone learn how to start zipkin?

The quick start says to download the jar or run docker: https://github.com/openzipkin/zipkin#quick-start It doesn't go on to elaborate every configuration option, but rather points to the zipkin-server README, which begins that discussion.

How does someone learn how to connect zipkin to Kafka?

The zipkin community is well trained to point anything server related to the zipkin-server README: https://github.com/openzipkin/zipkin/tree/master/zipkin-server#kafka-collector In this case, it says the minimum step is to set KAFKA_BOOTSTRAP_SERVERS, and the example command literally does this for you:

KAFKA_BOOTSTRAP_SERVERS=127.0.0.1:9092 java -jar zipkin.jar

How does someone learn if the connection between zipkin and kafka is working?

Basic health

The zipkin-server README mentions two ways to tell whether the connection is working. The first is to hit the health endpoint, ex curl -s localhost:9411/health. Right now, the health check might be misleading, so look at the log entries instead. If you see something like below, the server isn't connecting to the broker:

2018-06-19 13:49:07.749  WARN 37556 --- [pool-2-thread-1] o.a.k.c.NetworkClient                    : [Consumer clientId=consumer-1, groupId=zipkin] Connection to node -1 could not be established. Broker may not be available.
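Since the log is the more reliable signal here, a small check can count how often that warning appears. This is only a sketch: the helper name is hypothetical, and it assumes the server's output was captured to a file (zipkin.log below is an assumed name). It matches only the bootstrap warning quoted above, not every possible broker error.

```shell
# Hypothetical helper: reads zipkin server log text on stdin and prints how
# many lines contain the broker-connection warning quoted above.
broker_connection_warnings() {
  # grep -c exits non-zero when the count is 0, so force a zero exit status.
  grep -c 'Connection to node -1 could not be established' || true
}

# Usage (zipkin.log is an assumed file name):
#   KAFKA_BOOTSTRAP_SERVERS=127.0.0.1:9092 java -jar zipkin.jar > zipkin.log 2>&1 &
#   broker_connection_warnings < zipkin.log
```

A count that keeps growing suggests the broker address is wrong or the broker is down, whatever the health endpoint reports.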

Health of message consumption

Our zipkin-server README discusses collector metrics: https://github.com/openzipkin/zipkin/tree/master/zipkin-server#collector When the collector is up, and you have sleuth or something else set up properly, metrics increase per-transport, including failures. If you see no metrics at all, there could be a subtle problem. For example, if the following returns nothing, there's a problem:

curl -s localhost:9411/metrics|jq .|grep kafka
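The same metrics check can be turned into something a script can act on. The sketch below is an illustration only: both function names and the ZIPKIN_URL variable are hypothetical, and the default address is the one used in this issue. It skips jq and looks for any metric key mentioning kafka in the raw response.

```shell
# Hypothetical helper: succeeds when the metrics JSON read from stdin
# mentions any kafka key.
has_kafka_metrics() {
  grep -q 'kafka'
}

# Hypothetical wrapper around the curl command above. ZIPKIN_URL is an
# assumed variable, defaulting to the address used in this issue.
check_kafka_metrics() {
  if curl -s "${ZIPKIN_URL:-http://localhost:9411}/metrics" | has_kafka_metrics; then
    echo "kafka collector metrics present"
  else
    echo "no kafka collector metrics; check KAFKA_BOOTSTRAP_SERVERS and the broker" >&2
    return 1
  fi
}
```

Running check_kafka_metrics after sending a few traced requests gives a non-zero exit status when no kafka metrics exist, which is the "returns nothing" case described above.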

Advanced and Troubleshooting

Beyond the above, there's mainly the versions dance. Especially as sleuth support is new, make sure you are using the very latest sleuth. Zipkin server can be older.

Further information on the server

https://github.com/openzipkin/zipkin/tree/master/zipkin-autoconfigure/collector-kafka includes more advanced notes. For example, it has some hints about normal kafka things you might need, such as resetting offsets.

codefromthecrypt commented 6 years ago

closed in favor of wiki https://cwiki.apache.org/confluence/display/ZIPKIN/Kafka