openzipkin / brave

Java distributed tracing implementation compatible with Zipkin backend services.
Apache License 2.0
2.36k stars 713 forks source link

KafkaStreamsTracing update to align with new Kafka Streams API #1365

Closed dermotmburke closed 1 year ago

dermotmburke commented 1 year ago

Feature

The current "Transform" and "Process" methods on the Kafka Streams KStream class are now deprecated in favour of simpler "Process" and "ProcessValues" methods (see https://cwiki.apache.org/confluence/display/KAFKA/KIP-820%3A+Extend+KStream+process+with+new+Processor+API). Is there any plan to update the existing KafkaStreamsTracing class to emulate this new approach?

jcchavezs commented 1 year ago

Ping @jeqo

On Thu, 16 Mar 2023, 12:14 dermotmburke, @.***> wrote:

Feature

The current "Transform" and "Process" methods on the Kafka Streams KStream class are now deprecated in favour of simpler "Process" and "ProcessValues" methods. Is there any plan to update the existing KafkaStreamsTracing class to emulate this new approach?

— Reply to this email directly, view it on GitHub https://github.com/openzipkin/brave/issues/1365, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAXOYATHBEXJ6TE4QTB6WILW4LY2JANCNFSM6AAAAAAV5B55X4 . You are receiving this because you are subscribed to this thread.Message ID: @.***>

jeqo commented 1 year ago

Thanks @dermotmburke. Agree, this would be useful and will require using the headers available on the new Record abstraction. Let me know if you'd be up to contribute this one, I'd be happy to review. Else I could take a look at this in the upcoming weeks.

dermotmburke commented 1 year ago

Hi @jeqo,

I took a look at this over the weekend and started putting some code together based on what is already there. My main issue is around implementing a new TracingProcessor class that implements org.apache.kafka.streams.processor.api.Processor. There is already a class called TracingProcessor in the brave.kafka.streams package. This leaves me the following options

  1. Creating a new class (called something like NewTracingProcessor).
  2. Making the existing TracingProcessor implement both the org.apache.kafka.streams.processor.api.Processor and the new org.apache.kafka.streams.processor.api.Processor.
  3. Adding a new package (called something like brave.kafka.streams.processor.api).

I was leaning toward option number 3 but noticed that the existing TracingProcessor needs to access package private attributes of the KafkaStreamsTracing class. Option 2 seems a bit messy as the code would need to maintain 2 delegate processors internally (one of each processor type).

Any other ideas or suggestions on what I could name the new tracing processor class would be greatly appreciated.

jeqo commented 1 year ago

Thanks @dermotmburke, this is a great starting point. I agree with your assessment. I would add that Option 2 would potentially break compatibility because 2 additional types need to be added to the generic class (K, V -> KIn, VIn, KOut, VOut). Option 3 would break encapsulation, and will also look a bit weird that we only have subpackages for this specific implementations and not for others.

Option 1 seems the ugliest (yet another name), but most effective one. I'm thinking something like TracingProcessorApi or TracingNewProcessor or simply TracingProcessorV2? Maybe we can discuss this on the PR. @jcchavezs @shakuzen do you have any thoughts on this?

dermotmburke commented 1 year ago

I have created a draft PR here https://github.com/openzipkin/brave/pull/1367/files - I'm having some trouble testing locally. I keep getting the error

[INFO] Checking licenses... [WARNING] Missing header in: /Users/dermotburke/Code/brave-fork/instrumentation/kafka-streams/src/test/java/brave/kafka/streams/ITKafkaStreamsTracing.java [WARNING] Missing header in: /Users/dermotburke/Code/brave-fork/instrumentation/kafka-streams/src/test/java/brave/kafka/streams/KafkaStreamsTracingTest.java [WARNING] Missing header in: /Users/dermotburke/Code/brave-fork/instrumentation/kafka-streams/src/main/java/brave/kafka/streams/NewTracingProcessor.java [WARNING] Missing header in: /Users/dermotburke/Code/brave-fork/instrumentation/kafka-streams/src/main/java/brave/kafka/streams/TracingFixedKeyProcessor.java [WARNING] Missing header in: /Users/dermotburke/Code/brave-fork/instrumentation/kafka-streams/src/test/java/brave/kafka/streams/KafkaStreamsTest.java [WARNING] Missing header in: /Users/dermotburke/Code/brave-fork/instrumentation/kafka-streams/src/main/java/brave/kafka/streams/KafkaStreamsTracing.java [WARNING] Missing header in: /Users/dermotburke/Code/brave-fork/instrumentation/kafka-streams/src/main/java/brave/kafka/streams/NewTracingProcessorSupplier.java [WARNING] Missing header in: /Users/dermotburke/Code/brave-fork/instrumentation/kafka-streams/src/main/java/brave/kafka/streams/TracingFixedKeyProcessorSupplier.java

even though the license header exists in all those files.

jeqo commented 1 year ago

Usually its caused because of year difference. You can run ./mvnw com.mycila:license-maven-plugin:format to generate proper value.

dermotmburke commented 1 year ago

Thanks for the pointer @jeqo - that seemed to solve the issue. I'm afraid I went down a bit of a rabbit hole with the PR. The upgrade to kafka version 3.4.0 broke the integration tests using https://github.com/charithe/kafka-junit. Unfortunately, there does not seem to be a version of https://github.com/charithe/kafka-junit that supports kafka version 3.4.0.. To solve the issue I tried replacing https://github.com/charithe/kafka-junit with https://github.com/salesforce/kafka-junit. This seemed to work fine for all tests except 2 in the brave.kafka.clients.ITKafkaTracing class (poll_creates_one_consumer_span_per_extracted_context and poll_creates_one_consumer_span_per_topic). Not entirely sure how to work around the issue - any help would be greatly appreciated :)

jeqo commented 1 year ago

Replacing that dependency hasn't been easy in the past, and may require further work. I have requested an upgrade to kafka-junit: https://github.com/charithe/kafka-junit/pull/69. It usually gets merged quite fast and we can use the latest version. Hopefully this remove the blockers.

dermotmburke commented 1 year ago

Thanks @jeqo - I would like to keep the surface area of the change down so I think waiting makes sense. I'll keep an eye on your requested change. Thanks again

jeqo commented 1 year ago

PR merged, new release on its way: https://github.com/charithe/kafka-junit/releases/tag/kafka-junit-4.2.4

dermotmburke commented 1 year ago

Great - thanks @jeqo. I've marked the PR as ready for review

https://github.com/openzipkin/brave/pull/1367

jeqo commented 1 year ago

@dermotmburke, promise to take a look this week. thanks!