logstash-plugins / logstash-input-kafka

Kafka input for Logstash
Apache License 2.0

Add support for specifying "interceptor.classes" setting #139

Open consulthys opened 8 years ago

consulthys commented 8 years ago

It would be nice to be able to specify the interceptor.classes consumer config setting. The idea is to hook the Logstash Kafka input plugin (i.e. a Kafka consumer) into a monitoring system such as the Confluent Control Center (CCC).

The only unknown at this point is how to package the following dependency into this plugin as transparently as possible.

    <dependency>
        <groupId>io.confluent</groupId>
        <artifactId>monitoring-interceptors</artifactId>
        <version>3.0.1</version>
    </dependency>

For people using the CCC, the Message Delivery chart will only show up if both the producers and the consumers of a given topic are being monitored, since the goal is to illustrate the gap between produced and consumed messages. So if an application produces messages into a Kafka topic that is consumed by Logstash, the CCC will only be useful if the kafka input plugin can be instrumented via interceptor.classes; otherwise the CCC will not be aware of the message consumption on that topic.
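
For readers unfamiliar with the setting, here is a minimal sketch of what passing interceptor.classes to a consumer amounts to at the Java level. It uses only java.util.Properties, so it runs without the Kafka client on the classpath. The interceptor.classes key is the standard Kafka consumer config name; the interceptor class name, the broker address, and the group id are assumptions made for illustration (the class name is what the Confluent monitoring-interceptors artifact is expected to provide), and the helper consumerProps is hypothetical, not part of the plugin.

```java
import java.util.Properties;

public class InterceptorConfigSketch {
    // Standard Kafka consumer config key for the interceptor list.
    static final String INTERCEPTOR_CLASSES = "interceptor.classes";

    // Assumption: consumer-side interceptor class shipped in the
    // io.confluent:monitoring-interceptors artifact.
    static final String CONFLUENT_CONSUMER_INTERCEPTOR =
        "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor";

    // Hypothetical helper: builds the properties a consumer would be created with.
    // The interceptor key is only set when at least one class name is given.
    static Properties consumerProps(String bootstrapServers, String groupId,
                                    String... interceptors) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        if (interceptors.length > 0) {
            // Kafka expects a comma-separated list of fully qualified class names.
            props.setProperty(INTERCEPTOR_CLASSES, String.join(",", interceptors));
        }
        return props;
    }

    public static void main(String[] args) {
        // Placeholder broker address and group id.
        Properties props = consumerProps("localhost:9092", "logstash",
                                         CONFLUENT_CONSUMER_INTERCEPTOR);
        System.out.println(props.getProperty(INTERCEPTOR_CLASSES));
    }
}
```

In the plugin, the equivalent would presumably be a new option whose value is copied into the consumer properties in the same way.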

Maybe someone has a better idea on how to hook up and monitor a Logstash Kafka consumer via interceptor classes (or any other means).

consulthys commented 7 years ago

@ph do you have any thoughts about this?

talevy commented 7 years ago

@consulthys, not sure anyone on the LS team has experience using the Confluent Control Center. If you wish to introduce this functionality into the plugin, that would be great!

consulthys commented 7 years ago

Thanks for dropping by @talevy ! The CCC is just one possible candidate which could benefit from the interceptor.classes consumer config setting. Read more about the genesis of that feature

Time allowing, I'll have a look at how this could be done and eventually submit a PR.

talevy commented 7 years ago

thanks!

consulthys commented 7 years ago

I've managed to make this work. The only problem is that the additional JAR that needs to be included/vendored isn't available on Maven Central (only on Confluent's repo at http://packages.confluent.io/maven/), which means that rake install_jars won't find it.

To make it work, I've added the Confluent repo to my settings.xml file, but this will work neither for other people nor for the automated build process. Does anyone have an idea how to work around this?

ph commented 7 years ago

@consulthys Nothing prevents us from using gradle here to manage that dependency.

consulthys commented 7 years ago

@ph thanks for your answer. So you mean switching to gradle instead of rake to manage the dependent JAR libraries in this plugin?

ph commented 7 years ago

@consulthys Yup, gradle already allows you to manage multiple maven repos out of the box. If you look at the beats plugin, this is what we use.

There is only one caveat: you need to make sure that bundle exec rake vendor triggers the right gradle task to package the jars. This is because our publish bot only knows about that task and nothing about gradle. If you look at the beats Rakefile you will see how we do it.

consulthys commented 7 years ago

Thanks for the guidance @ph, I'll have a look and see how this can be done.

consulthys commented 7 years ago

So we have a Spring Integration application that produces bulk ES messages into a Kafka topic and a Logstash process that consumes that topic. Both the Java producer and the Logstash consumer have the interceptor.classes setting enabled so they can send monitoring metrics to the Confluent Control Center.

Here is a quick glimpse at what this new feature enables: real-time topic production/consumption monitoring ;-)

[Image: screenshot taken 2017-06-13 at 09:36:48]

consulthys commented 7 years ago

One issue I see with this, though, is that even if we allow anyone to specify this setting, they can only reference a Java class that is already packaged with the plugin (currently only the Confluent monitoring-interceptors package). This makes the feature a bit awkward to use.

Does anyone have any idea how to work around this limitation?
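
To make the limitation concrete, here is a small sketch of why only pre-packaged classes can be referenced. As far as I understand, the Kafka client instantiates interceptors by class name via reflection, so the name must resolve on the classpath the plugin runs with. The isLoadable helper and the com.example class name below are hypothetical and only illustrate the lookup; this is not how the plugin or the Kafka client is actually written.

```java
public class InterceptorLookupSketch {
    // Hypothetical helper: returns true if the named class can be resolved
    // by the class loader that loaded this sketch, false otherwise.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only resolve the class, do not run static initializers.
            Class.forName(className, false,
                          InterceptorLookupSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A class that ships with the JDK is always resolvable.
        System.out.println(isLoadable("java.lang.String"));
        // A user-supplied interceptor that was never vendored is not
        // (hypothetical class name).
        System.out.println(isLoadable("com.example.hypothetical.MyInterceptor"));
    }
}
```

A check like this could at least let the plugin fail early with a clear message when a configured interceptor is not on its classpath, though it does not solve the packaging question itself.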