DataDog / kafka-kit

Kafka storage rebalancing, automated replication throttle, cluster API and more
Apache License 2.0

Prometheus support #197

Open saada opened 5 years ago

saada commented 5 years ago

Great to see support for Datadog API. Supporting Prometheus would increase adoption and make the metrics source pluggable.

jamiealquiza commented 5 years ago

The metrics backend is pluggable, but no other support has been written yet. I think Prometheus would be the most desirable addition.

williamhammond commented 5 years ago

Has work been started on this?

jamiealquiza commented 5 years ago

No work has been started, but I do have an idea for the metricsfetcher tool (in case that's the one we care about here). Rather than produce a sprawling array of tools, I want to refactor this one with Cobra and support each provider (Datadog, Prometheus, etc) as a sub command since there will probably be a lot of config to encapsulate for each.

If somebody has a test environment with a Prometheus API I could access (where Kafka brokers are actually being monitored), it could help get the development of this rolling.

In the meantime, ad-hoc tools will work as well and shouldn't be difficult to script. No matter how you're sourcing metrics, if you can write them to ZooKeeper in the format described in the metricsfetcher README, topicmappr can use them for storage-based partition assignment.

williamhammond commented 5 years ago

I'd like Prometheus support soonish. Mind if I take a shot at it?

jamiealquiza commented 5 years ago

@williamhammond Sure thing. The easiest route would be to make a standalone tool like metricsfetcher that grabs your Prometheus metrics and writes them into ZK according to the format described in the metricsfetcher README.

vrischmann commented 5 years ago

Hello,

We started using kafka-kit at Batch (well, topicmappr actually) and needed a way to get Prometheus metrics from our brokers, so I wrote a replacement metricsfetcher. It's nothing fancy but does the job, and I figured some of you here might be interested.

I would also like to experiment with autothrottle, but the Datadog handler is hardcoded and I don't see a way around that, so I'll have to use an internal build that replaces the handler.

jamiealquiza commented 5 years ago

@vrischmann awesome! Having community alternatives of metricsfetcher is great. Eventually, I can better organize this repo to include them or at least link to them.

The autothrottle metrics are backed by a Datadog implementation of the kafkametrics.Handler interface. We'd add additional implementations in the kafkametrics library and replace that entry in autothrottle with a switch on the implementation type. If you do end up working on a Prometheus implementation of the Handler interface, I can update autothrottle so the implementation is selected via configuration.
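The switch described above might be sketched as follows. The `Handler` interface here is a hypothetical stand-in with made-up methods, not the real kafkametrics.Handler definition, and both backends are stubs.

```go
package main

import (
	"errors"
	"fmt"
)

// Handler is a hypothetical metrics backend interface used only for this
// sketch. It is NOT the actual kafkametrics.Handler.
type Handler interface {
	Name() string
	// BrokerTxRates would return outbound network throughput per broker ID.
	BrokerTxRates() (map[int]float64, error)
}

var errNotImplemented = errors.New("not implemented in this sketch")

// datadogHandler is a stub standing in for the existing Datadog backend.
type datadogHandler struct{}

func (datadogHandler) Name() string                            { return "datadog" }
func (datadogHandler) BrokerTxRates() (map[int]float64, error) { return nil, errNotImplemented }

// prometheusHandler is a stub for a possible Prometheus backend.
type prometheusHandler struct{ url string }

func (prometheusHandler) Name() string                            { return "prometheus" }
func (prometheusHandler) BrokerTxRates() (map[int]float64, error) { return nil, errNotImplemented }

// newHandler selects an implementation from a configuration value.
func newHandler(backend, url string) (Handler, error) {
	switch backend {
	case "datadog":
		return datadogHandler{}, nil
	case "prometheus":
		return prometheusHandler{url: url}, nil
	default:
		return nil, fmt.Errorf("unsupported metrics backend %q", backend)
	}
}

func main() {
	for _, backend := range []string{"datadog", "prometheus", "cloudwatch"} {
		h, err := newHandler(backend, "http://localhost:9090")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("selected backend:", h.Name())
	}
}
```

Because callers only see the interface, adding a backend would mean adding one type and one `case`, with no changes to the throttling logic itself.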

vrischmann commented 5 years ago

Got it. I'll make a PR implementing kafkametrics.Handler if I end up working on it and have something that works.

tarvip commented 4 years ago

Following @vrischmann's example, I wrote a similar metricsfetcher. The main difference is that it fetches data directly from Prometheus.

jamiealquiza commented 4 years ago

Nice work, I'll get a 3rd party tools reference going and include this.

ls-serge-sozonoff commented 4 years ago

I have completed the basics of a metricsfetcher implementation with AWS CloudWatch support. Has anything been done yet on the switch for choosing an implementation?

leosunmo commented 4 years ago

@tarvip's tool is excellent! I successfully ran it against a large cluster and rebuilt with topicmappr, and it worked great. Make sure you add this tool in a prominent place in the README; I was lucky to stumble upon it here.

jamiealquiza commented 4 years ago

> I have completed the basics of a metricsfetcher implementation with AWS CloudWatch support. Has anything been done yet on the switch for choosing an implementation?

Because the tool is so simple, I think it makes sense to just offer a variety of standalone options and link them in the metricsfetcher README. I'm adding this section shortly, so if you have something you'd like added, I'll gladly do so!

jamiealquiza commented 4 years ago

I forgot to reference this issue in the PR, but I started this section here: https://github.com/DataDog/kafka-kit/tree/master/cmd/metricsfetcher#third-party-variations