IBM / sarama

Sarama is a Go library for Apache Kafka.
MIT License
11.6k stars 1.76k forks source link

Expose metrics using go-metrics #683

Closed slaunay closed 8 years ago

slaunay commented 8 years ago
Versions

Sarama Version: v1.9.0 Kafka Version: Go Version:

Problem Description

On the contrary to the Java client, no internal metrics are provided in order to monitor or optimize the performances (e.g. latency, throughput) like:

Some of those metrics are keys for tuning a Kafka producer over long fat networks or just benchmarking a consumer without relying on the broker metrics.

I was thinking of exposing a few useful metrics to the producer through Richard Crowley's Go port of Coda Hale's Metrics library. That library do not have transitive dependencies but provides built-in stats reporter (Graphite, OpenTSDB) as well as third party ones (e.g. InfluxDB) and can be disable using UseNilMetrics variable.

The idea would be to add a new public MetricRegistry Registry field to the Config struct that would defaults to metrics.DefaultRegistry and enrich it with metrics using similar name that the official documentation.

What do you think, any ideas on which metrics needs to be exposed first?

eapache commented 8 years ago

Sounds reasonable to me. I would start with a few easy ones as a proof of concept (for example bytes per producer request). Some of the others (such as ms spent in channels) would require tracking much more internal timing information, so I'd prefer to start with something less invasive.

slaunay commented 8 years ago

Sounds good, I'll try to work on a PR so we can discuss on the details.

hasnickl commented 8 years ago

maybe out of scope here but prometheus is also a popular metrics sink and being able to configure for either go-metrics or prometheus metrics would be useful

slaunay commented 8 years ago

From what I have read, Prometheus provides a time series database and client integration. It looks like the model is a bit different and requires sample information rather than summarized information that Dropwizard/go-metrics typically expose to reporters.

I believe that it is possible to publish metrics from go-metrics into Prometheus using their go integration but at the cost of losing data precision when using something else than a gauge or a counter.

It is an issue for Dropwizard's metrics library as well that might get tackle in future version as far as I know: https://github.com/dropwizard/metrics/issues/712

Richard Crowley's go-metrics library provides a simple yet powerful abstraction to publish metrics to most popular time series databases. I am not sure if adding an abstraction in between to support both Prometheus (or other time series database) and go-metrics is worth the added complexity over a limited yet simple Prometheus reporter plugged on top of go-metrics.

What do you think @eapache?

eapache commented 8 years ago

go-metrics provides the metrics abstraction I was kind of expecting. Unless someone provides a compelling argument that direct Prometheus integration provides substantial benefit I can't see it being worth the effort.

justinas commented 8 years ago

May I suggest defaulting to something like sarama.DefaultMetricRegistry rather than metrics.DefaultRegistry? I was surprised today as Sarama decided to flood the production Graphite instance with its own metrics.

Of course, this is first and foremost on me for both not pinning Sarama's version and relying on DefaultRegistry in my own app. But IMO one should have a reasonable expectation of a library not manipulating global registries like metrics.DefaultRegstry or http.DefaultServeMux, unless explicitly requested.

Although I understand it would be a breaking change to change this behavior now that it has landed.

slaunay commented 8 years ago

This is documented behavior: https://github.com/Shopify/sarama/blob/e03d23b5a4fa1a72c30eace0403d7c2c3b5de55e/config.go#L239

Using the global registry allows for easy access to the metrics and is the sensible default when using go-metrics but we provide a way to use a custom one which is probably what you ended up doing: https://godoc.org/github.com/Shopify/sarama#example-Config--Metrics

That being said this new feature is not part of an official release yet so changing it would not break backward compatibility, at least for people relying on http://gopkg.in/Shopify/sarama.v1.

eapache commented 8 years ago

I don't have a strong opinion here, but Sébastien is correct that we can still make breaking changes until I push a new major version with metrics. We only provide API stability guarantees between tagged versions.

I suspect it would be safer to default to a custom registry, on the same principle that libraries should not e.g. register global command-line flags or log to the global logger by default.

slaunay commented 8 years ago

I created #744 to switch to a local registry.

eapache commented 8 years ago

I believe this is now effectively done, please re-open if you still have work you want to track here.

slaunay commented 8 years ago

Sounds good, will submit separate PR for new metrics if necessary but the existing are really handy for monitoring and tuning producers.

Thanks for merging that feature @eapache.