enonic / xp

Enonic XP
https://enonic.com
GNU General Public License v3.0
201 stars 34 forks source link

Use micrometer for collecting metrics #7953

Open gbbirkisson opened 4 years ago

gbbirkisson commented 4 years ago

To be more flexible when it comes to metrics I would like to suggest that we use micrometer.io to collect metrics.

Micrometer provides an abstraction that makes it easy to expose metrics to multiple monitoring services, i.e. datadog, prometheus, elastic and many more.

We currently use Dropwizard to collect metrics, and micrometer can easly wrap that. So the change we need to make to XP now is not so big.

If we implement a "metrics service" with micrometer it would enable developers to add their own custom metrics. In addition we could create apps that ship metrics to all these different monitoring services.

This would also help with pod autoscaling in the cloud since prometheus is the standard for kubernetes cluster and we need to setup internal cluster metrics to enable that feature.

sigdestad commented 4 years ago

Sounds like a plan, but why not start by implemeting it as an app, then it can be added to core when it is mature?

rymsha commented 4 years ago

This is not an app. App has no access to internals which micrometer needs

sigdestad commented 4 years ago

How do we currently control that?

rymsha commented 4 years ago

It is on different jetty port and each subsystem reports what it wants to report.

sigdestad commented 4 years ago

I think I understand the purpose of the task, but I am not sure if I understand "how" we should do this? What will be the result of such a change? replacing the current API? Something else?

rymsha commented 4 years ago

That is a good question. Sadly com.codahale.metrics is part of our core java API. So removing it would be a breaking change. (we wait with this til 8)

Micrometer is not a drop-in replacement either. I have done it before (as many others who migrated from spring boot 1 to spring boot 2) - it was pain.

Good news we use com.codahale.metrics only for collecting stats about Events and (in upcoming release, if we don't remove it) stats from Jetty. Some other simple endpoints as well (threaddump for instance), but they are easy to replicate.

Micrometer is usually not an endpoint (although one can expose collected metrics via http). Micrometer usually "ships" metrics to metrics registry (datadog, Elasticsearch, etc) so it is configuration of XP how often to push metrics and where. While dropwizard collects metrics "silently" and waits for monitoring systems to pull them.

In my mind this is an epic which consists:

gbbirkisson commented 4 years ago

So removing it would be a breaking change

Are you sure? Because micrometer provides a wrapper for that, we might not have to break anything.

Micrometer is usually not an endpoint

That depends on the monitoring system you use. I think we might be able to do the exact thing we are doing today with micrometer.

I would say the epic is:

To sum up: All systems send metrics to the Composite registry. Composite Registry has children registries:

This way both the subsystems creating metrics and the Component registry do not care about how the metrics are shipped, and people can choose how they ship all of their metrics. The can even choose multiple methods.

sigdestad commented 4 years ago

I like this approach, especially cool if we could have apps for the various "target" metric systems.

rymsha commented 4 years ago

Are you sure?

Yes. Micrometer has DropwizardMeterRegistry wrapper. So one can use dropwizard api, like we do modules/core/core-api/src/main/java/com/enonic/xp/util/Metrics.java
But keeping two metrics APIs is confusing (maybe OK for 7.x, but not for 8.x) Keeping only old one is limiting (tagging conventions fro micrometer are different). Removing old one is breaking.

Change all metric collection code to send metrics to the new Composite registry

This is where it starts to be a problem. Dropwizard is "hardcoded" in our API.

Add a Dropwizard registry as a child to the Composite Registry and expose the metrics with that on our old metrics endpoint.

I found only Events using that metrics internally. What customers expose via API, I'm not sure. Maybe better to have a separate endpoint /metrics and leave /status alone

Prometheus (App on market): Exposes metrics on :2609/prometheus

I don't think it is possible (need to check). 8080 is ok.

gbbirkisson commented 4 years ago

I don't think it is possible (need to check). 8080 is ok.

It is possible, I did it when I first started in Enonic.

@Component
public class PrometheusReporter
    implements StatusReporter
{

    private CollectorRegistry registry;

    @Override
    public String getName()
    {
        return "micrometer.prometheus";
    }

    @Override
    public MediaType getMediaType()
    {
        return MediaType.PLAIN_TEXT_UTF_8;
    }

    @Override
    public void report( OutputStream stream )
        throws IOException
    {
        Writer w = new OutputStreamWriter( stream );
        TextFormat.write004( w, registry.metricFamilySamples() );
        w.close();
    }

    @Reference
    public void setRegistry( final PrometheusRegistry registry )
    {
        this.registry = registry.getPrometheusRegistry();
    }

}
rymsha commented 4 years ago

It is possible, I did it when I first started in Enonic.

Ah, you are right. I thought :2609 always starts with /status. (I just recently implemented Hazelcast "status", that's probably why)

This way both the subsystems creating metrics and the Component registry do not care about how the metrics are shipped, and people can choose how they ship all of their metrics

Sounds great on paper, doesn't always work in practice: https://github.com/micrometer-metrics/micrometer/issues/587 It is fixed for now (and in 7.x we are probably safe), but will appear in future again. So DropwizardRegistry should be replaced with SimpleMeterRegistry in 8.x to prevent version locking.

sigdestad commented 4 years ago

Let's do it!

rymsha commented 4 years ago

8086 is currently a blocker to make this as an app