petabridge / phobos-issues

Public issues and bug tracker for Phobos®
https://phobos.petabridge.com/
2 stars 1 forks source link

Error when reporting to App Metrics statsd to DataDog #15

Closed yohanb closed 3 years ago

yohanb commented 3 years ago

Hi, not sure if the bug is related to Phobos, AppMetrics or Dogstatsd but we are getting this error:

Jan 30 20:19:45 *** agent[3330389]: 2021-01-30 20:19:45 EST | CORE | ERROR | (pkg/dogstatsd/server.go:438 in errLog) | Dogstatsd: error parsing metric message '"Akka.NET.akka.messages.recv.meter.value:606|m"': invalid metric type: "m"

Thanks!

Aaronontheweb commented 3 years ago

Hi Yohan,

We have some experimental DogStatsD support added here: https://github.com/petabridge/Petabridge.Phobos.Web.DataDog/pull/11 - and we don't appear to run into this error.

Is this issue coming up when you're using the App.Metrics.StatsD reporter?

yohanb commented 3 years ago

@Aaronontheweb yeah it was with the statsd reporter.

Aaronontheweb commented 3 years ago

ok, thanks @yohanb - we're working on submitting some other patches to the StatsD reporter for App.Metrics right now. I'll open up an issue over there that references this one and let you know when we have a fix.

Arkatufus commented 3 years ago

@yohanb Can I ask what metrics server you're using right now? According to the spec, DogStatsD should support the meter metrics (m)

Arkatufus commented 3 years ago

@yohanb if you're using veneur, then yes, you will get this error. Veneur doesn't support meter metrics.

yohanb commented 3 years ago

@Arkatufus nah I was getting this error with datadog's agent service.

Arkatufus commented 3 years ago

@yohanb I just looked through datadog source code, it turns out that they don't support their own StatsD spec.

This is the list of metrics that I use as a guideline when I code the App.Metrics StatsD exporter: https://github.com/DataDog/dogstatsd-csharp-client/blob/master/src/StatsdClient/Serializer/MetricSerializer.cs#L10 as you can see, meter metrics is listed there.

These are the metrics that are supported by datadog agent: https://github.com/DataDog/datadog-agent/tree/master/pkg/metrics as you can see, meter.go is missing from the file list, its not even listed in the README.md

So my only suggestion is that, for now, don't use any meter metrics if you're exporting to datadog via statsd.

Arkatufus commented 3 years ago

Here's a way to filter all meters from reporting by modifying the configuration:

.Report.ToStatsDUdp(b => {
    b.SocketSettings.Address = {server address};
    b.SocketSettings.Port = {server port};
    b.StatsDOptions.MetricNameFormatter = new DefaultDogStatsDMetricStringSerializer();
    b.Filter = new MetricsFilter().WhereType(MetricType.Apdex, MetricType.Counter, MetricType.Gauge, MetricType.Histogram, MetricType.Timer);
})
yohanb commented 3 years ago

thanks @Arkatufus !

Aaronontheweb commented 3 years ago

So we're currently using meters inside Phobos to support the akka.actor.recv.msgType metrics. I can actually see those values show up when I run https://github.com/petabridge/Petabridge.Phobos.Web.DataDog/pull/11

We're working with another Phobos customer that uses DataDog and we're seeing this issue with them too - it looks like meter metrics, once the traffic volume gets high enough, aren't being recorded. I'll do some more digging, but I'd like to keep this issue open so we can maintain a record of it.

Aaronontheweb commented 3 years ago

This might be an issue with the App.Metrics.StatsD library.

It looks like DogStatsD doesn't formally include any definition of a meter - and in fact that's not part of the canonical open source StatsD specification: https://github.com/statsd/statsd/blob/master/docs/metric_types.md

However, you can see in the DataDog documentation that they have an internal type available only via their web API, called a "Rate" which performs the exact same function as a meter: https://docs.datadoghq.com/developers/metrics/types/?tab=count

Inside the App.Metrics.DataDog library, which bypasses DogStatsD to talk to DataDog's HTTP API directly they transforms meters directly into rate objects:

The following metric submission types are accepted:

COUNT RATE GAUGE SET HISTOGRAM DISTRIBUTION

These different metric submission types are mapped to four in-app metric types found within the Datadog web application:

COUNT RATE GAUGE DISTRIBUTION

There's also a table at the bottom that explains the conversion between a DogStatsD accepted metric type and the internal DataDog type https://docs.datadoghq.com/developers/metrics/types/?tab=rate#submission-types-and-datadog-in-app-types

DogStatsD dog.count(...) COUNT RATE
DogStatsD dog.increment(...) COUNT RATE
DogStatsD dog.decrement(...) COUNT RATE

So it looks like inside App.Metrics we need to convert all meter metrics to counters inside the DogStatsD formatter, in order to resolve this problem.

Aaronontheweb commented 3 years ago

I think we need to just use the "increment" functions here in order to turn counters into rates in DataDog:

https://docs.datadoghq.com/developers/metrics/dogstatsd_metrics_submission/#count

Arkatufus commented 3 years ago

Right now, in the AppMetrics implementation, all unsopported metrics are converted to Gauge. I can always add an exception and convert Meter to Count

Aaronontheweb commented 3 years ago

Which metrics get reset upon flush in the App.Metrics implementation?

Arkatufus commented 3 years ago

That would depend on how App.Metrics treat the metric data, all the reporter and formatter class does is take whatever bundle of data that got passed to it by the core classes and convert it into the proper output format, ie. it doesn't really process the flush.

Aaronontheweb commented 3 years ago

So here's the mapping in the App.Metrics.DataDog library: https://github.com/AppMetrics/AppMetrics/blob/86200b959356c54e83db965cab133f4162ce1347/src/Reporting/src/App.Metrics.Formatting.Datadog/Internal/DatadogSyntax.cs#L45

And where that data actually gets formatted

https://github.com/AppMetrics/AppMetrics/blob/86200b959356c54e83db965cab133f4162ce1347/src/Reporting/src/App.Metrics.Formatting.Datadog/DefaultDatadogMetricJsonWriter.cs#L80-L83

So I think we can just do the same - convert the data included inside the meter type into an equivalent StatsD counter type and adapt it to the type of output we expect. Should we create a matching issue on AppMetrics?

Arkatufus commented 3 years ago

I'll look into it, see if we can emulate the same behaviour for StatsD

Aaronontheweb commented 3 years ago

This has been resolved via https://github.com/AppMetrics/AppMetrics/issues/668