http4s / http4s

A minimal, idiomatic Scala interface for HTTP
https://http4s.org/
Apache License 2.0
2.52k stars 789 forks source link

Metrics middleware: compliance with OpenTelemetry spec #7448

Open iRevive opened 1 month ago

iRevive commented 1 month ago

Context

OpenTelemetry spec has well-defined HTTP semantic conventions, such as the naming of the metrics, required attributes, etc. http4s existed way before OpenTelemetry was established and, moreover, the HTTP semantic conventions became stable recently.

Problems

1. Missing attributes

The MetricsOps interface does not provide enough request details to fill out all the necessary attributes.

For example, http.server.request.duration requires quite a lot of attributes, while we provide only a few: Name Example Requirements Does MetricsOps provide it
http.request.method GET, POST Required
url.scheme httphttps Required  ❌
error.type java.net.UnknownHostException Required If request has ended with an error  ❌
http.response.status_code 200, 404 Conditionally Required If and only if one was received/sent.
http.route /users/:userID?, {controller}/{action}/{id?} Conditionally Required If and only if it’s available  ❌
network.protocol.name httpspdy Conditionally Required  ❌
network.protocol.version 1.01.123 Recommended  ❌
server.address example.com10.1.2.80 Opt-In
server.port 808080 Opt-In

There is a similar story with other metrics too.

2. Unnecessary metrics

For example, we have our own http.server.abnormal_terminations metric to track failed requests. If we populate error.type in the http.server.request.duration metric, it's enough to identify failing requests.

Questions

Should we follow the OpenTelemetry spec?

We may not be able to provide all the required attributes.

On the other hand, OpenTelemetry requires a lot of details. So the MetricsOps interface should be generic enough to plug any other third-party tool (e.g. Datadog, Prometheus, etc).

How to deal with experimental HTTP metrics?

For example, http.server.request.body.size is valuable. However, since it's experimental, the requirements may change over time: e.g., some attributes could be renamed or become mandatory.

On the other hand, the standalone modules, such as http4s-otel4s-middleware and http4s-prometheus-metrics, provide the actual implementation.

So, the MetricsOps should provide enough details to fill out the metrics, and it's up to the implementation to decide whether to use a specific metric.

What about the 0.23 series?

These changes are breaking in many ways, and I'm unsure if we can backport them in a binary-compatible way. We can still make MetricsOps2 and Metrics2 though.

iRevive commented 1 month ago

What's your opinion on extending MetricsOps to provide enough details so the http4s-otel4s-middleware can provide OpenTelemetry-compliant metrics?

ybasket commented 1 month ago

What's your opinion on extending MetricsOps to provide enough details so the http4s-otel4s-middleware can provide OpenTelemetry-compliant metrics?

Strongly in favor of working towards OTEL compatibility, that's a big win these days. Thank you for your initiative (and work on otel4s)!

MetricOps2 doesn't sound too bad for 0.23, especially as it should be a "superset" of MetricOps, so one can convert one-way, or am I missing something?