apollographql / router

A configurable, high-performance routing runtime for Apollo Federation 🚀
https://www.apollographql.com/docs/router/

Metrics: Configurable histogram bucket dimensions #2333

Closed klippx closed 1 year ago

klippx commented 1 year ago

Is your feature request related to a problem? Please describe. I want to be able to adapt histogram metrics to better suit the reality of my supergraph.

Describe the solution you'd like It should be possible to configure histogram bucket dimensions for http_request_duration_seconds. The default ones don't work well for us: as a minor point, we want 0.25 instead of 0.2 and 0.3, and we also need 20 as the largest bucket (unfortunately, we have very slow dependencies that can take over 10 seconds, and those get "clipped" at 10 seconds now, so we can't see how slow they really are).
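To make the "clipping" concrete, here is a minimal sketch of how cumulative (Prometheus-style) histogram buckets count observations. The function name `cumulativeCounts`, the sample durations, and both boundary lists are assumptions chosen for illustration; they are not the router's actual defaults or code.

```typescript
// Cumulative histogram in the Prometheus style: each bucket counts the
// observations that are less than or equal to its upper bound ("le").
function cumulativeCounts(
  bounds: number[],
  observations: number[],
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const bound of bounds) {
    counts.set(String(bound), observations.filter((v) => v <= bound).length);
  }
  // The implicit +Inf bucket always holds every observation.
  counts.set("+Inf", observations.length);
  return counts;
}

// Hypothetical request durations in seconds, including two slow dependencies.
const durations = [0.22, 0.27, 4, 12, 18];

// Illustrative boundaries only, not the router's real default list.
const cappedAtTen = cumulativeCounts([0.2, 0.3, 10], durations);
const withTwenty = cumulativeCounts([0.25, 10, 20], durations);

console.log(cappedAtTen, withTwenty);
```

With the list capped at 10, the 12 s and 18 s requests only show up in the +Inf bucket, so all that can be said is that they took longer than 10 s. With a 20 bucket, both fall inside a finite bucket, and a 0.25 boundary separates the 0.22 s and 0.27 s requests.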

As a parallel, the OpenTelemetry JS metrics SDK (@opentelemetry/sdk-metrics) lets you programmatically configure the histogram:

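  // These classes are exported by @opentelemetry/sdk-metrics (1.x API).
  import {
    ExplicitBucketHistogramAggregation,
    MeterProvider,
    View,
  } from '@opentelemetry/sdk-metrics';

  // Create the histogram bucket boundaries (in milliseconds)
  const aggregation = new ExplicitBucketHistogramAggregation([
    50, 100, 250, 500, 1000, 2500, 5000, 10000, 20000,
  ]);
  // "Pin" each individual metric to the chosen bucket boundaries
  // (`resource` is assumed to be defined elsewhere)
  const meterProvider = new MeterProvider({
    resource,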
    views: [
      // Give HTTP auto-instrumentation some bucket tuning:
      // See https://github.com/open-telemetry/opentelemetry-js/blob/main/experimental/packages/opentelemetry-instrumentation-http/src/http.ts
      new View({
        instrumentName: 'http.client.duration',
        aggregation,
      }),
      new View({
        instrumentName: 'http.server.duration',
        aggregation,
      }),
    ],
  });

☝️ Both the bin sizes and the number of bins are configurable, which is great. Can you do something similar to that?

Maybe something like:

telemetry:
  metrics:
    common:
      http_request_duration_seconds:
        buckets:
        - 0.05
        - 0.10
        - 0.25
        - 0.50
        - 1.00
        - 2.50
        - 5.00
        - 10.00
        - 20.00

I'm not sure how best to add this, but if more metrics are added in the future, they could be configured the same way (by named reference) without breaking the YAML schema.
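Whatever shape the configuration ends up taking, a user-supplied bucket list would likely need some sanity checks before use. The following is a sketch under that assumption: `validateBuckets` is a hypothetical helper written for illustration, not something the router provides, and the rules it enforces (non-empty, finite, positive, strictly increasing) are a reasonable guess rather than a documented requirement.

```typescript
// Hypothetical validator for a configured list of bucket upper bounds
// (in seconds). Not part of the router; rules are assumptions.
function validateBuckets(buckets: number[]): number[] {
  if (buckets.length === 0) {
    throw new Error("at least one bucket boundary is required");
  }
  buckets.forEach((bound, i) => {
    if (!Number.isFinite(bound) || bound <= 0) {
      throw new Error(`bucket ${i} must be a positive, finite number`);
    }
    if (i > 0 && bound <= buckets[i - 1]) {
      throw new Error(`bucket ${i} must be greater than bucket ${i - 1}`);
    }
  });
  return buckets;
}

// The boundaries proposed in the YAML above pass the checks.
const proposed = validateBuckets([0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, 20]);

// An out-of-order list is rejected.
let rejected = false;
try {
  validateBuckets([1, 0.5]);
} catch {
  rejected = true;
}

console.log(proposed.length, rejected);
```

Rejecting unsorted or duplicate boundaries up front avoids histograms whose buckets overlap or are empty by construction.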

Describe alternatives you've considered Live with the pre-configured buckets.

Additional context None.

marc-barry commented 1 year ago

We have switched from Apollo Server to Apollo Router, and this is something that is missing for us. It is very hard to "guess" histogram buckets, since the right boundaries depend heavily on backend response times.

bnjjj commented 1 year ago

For now we're waiting for PR #2358 to land before tackling this :)

klippx commented 1 year ago

#2358 is merged. Is this work going to start in an upcoming sprint?

bnjjj commented 1 year ago

I just opened a PR, sorry for the delay.

abernix commented 1 year ago

We believe there's a release coming out tomorrow, so this may also make it into that version assuming it makes it through review, etc.!