apollographql / router

A configurable, high-performance routing runtime for Apollo Federation 🚀
https://www.apollographql.com/docs/router/

Implement HTTP metrics for Connectors #6067

Open pubmodmatt opened 2 days ago

pubmodmatt commented 2 days ago

Add standard metrics for connectors HTTP requests and responses:

Selectors are defined for:

The implementation instruments the HTTP client service rather than the connector service, because a single call to the connector service can result in multiple HTTP calls, so the metrics make more sense at the HTTP client level. One consequence: since the instrumentation types are generic over a single request/response type, we'll be unable to have any other instruments at the connector service level that use the same selectors. That should be fine, as the interesting data for connectors is at the HTTP level.

Since the HTTP client service is not public, this required making the telemetry plugin private. This in turn required some small changes to the plugin test harness, which was previously incompatible with private plugins.

To plan ahead for future Connectors beyond the HTTP REST API connector, the type names include Http where appropriate, so that additional types can be created later without conflict or confusion. For example, in addition to ConnectorHttpSelector, there might eventually be a ConnectorGrpcSelector. These need to be different types, because the Selector implementation has associated request and response types. These are HttpRequest and HttpResponse for the HTTP implementation, but would be something else for gRPC.
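As a rough illustration of why the selector types cannot be shared across protocols, here is a minimal sketch of a selector trait with associated request and response types. Everything in it is a simplified assumption made for illustration: the trait shape, the struct fields, the enum variants, and the GrpcRequest/GrpcResponse types are hypothetical and are not the router's actual definitions.

```rust
// Illustrative sketch only: simplified, hypothetical stand-ins,
// not the router's real Selector trait or request/response types.

/// A selector extracts an optional value from a request or a response.
/// The associated types tie each selector to exactly one protocol.
trait Selector {
    type Request;
    type Response;

    fn on_request(&self, request: &Self::Request) -> Option<String>;
    fn on_response(&self, response: &Self::Response) -> Option<String>;
}

// Hypothetical, heavily simplified HTTP types.
struct HttpRequest {
    method: String,
}

struct HttpResponse {
    status: u16,
}

/// Simplified stand-in for the HTTP selector named in the PR.
enum ConnectorHttpSelector {
    Method,
    ResponseStatus,
}

impl Selector for ConnectorHttpSelector {
    type Request = HttpRequest;
    type Response = HttpResponse;

    fn on_request(&self, request: &HttpRequest) -> Option<String> {
        match self {
            ConnectorHttpSelector::Method => Some(request.method.clone()),
            ConnectorHttpSelector::ResponseStatus => None,
        }
    }

    fn on_response(&self, response: &HttpResponse) -> Option<String> {
        match self {
            ConnectorHttpSelector::ResponseStatus => Some(response.status.to_string()),
            ConnectorHttpSelector::Method => None,
        }
    }
}

// Hypothetical gRPC types: a future connector would bring its own.
struct GrpcRequest {
    method_path: String,
}

struct GrpcResponse {
    status_code: i32,
}

/// Hypothetical future selector; it must be a separate type because its
/// associated Request/Response types differ from the HTTP ones.
enum ConnectorGrpcSelector {
    MethodPath,
    ResponseStatus,
}

impl Selector for ConnectorGrpcSelector {
    type Request = GrpcRequest;
    type Response = GrpcResponse;

    fn on_request(&self, request: &GrpcRequest) -> Option<String> {
        match self {
            ConnectorGrpcSelector::MethodPath => Some(request.method_path.clone()),
            ConnectorGrpcSelector::ResponseStatus => None,
        }
    }

    fn on_response(&self, response: &GrpcResponse) -> Option<String> {
        match self {
            ConnectorGrpcSelector::ResponseStatus => Some(response.status_code.to_string()),
            ConnectorGrpcSelector::MethodPath => None,
        }
    }
}
```

Because each implementation fixes its own Request and Response types, a single selector type could not serve both protocols without becoming generic itself, which is the reason for the protocol-specific names.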

Examples

Add a metric for the number of 404 response codes from a particular connector source API:

not.found.count:
  value: unit
  type: counter
  unit: count
  description: "Count of 404 responses from the user API"
  condition:
    all:
      - eq:
        - 404
        - connector_http_response_status: code
      - eq:
        - "user_api"
        - connector_source: name
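To make the intended semantics of this configuration concrete, the following sketch models how such a counter would behave, assuming that `all` requires every nested `eq` to hold and that `value: unit` adds one per matching HTTP call. The ConnectorCall struct and both functions are hypothetical and exist only for this illustration; they are not part of the router.

```rust
// Illustrative sketch only: hypothetical types modelling the config above,
// not the router's condition evaluation code.

/// Hypothetical snapshot of what the two selectors yield for one HTTP call.
struct ConnectorCall {
    source_name: &'static str, // value of `connector_source: name`
    response_status: u16,      // value of `connector_http_response_status: code`
}

/// Models the `all` condition: every nested `eq` must hold.
fn matches_condition(call: &ConnectorCall) -> bool {
    let conditions = [
        call.response_status == 404,    // eq: [404, response status]
        call.source_name == "user_api", // eq: ["user_api", source name]
    ];
    conditions.iter().all(|&holds| holds)
}

/// Models a `value: unit` counter: add one for each matching call.
fn count_not_found(calls: &[ConnectorCall]) -> u64 {
    calls.iter().filter(|call| matches_condition(call)).count() as u64
}
```

Under these assumptions, a 404 from a different source, or a non-404 response from user_api, leaves the counter unchanged.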

Add connector attributes to the http.client.request.duration instrument:

http.client.request.duration:
  attributes:
    subgraph.name: true
    connector.source:
      connector_source: name
    connector.http.method: true
    connector.url.template: true

Create a histogram of the remaining rate limit from an API based on a response header value:

rate.limit:
  value:
    connector_http_response_header: "x-ratelimit-remaining"
  unit: count
  type: histogram
  description: "Rate limit remaining"
  condition:
    eq:
      - "user_api"
      - connector_source: name

Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

Exceptions

Notes

[^1]: It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.
[^2]: Configuration is an important part of many changes. Where applicable please try to document configuration examples.
[^3]: Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

bnjjj commented 1 day ago

cc @BrynCooke I need your review here too