spring-projects / spring-graphql

Spring Integration for GraphQL
https://spring.io/projects/spring-graphql
Apache License 2.0

Add metrics for subscriptions and WebSocket sessions #944

Open nlemoing opened 3 months ago

nlemoing commented 3 months ago

Hi! I'm trying to follow the Observability instructions to get GraphQL actuator metrics published. We're using a Datadog StatsD sidecar container, so I've added the following dependencies to my classpath:

I didn't do any additional configuration or create any additional beans.

To see what metrics were available, I set management.endpoints.web.exposure.include=* and went to /actuator/metrics, but didn't see any GraphQL metrics.

I have two questions:

sgrannan commented 2 months ago

@nlemoing We migrated from the graphql-kickstart project as well, and I could not find anything. I had to extend GraphQLWebSocketHandler, override afterConnectionEstablished and afterConnectionClosed to track sessions in my own ConcurrentHashMap, and expose the count through a custom Gauge:

// Simple Metric for active connections
Gauge.builder(AppConstants.CustomMetrics.WEBSOCKETS_ACTIVE_SESSIONS, this::getActiveSessions)
    .strongReference(true)
    .register(Metrics.globalRegistry);
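For readers who want to see the session-tracking half of this approach, here is a minimal JDK-only sketch. The class and method names (ActiveSessionTracker, connectionEstablished, connectionClosed, asGaugeSupplier) are hypothetical and not taken from the comment above or from Spring for GraphQL. The assumption is that the two callbacks get invoked from the handler's afterConnectionEstablished and afterConnectionClosed overrides, and that the supplier is what gets handed to Gauge.builder.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical helper (not part of Spring for GraphQL): tracks open
// WebSocket sessions by id so a gauge can report the current count.
final class ActiveSessionTracker {

    // Session id -> timestamp (ms) at which the connection was established.
    private final Map<String, Long> sessions = new ConcurrentHashMap<>();

    // Assumed to be called from the handler's afterConnectionEstablished override.
    void connectionEstablished(String sessionId) {
        sessions.putIfAbsent(sessionId, System.currentTimeMillis());
    }

    // Assumed to be called from the handler's afterConnectionClosed override.
    // Removing an unknown id is a no-op, so duplicate close events are harmless.
    void connectionClosed(String sessionId) {
        sessions.remove(sessionId);
    }

    int getActiveSessions() {
        return sessions.size();
    }

    // Supplier form, suitable for a gauge that polls the value on demand.
    Supplier<Number> asGaugeSupplier() {
        return this::getActiveSessions;
    }
}
```

Keying by session id rather than keeping a plain counter means a repeated open or close event for the same session cannot push the count out of sync.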

nlemoing commented 2 months ago

@sgrannan Thanks for the tip! I tried building my own metrics with WebSocketGraphQlInterceptor as well, but it seemed to under-report connection close events. I suspect that's because we were stress testing the server, and instances that were shutting down didn't report that their WebSocket connections were closing.

bclozel commented 2 months ago

Hello @sgrannan and @nlemoing

Thanks for this report and the discussion so far. I agree, this feature is currently missing in Spring for GraphQL and we should have a look at it. So far, we have instrumented the GraphQL engine with a graphql-java Instrumentation so that you'll get observations (timer metrics and traces) for requests and data fetching operations.

For session and subscription metrics, we can't depend on the Observation API because it only covers timers. For gauges and counters, we'll need the Metrics API and we must make that an optional dependency. We should look into providing a MeterBinder for active sessions, similar to what Micrometer provides for Netty allocators.

In the case of active WebSocket sessions, we could expose the information on the WebSocket handler directly and make the binder quite straightforward. As for the subscriptions count, it's a different story since subscriptions can happen over multiple transports (SSE, WebSocket, RSocket) - ideally, a single metric should count them all and they should be tagged by transport.
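To make the "single metric, tagged by transport" idea concrete, here is a rough JDK-only sketch of the bookkeeping it implies. Everything in it (the SubscriptionCounts class, its method names, the transport strings) is hypothetical and does not reflect any existing or planned Spring for GraphQL API. With Micrometer, each per-transport value could then be registered as a gauge under one shared metric name with a transport tag.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one logical "active subscriptions" metric,
// broken down by transport (for example "websocket", "sse", "rsocket").
final class SubscriptionCounts {

    private final Map<String, AtomicInteger> activeByTransport = new ConcurrentHashMap<>();

    // Called when a subscription starts on the given transport.
    void subscriptionStarted(String transport) {
        activeByTransport.computeIfAbsent(transport, key -> new AtomicInteger())
                .incrementAndGet();
    }

    // Called when a subscription completes, errors, or is cancelled.
    // The count is clamped at zero so a stray completion cannot go negative.
    void subscriptionCompleted(String transport) {
        AtomicInteger counter = activeByTransport.get(transport);
        if (counter != null) {
            counter.updateAndGet(value -> Math.max(0, value - 1));
        }
    }

    // Current value for one transport; this is what a tagged gauge would read.
    int active(String transport) {
        AtomicInteger counter = activeByTransport.get(transport);
        return counter == null ? 0 : counter.get();
    }

    // Sum across all transports.
    int totalActive() {
        return activeByTransport.values().stream().mapToInt(AtomicInteger::get).sum();
    }
}
```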

I'll schedule this for the 1.3.x generation and I'll report back here after investigating a bit.