Open HunterLarco opened 1 year ago
This is possibly related to https://github.com/open-telemetry/opentelemetry-js-contrib/issues/1686. Apollo sets every resolver to a custom one, even if it's trivial, which causes extremely verbose traces.
I was facing the same issue when using otel via Sentry v8
If it helps, I currently disable the graphql integration (since I use Apollo Server) and setup a plugin and instrument manually via an Apollo plugin. It's possible to detect the trivial resolver case using this: https://community.apollographql.com/t/is-there-a-way-to-know-if-a-graphql-field-will-be-resolved-using-the-default-resolver/2573 and manually instrument
It seems like this may be one of the only feasible ways for Apollo users to instrument with reasonable overhead, and it should probably be kept separate from the graphql integration here
What version of OpenTelemetry are you using?
1.17.0
What version of Node are you using?
17.9.1
What did you do?
Installed
@opentelemetry/instrumentation-graphql
with default options anddepth: 0
.What did you expect to see?
Minimal impact to latency.
What did you see instead?
While debugging a graphql query using ApolloServer which returns 340KB of data (11,494 keys in the response JSON) I noticed that our business logic was executing just under 50ms while the request was taking ~1.2s. Using the tracing data collected by
@opentelemetry/instrumentation-graphql
I was able to observe that all of our resolvers took ~86ms CPU time but 1.1s wall time. I suspected something was wrong with resolver scheduling so ranclinic doctor -- node server.ts
to gain insight and noticed during the span of the query, the event loop experienced ~250ms of latency. Removing@opentelemetry/instrumentation-graphql
decreased this down to ~20ms of latency and reduced the overall request timing to 187ms.TL;DR even with
depth: 0
, for large payloads@opentelemetry/instrumentation-graphql
increased event loop latency by an order of magnitude.Replication details
While replication of our exact problem would require our server code, a minimal example of problematic and large GraphQL payloads can be found here to test with: https://codesandbox.io/s/apollo-server-performance-issue-i7fk7