open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0
3.75k stars 890 forks source link

Add semantic convention for describing graphql spans #1670

Closed svrnm closed 2 years ago

svrnm commented 3 years ago

What are you trying to achieve?

GraphQL is an open-source data query and manipulation language for APIs, and a runtime for fulfilling queries with existing data. [...] GraphQL servers are available for multiple languages, including Haskell, JavaScript, Perl, Python, Ruby, Java, C++, C#, Scala, Go, Rust, Elixir, Erlang, PHP, R, D and Clojure. (Source: Wikipedia)

There are already graphql instrumentation libraries for Node.JS and Ruby (https://opentelemetry.io/registry/?s=graphql) and as quoted above there could be many more. GraphQL has a specification that could be used to define span attributes. For example the Node.JS instrumentation defines the following:

export enum SpanAttributes {
  COMPONENT = 'graphql',
  SOURCE = 'graphql.source',
  FIELD_NAME = 'graphql.field.name',
  FIELD_PATH = 'graphql.field.path',
  FIELD_TYPE = 'graphql.field.type',
  OPERATION = 'graphql.operation.name',
  VARIABLES = 'graphql.variables.',
  ERROR_VALIDATION_NAME = 'graphql.validation.error',
}

The most important ones would be the operation name (query, mutatation, ...) and if available the query name.

ericmustin commented 2 years ago

Was there any reason this stalled? The stronger we can make the semantic conventions around instrumentation attributes, the better, especially for these details which are already implemented across multiple languages?

spencerwilson commented 2 years ago

Just want to point to these two issues folks raised in JS that involve GraphQL semconv:

I don’t see any acute reason this should be blocked, so I assume it’s stalled simply for lack of someone making a PR.

jord1e commented 2 years ago

I can look into contributing a PR this weekend, some prior art to be looked at is the Apollo schema reporting Protobuf file (https://github.com/apollographql/apollo-server/blob/e468367d52e11f3127597e4fe920eb8294538289/packages/apollo-reporting-protobuf/src/reports.proto)

And of course the standing opentelemetry-js(-contrib) and opentelemetry-ruby conventions

Some input from the GraphQL Working Group may also be useful (do companies use Apollo Query Ids, should they be added as a Span attribute?, should we add the query name as an attribute?)

spencerwilson commented 2 years ago

Something I'm unsure whether should be in scope of GraphQL semantic conventions, but suspect should be: For a GraphQL server, there are many distinct procedures that might deserve getting their own span, depending on one's telemetry requirements. Some of them include:

I've not reviewed all existing GraphQL OTel instrumentations to see which of these procedures get their own span currently. The only one I've reviewed is the JavaScript instrumentation (repo), in which the following procedures get spans (src):

My motivation to go into this detail here is that as a user of the graphql-js auto instrumentation, I've at times found myself wishing that I had spans for ExecuteField, i.e., something representing ResolveFieldValue (individual resolvers running) + CompleteValue (encompassing all recursive resolution). (note: The viability of instrumenting some of these procedures might differ depending on the language and how instrumentations are implemented in each language. E.g., I've considered how the JS instrumentation might be changed to achieve what I want, but if what I've considered works at all it might have some costs that outweigh its telemetry-adding benefit.)

The relevance to semconv, minimally, is: How should these different procedures be indicated on the Span data model? My intuition suggests they should be codified at least as some span attribute (if not also a component of span name), but I haven't read all the existing semconv to know if that's best—or if there's an obvious right answer that already exists, for that matter.


Also want to mention the following which might be helpful references, if we're ever seeking definitions for things, see:

tigrannajaryan commented 2 years ago

Done in https://github.com/open-telemetry/opentelemetry-specification/pull/2456