open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0
3.73k stars 888 forks source link

Ideas on how to InstrumentationLibrary -> Scope #2307

Closed bogdandrutu closed 2 years ago

bogdandrutu commented 2 years ago

This is a try to document what is the ideal scenario to extend the protocol to support the notion of "instrumentation scope", and allow the tracer/meter/logger to share the same concepts.

There were multiple proposals, disagreements, ideas in lots of places:

OTLP ideal end goal, based on lots of discussions/comments:

// Resource information.
message Resource {
  // Set of labels that describe the resource.
  repeated opentelemetry.proto.common.v1.AttributeKeyValue attributes = 1;
}

// Scope is a message representing the instrumentation scope
message Scope {
  // TBD: Do we need a first class "name" concept?
  // Set of labels that describe the instrumentation scope.
  repeated opentelemetry.proto.common.v1.AttributeKeyValue attributes = 1;
}

// A collection of ScopeMetrics from a Resource.
message ResourceMetrics {
  // The Resource for the metrics in this message.
  // If this field is not set then no Resource is known.
  Resource resource = 1;
  // A list of metrics that originate from a Resource and their scope.
  repeated ScopeMetrics scope_metrics = 2;
}

// A collection of Metrics from a Scope within a Resource.
message ScopeMetrics {
  // The Instrumentation scope information for the metrics in this message.
  // If this field is not set then no Scope is known.
  Scope scope = 1;

  // A list of metrics that originate from an Instrumentation scope.
  repeated Metric metrics = 2;
}

How to get there?

There are few places where changes are necessary:

  1. The OTLP protocol can be changed following a "backwards" compatible approach:
    • Rename "InstrumentationLibrary" to "Scope".
    • Add "attributes" to the new Scope concept.
    • Deprecate the current "name"/"version" in favor of semantic conventions for these fields in the new "attributes".
  2. The current "TracerProvider.get()" API:
    • Probably it is fine to relax the need to have this as defined right now "instrumentation library name" and suggest that as a "MAY".
    • When doing this change probably the SDK should stop populating the old "instrumentation library name" field in the protocol, and start using the new semantic convention (or proposed solution) for "scope.name". This will cause current receivers to receive an empty instrumentation library name which is ok, since the "instrumentation" itself is not stable.
    • Alternative we can simply continue to populate, the change of this "meaning" cannot cause that much inconvenience to the receivers. Not sure what receivers can do with this value, since is more or less a semantic convention anyway in regard to the format of the name and uniqueness, etc.
  3. The SDK (InstrumentationLibrary struct/class)
    • Ideally, if possible to rename/deprecate and add the new concept. This is probably very language specific. Need to collect more feedback from the maintainers.
tigrannajaryan commented 2 years ago

Add "attributes" to the new Scope concept.

This goes a step further than was discussed in the last spec SIG meeting. A smaller step would be to only add a "subscope_name" as an additional field to "instrumentation_library_name/version" and call the message with these 3 fields a Scope. But perhaps "attributes" field+semantic conventions for scopes is a better approach if we can come with other use-cases when recording additional attributes is necessary.

Probably it is fine to relax the need to have this as defined right now "instrumentation library name" and suggest that as a "MAY".

I think this one people disagree with because this dilutes the semantics of the existing API. I tend to agree with this position. I think we need to keep existing TracerProvider.get() API's semantics unchanged since it is declared stable. It needs to populate instrumentation library name and version.

We can add something like TracerProvider.getScoped() which will accept the new set of parameters (either libname/ver/subscope or just "attributes").

jmacd commented 2 years ago

I like this proposal. It solves the problem posed in https://github.com/open-telemetry/oteps/issues/78.

bogdandrutu commented 2 years ago

@jmacd I wish I would have understood better your proposal back then, anyway good time now to fix that.

mwear commented 2 years ago

I generally like this approach, as a scope would give us a place to store instrumentation library as well as additional metadata. There was a lengthy discussion at the spec SIG where concerns were raised around backwards compatibility, and ways to work around incompatibilities were suggested. Is there any chance we can document what the concerns were and how they could be resolved?

dyladan commented 2 years ago

I can document at least my concern:

I am concerned that expanding what the name is allowed to be will lose the information of which library actually generated a log line. Loggers can easily be named "http" when they are monitoring HTTP requests (rather than the "@opentelemetry/instrumentation-http" that we have for traces and metrics). If the same concept of "scope" is used with the logging, tracing, and metrics APIs this misuse will inevitably leak into the trace and metric API use. IIRC @bogdandrutu's proposed resolution (and please bogdan, expand on this if I am misstating) was that the scope would typically be the fully qualified class name of the instrumentation class anyway, because that is how he has seen logger name use "in the wild" and he doesn't expect it to be a problem. Instrumentation libraries would use their own fully qualified class names and first party instrumentations would use their own fully qualified class names. On the call today he seemed to believe that calling the field "instrumentation library name" was confusing to users, particularly in the case where the instrumentation library and the instrumented library are the same (first party instrumentation).

I, (and I believe @Aneurysm9 although I would let him speak for himself) am not completely convinced by this. At least in JavaScript I have not seen any consistent use of logger name, and certainly would not expect fully qualified class name to be commonly used as a logger name (JavaScript doesn't even have a common idiom for globally referencing a class anyway). Some popular JS logging libraries (winston, npmlog) don't even have the concept of a logger name, and those that do offer little to no guidance as to what a logger name should be (bunyan, pino)

Aneurysm9 commented 2 years ago

I do share Dan's concern that assumptions about user behavior are being made without adequate evidentiary support. I also think that, even if those assumptions are correct, this solution is overly broad and does more than it needs to do.

It seems that the current concern is that InstrumentationLibrary as a concept does not fit some expected/traditional uses because "Library" is too large a scope for what some developers would expect to use to identify the logger they use. If a narrower scope is expected then I do not see why we need to completely upend the structure and semantics of the InstrumentationLibrary data in OTLP or the semantics of the existing APIs.

It is proposed that:

There are few places where changes are necessary:

The OTLP protocol can be changed following a "backwards" compatible approach:

  • Rename "InstrumentationLibrary" to "Scope".
  • Add "attributes" to the new Scope concept.
  • Deprecate the current "name"/"version" in favor of semantic conventions for these fields in the new "attributes".

With some further discussion of what this means for the existing APIs and SDKs.

Instead, I think there is a single, simple change required:

Doing this, coupled with defining semantic conventions for those attributes, would allow for further narrowing the scope of the InstrumentationLibrary to the point where it covered a single class or method or however users wished to use the flexibility it affords. This requires no changes to existing APIs or SDKs and does not break the expectations of existing users. It adds new control surfaces for users who want them and requires nothing of those that don't.

dyladan commented 2 years ago

Instead, I think there is a single, simple change required:

  • Add attributes to the InstrumentationLibrary

Doing this, coupled with defining semantic conventions for those attributes, would allow for further narrowing the scope of the InstrumentationLibrary to the point where it covered a single class or method or however users wished to use the flexibility it affords. This requires no changes to existing APIs or SDKs and does not break the expectations of existing users. It adds new control surfaces for users who want them and requires nothing of those that don't.

Last week I thought we had agreed on a variation of this where scope was going to be added as an additional field without removing the existing instr lib name/version fields. That was fine with me and this version proposed by @Aneurysm9 is equally acceptable to me.

Aneurysm9 commented 2 years ago

Adding attributes to InstrumentationLibrary was what I had in mind last week. Perhaps I was not clear about my thoughts if the expectation was that only a single new scalar field would be added. That would seem to be a stop-gap to me and likely to lead to further requests for additional scoping fields. Using a set of attributes moves that out of the data model/API realm and into the semantic conventions where we have at least the beginnings of a mechanism for schema evolution that would make changes potentially less impactful.

jmacd commented 2 years ago

Using a set of attributes moves that out of the data model/API realm and into the semantic conventions where we have at least the beginnings

Right. I think we're imagining that instrumentation library name would be deprecated in favor of instrumentation.name, instrumentation.version, and instrumentation.schema_url attributes.

As for the standard logger name concept we're referring to, it seems to be a kind of component namespace. Maybe it's component.name. For a gRPC instrumentation library, maybe, you'd have instrumentation.name=google/something/..../grpc for the whole library and within the library you'd have loggers named for the big moving pieces of the implementation, like "channels" and "clients" and "loadbalancer", etc.

Some popular JS logging libraries (winston, npmlog) don't even have the concept of a logger name, and those that do offer little to no guidance as to what a logger name should be (bunyan, pino)

I think this deserves emphasis, fwiw. Are we being led by a Java-specific concern? When I look at the origin of java.util.logging.Logger.Name, it comes back again that the central purpose is to have a handle by which to disable telemetry. The whole point of these names, IMO, is to say turn on verbose logging for an expression like google/something/..../grpc#loadbalancer

anuraaga commented 2 years ago

Adding a library name such as opentelemetry-log4j-appender into InstrumentationLibrary.name seems like it can help troubleshoot issues in some cases, similar to things like opentelemetry-servlet we have on the tracing side. It would be very inconsistent if we didn't populate the same information for logs I think.

At the same time, in languages where using named loggers is common, we do expect to have many many log messages associated with a single logger name, and the name tends to be a very important aspect of triaging logs since it often is closely mapped to a component of concern, so it does seem useful to group the log messages by it within the protocol, for compression reasons if nothing else. Adding attributes to InstrumentationLibrary seems like a reasonable way to continue to keep InstrumentationLibrary.name meaning the actual library and having that nesting for logger name available. Alternatively, keeping it as an attribute on the log records themselves and expecting gzip to help enough in situations where space matters also seems fine.

Thinking a bit more on servlet, I think the equivalent to logger name there would be the name of the class that implements HttpServlet. This is currently a span attribute, so there is precedence to just let logger name be a record attribute. If pushing logger name up in some form, I guess we would expect to do something similar for a servlet name, for example.

bogdandrutu commented 2 years ago

For @jmacd

instrumentation.schema_url

schema_url is different, it identifies the semantic conventions, so will not be a semantic convention :)

For @dyladan

Last week I thought we had agreed on a variation of this where scope was going to be added as an additional field without removing the existing instr lib name/version fields.

There are couple of problems that we have to solve:

  1. The representation on the wire. Which we probably will end up having a set of attributes and maybe (we can discuss that) some first class things like name.
  2. What to do with the current "get(string)" API which in my opinion is the biggest part of the discussion, since it matches a lot with what lots of logging library have as an API to get a Logger.

For the "wire format" where we can rename messages and fields, we can have something like (based on the current documentation in the proto we can probably simply not doing this at all, see what is documented https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/common/v1/common.proto#L77):

// Scope is a message representing the instrumentation scope
message Scope {
  // An empty instrumentation library name means the name is unknown. 
  string **instrumentation_name** = 1; // deprecated, will be available as a semantic convention.
  string **instrumentation_version** = 2; // deprecated, will be available as a semantic convention.

  string name = 3;  // The new relaxed "name". TBD if a special field or also a semantic convention. For me it seems important to be first class citizen.

  // Set of labels that describe the instrumentation scope.
  repeated opentelemetry.proto.common.v1.AttributeKeyValue attributes = 4;
}

// A collection of Metrics from a Scope within a Resource.
message ScopeMetrics {
  // The Instrumentation scope information for the metrics in this message.
  // If this field is not set then no Scope is known.
  Scope scope = 1;

  // A list of metrics that originate from an Instrumentation scope.
  repeated Metric metrics = 2;
}

For the "get(string)" API, here is what is documented:

name (required): This name must identify the instrumentation library (e.g. io.opentelemetry.contrib.mongodb). If an application or library has built-in OpenTelemetry instrumentation, both Instrumented library and Instrumentation library may refer to the same library. In that scenario, the name denotes a module name or component name within that library or application. In case an invalid name (null or empty string) is specified, a working Tracer implementation MUST be returned as a fallback rather than returning null or throwing an exception, its name property SHOULD be set to an empty string, and a message reporting that the specified value is invalid SHOULD be logged. A library, implementing the OpenTelemetry API may also ignore this name and return a default instance for all calls, if it does not support "named" functionality (e.g. an implementation which is not even observability-related). A TracerProvider could also return a no-op Tracer here if application owners configure the SDK to suppress telemetry produced by this library.

I can easily make a case that ignoring the name at all is a valid case A library, implementing the OpenTelemetry API may also ignore this name and return a default instance for all calls, if it does not support "named" functionality, I can make another case that This name must identify the instrumentation library this is a very light requirement, I can use a class name or any string I want since it "identifies" the "instrumentation library".

If an application or library has built-in OpenTelemetry instrumentation, both Instrumented library and Instrumentation library may refer to the same library. In that scenario, the name denotes a module name or component name within that library or application.

Here it clearly allows me to use anything I want "module/class/component". I am not sure we need to debate anything based on this comment. I can clearly make a case that this does not need to be "unique" (which is not documented at all in the current text).

Besides all these glitches in the documentation that allow us to mostly change anything we want in terms of the semantics of that field, the proposal I think is to actually consolidate on the important requirement (ensure uniqueness which is not documented) and relax on the "artificial" requirement (to represent instrumentation library).

I think the important changes here are:

dyladan commented 2 years ago

@Aneurysm9

Adding attributes to InstrumentationLibrary was what I had in mind last week. Perhaps I was not clear about my thoughts if the expectation was that only a single new scalar field would be added. That would seem to be a stop-gap to me and likely to lead to further requests for additional scoping fields. Using a set of attributes moves that out of the data model/API realm and into the semantic conventions where we have at least the beginnings of a mechanism for schema evolution that would make changes potentially less impactful.

Set of attributes is totally fine with me as long as the information isn't lost

@jmacd

Right. I think we're imagining that instrumentation library name would be deprecated in favor of instrumentation.name, instrumentation.version, and instrumentation.schema_url attributes.

Seems reasonable % bogdan's comment about schema url

Some popular JS logging libraries (winston, npmlog) don't even have the concept of a logger name, and those that do offer little to no guidance as to what a logger name should be (bunyan, pino)

I think this deserves emphasis, fwiw. Are we being led by a Java-specific concern? When I look at the origin of java.util.logging.Logger.Name, it comes back again that the central purpose is to have a handle by which to disable telemetry. The whole point of these names, IMO, is to say turn on verbose logging for an expression like google/something/..../grpc#loadbalancer

I don't think this is a Java specific concern. The idea of a logger name is definitely not universal, but certainly I see the usefulness of it and if we're designing a logging API I think it is a worthy inclusion. Some assumptions made about the solution may be specific to Java.

@bogdandrutu

There are couple of problems that we have to solve:

  1. The representation on the wire. Which we probably will end up having a set of attributes and maybe (we can discuss that) some first class things like name.

Seems fine.

  1. What to do with the current "get(string)" API which in my opinion is the biggest part of the discussion, since it matches a lot with what lots of logging library have as an API to get a Logger.

[elided...]

Here it clearly allows me to use anything I want "module/class/component". I am not sure we need to debate anything based on this comment. I can clearly make a case that this does not need to be "unique" (which is not documented at all in the current text).

This seems like a creative interpretation to me. Sure, maybe to the letter of the rules it can be done, but removing the uniqueness and meaning from it removes a lot of its usefulness. I believe @Aneurysm9 would agree with me here, but again I'd let him speak for himself.

Besides all these glitches in the documentation that allow us to mostly change anything we want in terms of the semantics of that field, the proposal I think is to actually consolidate on the important requirement (ensure uniqueness which is not documented) and relax on the "artificial" requirement (to represent instrumentation library).

I think the important changes here are:

  • Drop the requirement to be the "instrumentation library"
  • Add a requirement to uniquely identify that "instrumentation class/component/library/scope".

Uniqueness is not documented because it wasn't required within a single library. Within a library you can have 3 tracers all with the same name. It just needs to not be the same as the name from any other library. Obviously global uniqueness also satisfies this requirement.

Maybe a compromise is to say the name should be globally unique, and one suggested way to do that is to use the fully qualified class name (or similar fully qualified identifier)?

side note: github issues need threads 😆

dyladan commented 2 years ago

I would like to challenge the assumption that we can't have getLogger and getTracer with different signatures. If logger name is a well-established concept that's all well and good, but tracer and meter name are not likely to be familiar concepts to new users and already will require some education. I don't think we can just assume users will not be able to understand the differences here.

bogdandrutu commented 2 years ago

@dyladan

Maybe a compromise is to say the name should be globally unique, and one suggested way to do that is to use the fully qualified class name (or similar fully qualified identifier)?

+1

Uniqueness is not documented because it wasn't required within a single library. Within a library you can have 3 tracers all with the same name. It just needs to not be the same as the name from any other library. Obviously global uniqueness also satisfies this requirement.

Not sure about where you see "inside a single library" and where that scope comes from. I am saying that as it is right now I can use "io.grpc" as identifier for my "instrumentation library", it does not say to be unique.

@dyladan here is an offer that you cannot refuse :))

Here is a more crazy idea (inspired by also the fact that right now the "name", can be In that scenario, the name denotes a module name or component name within that library or application., I want to propose a simple "sed/InstrumentationLibrary/Instrumentation" and we are done :) And this is a serious proposal, I think the "library" part is the one that creates the most problems, and is violated even by our documentation.

bogdandrutu commented 2 years ago

@dyladan

Even in your example fully qualified class name (or similar fully qualified identifier) the class name is not a "library". I think that word is the one that causes the most troubles and if simply remove that we can get the best of both worlds. Also it fixes our own specification since we recommend "a module name or component name within that library or application" which are not libraries.

dyladan commented 2 years ago

@dyladan

Maybe a compromise is to say the name should be globally unique, and one suggested way to do that is to use the fully qualified class name (or similar fully qualified identifier)?

+1

/cc @z1c0 @arminru can we verify if this is ok for us?

Uniqueness is not documented because it wasn't required within a single library. Within a library you can have 3 tracers all with the same name. It just needs to not be the same as the name from any other library. Obviously global uniqueness also satisfies this requirement.

Not sure about where you see "inside a single library" and where that scope comes from. I am saying that as it is right now I can use "io.grpc" as identifier for my "instrumentation library", it does not say to be unique.

I was just saying that there is no uniqueness. Within the same library the name would of course always be the same and of course is not unique, but it is unique to that library. 2 different library authors would never clash names if the library name is used. This is the type of "uniqueness" that is important.

@dyladan here is an offer that you cannot refuse :))

Here is a more crazy idea (inspired by also the fact that right now the "name", can be In that scenario, the name denotes a module name or component name within that library or application., I want to propose a simple "sed/InstrumentationLibrary/Instrumentation" and we are done :) And this is a serious proposal, I think the "library" part is the one that creates the most problems, and is violated even by our documentation.

Even in your example fully qualified class name (or similar fully qualified identifier) the class name is not a "library". I think that word is the one that causes the most troubles and if simply remove that we can get the best of both worlds. Also it fixes our own specification since we recommend "a module name or component name within that library or application" which are not libraries.

I think relaxing the term "library" is fine as long as we say we need something fully qualified. I don't want the case where the name is just the name of the class (not fully qualified) and clashes with a different library that happens to have a class with the same name.

You're right that the "library" term isn't important to us, as long as we can uniquely identify the component which is generating the telemetry.

dyladan commented 2 years ago

/cc @arminru @z1c0 @Oberon00 to make sure i didn't make any wrong claims in the above

bogdandrutu commented 2 years ago

100% I want to preserve the ability to isolate an "instrumentation" (being it class/package/library/etc) using that name, same requirements were on the bases of logs as well :)

bogdandrutu commented 2 years ago

@dyladan

2 different library authors would never clash names if the library name is used. This is the type of "uniqueness" that is important.

This is what I am trying to say that is not documented right now.

dyladan commented 2 years ago

It's not documented it's just a natural consequence of using instrumentation library name

Oberon00 commented 2 years ago

Probably it is fine to relax the need to have this as defined right now "instrumentation library name" and suggest that as a "MAY".

I think we should keep this out of this issue and defer to #586.

Maybe a compromise is to say the name should be globally unique, and one suggested way to do that is to use the fully qualified class name (or similar fully qualified identifier)?

Also not sure if this should be discussed separately from the ability to add arbitrary attributes, which I think is the main topic of this issue(?). But let's try discussing it here:

It is true that the spec right now wants you to have the actual library/artifact name. But I think this was never really followed consistently across languages (e.g. Python uses the module/file name __name__ of the module creating the tracer in many cases, which is different from the distribution/library name, correct me if I'm wrong @open-telemetry/python-maintainers).

I think the actual source / format of the name is not that important, as long as we agree on what it should be useful for, which is IMHO, identifying the technical source of some telemetry for troubleshooting telemetry-related code, usually instrumentation code (not for assigning it to some logical monitored system component -- other span and resource attributes should be used for that like service.name, http.route, code.function, etc.). Ideally, it is not too fine-grained, or if it is fine grained, I think it would make sense to recommend having a common prefix. This would allow discarding / skipping analysis of telemetry data generated by known-faulty/too noisy sources.

ocelotl commented 2 years ago

It is true that the spec right now wants you to have the actual library/artifact name. But I think this was never really followed consistently across languages (e.g. Python uses the module/file name __name__ of the module creating the tracer in many cases, which is different from the distribution/library name, correct me if I'm wrong @open-telemetry/python-maintainers).

Do you mean this, @Oberon00?

If that is the case, well, that is just a value the user can pass to a method that is part of the public API. In that case, I think the spec does not "apply" since it is left as a choice for the user.

Oberon00 commented 2 years ago

In that case, I think the spec does not "apply" since it is left as a choice for the user.

I mean, the spec can't force users to do anything, they can also write a new GUID into the span name for each span if they want, but that would break everybody who expects that the span name is what the spec says it should be.

But in that case, I think __name__ ought to be fine as "instrumentation library name".

ocelotl commented 2 years ago

I mean, the spec can't force users to do anything, they can also write a new GUID into the span name for each span if they want, but that would break everybody who expects that the span name is what the spec says it should be.

If we want the span name to be defined by the spec, shouldn't we add to the spec a requirement that is to be implemented by a non-user defined field somewhere?

Aneurysm9 commented 2 years ago

If we want the span name to be defined by the spec, shouldn't we add to the spec a requirement that is to be implemented by a non-user defined field somewhere?

We don't want the span name to be defined by the spec, we want the meaning of the span name to be defined by the spec. The same goes for the instrumentation library name. Having a defined meaning should lead to consistency in usage and the ability to reliably make decisions based on the information it conveys rather than treating it simply as bits of data with no informational value.

tigrannajaryan commented 2 years ago

Please review https://github.com/open-telemetry/opentelemetry-specification/pull/2276