`event.namespace` instrumentation scope attribute instead of `event.name` prefixes

pellared commented 1 week ago

I think that instrumentation libraries emitting events could add a event.namespace instrumentation scope attribute instead of putting the namespace to each event.name value as a prefix.

Remarks:

This should decrease the size of exported log payloads
Separate loggers for emitting events and log records. Clearer intend of the Logger usage. It also makes it possible to distinguish "events" emitted via bridged libraires by application devs, and OTel Logs API by instrumentation libraries, and application devs which creates "business" events using OTel Logs API.
SDK Logger implementation have access to instrumentation scope attributes. The user does not need to pass the event.namespace each time. The namespace would be known for Logger.Enabled even if the caller does not pass event.name. Related issue: https://github.com/open-telemetry/opentelemetry-specification/issues/4220
Each events semantic convention should define its namespace e.g. http, android.

An alternative approach could be to add the event namespace both in event.name log record attribute and event.namespace instrumentation scope attribute. However, I think that the redundancy will increase the performance overhead and complexity of the design.

tedsuo commented 1 week ago

In semconv we use a global namespace, so this feels like a question for that SIG.

tedsuo commented 1 week ago

(but I'm not against the idea, this is the kind of thing I would like us to play with once we have a decent amount of prototyping completed)

tedsuo commented 6 days ago

Another concern – perhaps a blocking one – is that right now LogRecords are self contained. Instrumentation Scope is very helpful, but it's a nice-to-have as far as processing the data on the back end. If critical information needed to fully qualify the event name is only present in the instrumentation scope, things have suddenly gotten more complicated, as the LogRecord is now incomplete and un-processable on its own.

Obviously, important context such as service.name lives outside of the LogRecord. But that's a bit different. You don't know where a span came from, but you do know everything about the operation it is describing. In this case, it would be impossible to discern that information.

cijothomas commented 5 days ago

instead of putting the namespace to each event.name value as a prefix.

Is that already spec-ed out that event.name MUST be fully qualified name?

tedsuo commented 5 days ago

@cijothomas event.name it is experimental we are prototyping right now, so no nothing is a MUST yet. But if events from different domains can get accidentally mixed together because they both have the same name, seems pretty bad.

If we are saying that each event is it's own semantic convention, and for that reason you can expect all of the log records with the same event name to have same attributes, event names would have to be fully qualified. Otherwise we would have to say "each event name plus namespace" must be its own semantic convention. If we determine that having a separate namespace and name makes event processing easier than having a single fully qualified name, I'm not against it. But if that namespace is not actually on the log record, my intuition is that it would create all kinds of problems, and would make event processing harder, not easier. Log records don't contain a pointer to an instrumentation scope. imho, a small reduction in payload size (which can also be reduced by compression or Otel Arrow) doesn't seem like a great reason to make the fully qualified name of a log record unknowable from the log record itself.

tedsuo commented 4 days ago

It also makes the assumption that a library/package/class would only produces events from a single namespace. I'm not sure if that assumption holds. A library might contain semantic conventions from more than one domain. This certainly happens with spans.

cijothomas commented 4 days ago

Isn't InstrumentationScope created to avoid such conflicts? I think what everyone does is (just a metrics example) Counter Name "MyFruitCounter" Meter Name "MyCompany.Product.Division.Service" and not Counter Name "MyCompany.Product.Division.Service.MyFruitCounter" Meter Name "MyCompany.Product.Division.Service"

a small reduction in payload size (which can also be reduced by compression or Otel Arrow)

Payload/Storage size reduction was not the motivation. I was thinking of the use cases where one needs to act based on Event.Name and it'll be faster with smaller (i.e without fully qualified) Event.Name. (Or ideally with Event.Id, a numerical, machine friendly version of the same)

pellared commented 4 days ago

Log records don't contain a pointer to an instrumentation scope

From SDK docs https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/sdk.md#readablelogrecord:

ReadableLogRecord

A function receiving this as an argument MUST be able to access all the information added to the LogRecord. It MUST also be able to access the Instrumentation Scope and Resource information (implicitly) associated with the LogRecord.

pellared commented 4 days ago

It also makes the assumption that a library/package/class would only produces events from a single namespace

It would be still possible to produce events with multiple namespace. For each for each event namespace the caller would have to create distinct loggers (with different instrumentation scope attributes) and use them accordingly. Basically a Logger with event.namespace attribute would make the Logger an "EventLogger" which was coined in Events API (for given namespace).

MSNev commented 2 days ago

instead of putting the namespace to each event.name value as a prefix.

Is that already spec-ed out that event.name MUST be fully qualified name?

Yes, point [1] under the definition states that the event name is subject to the exact same rules are attributes and that they are namespaced to avoid collisions to provide clean separation of the semantics of each event.

The "MUST" portion, however, is part of the General event semantics and after "LOTS" of discussions, this approach was taken (it was previously event.domain and event.name as this combination MUST uniquely identify the event to ensure that when someone reviewing / comparing the contents (if required) they know that the types are the same.

This is listed in the "Recommendations", that the fields of an event are not comparable unless they have the same event.name.

MSNev commented 2 days ago

It also makes the assumption that a library/package/class would only produces events from a single namespace

It would be still possible to produce events with multiple namespace. For each for each event namespace the caller would have to create distinct loggers (with different instrumentation scope attributes) and use them accordingly. Basically a Logger with event.namespace attribute would make the Logger an "EventLogger" which was coined in Events API (for given namespace).

This is the exact issue that eventually caused use to revert back to a single name, as we prototyped having "getEventLogger(< domain/namespace >)" and it made the usage extremely painful!

tedsuo commented 3 hours ago

From SDK docs https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/sdk.md#readablelogrecord

Yes but I'm referring to backends that have to process this data, not the SDK. If you look at OTLP, there is no pointer from a LogRecord to an InstrumentationScope. That does not feel like a data model that is set up very well to require the InstrumentationScope in order to fully qualify a LogRecord.

It also makes the assumption that a library/package/class would only produces events from a single namespace

It would be still possible to produce events with multiple namespace. For each for each event namespace the caller would have to create distinct loggers (with different instrumentation scope attributes) and use them accordingly. Basically a Logger with event.namespace attribute would make the Logger an "EventLogger" which was coined in Events API (for given namespace).

That is definitely workable, but also feels very burdensome from a UX perspective. You can look at HTTP instrumentation where multiple domains are required, such as http and network. That's one of the reasons we make the namespace for semantic conventions globally qualified. Otel already has a reputation for being unwieldy compared to other instrumentation APIs, asking users to juggle multiple loggers within the same package doesn't feel like a great idea. I would need to see the necessity of this in order to support it.

tedsuo commented 3 hours ago

btw, I don't see anything wrong with language implementations adding convenience methods to the LoggingProvider to produce loggers with some attributes pre-filled, if that is helpful to users. It's making the use of multiple loggers a requirement, and making the LogRecord literally useless without access to the InstrumentationScope, that I am opposed to.

open-telemetry / opentelemetry-specification

`event.namespace` instrumentation scope attribute instead of `event.name` prefixes #4239

ReadableLogRecord