Decide what to do with `LogRecord.CategoryName`

alanwest commented 2 years ago

The top-level name field was dropped from the log data model. Originally, .NET's LogRecord.CategoryName was intended to serve the purpose of this top-level field.

CategoryName cannot be removed because it is part of .NET's stable log data model, so we need to decide what to do with it.

Options

1. Only use it for `ILogger`

Each exporter is responsible for how it represents CategoryName. Currently, the OTLP exporter translates this to an attribute named dotnet.ilogger.category

https://github.com/open-telemetry/opentelemetry-dotnet/blob/cf3dfc51f589dc1a172d056cfab604b0710c14f8/src/OpenTelemetry.Exporter.OpenTelemetryProtocol/Implementation/LogRecordExtensions.cs#L78-L86

If we agree to only use CategoryName for the purpose of ILogger then we could keep the OTLP exporter as it is. Though, we should decide if it makes sense to mimic this behavior in other exporters and components like log enrichers.

2. Make use of it across all logging frameworks we support

The Serilog sink makes use of CategoryName

https://github.com/open-telemetry/opentelemetry-dotnet/blob/bfabe9bc26f904860c1e09541f5abbf6febe8bf2/src/OpenTelemetry.Extensions.Serilog/OpenTelemetrySerilogSink.cs#L71-L75

This is problematic from the standpoint of the OTLP exporter. If we want to make use of CategoryName across logging frameworks, then we need to devise a generically named attribute that it gets mapped to.

3. Abandon the use of `CategoryName` from all SDK components

Perhaps the cleanest option is to abandon the use of CategoryName across all logging components. For each logging framework we support, if the framework has some notion of "category", we would map that to an attribute named appropriately for the framework - e.g., ilogger.category, serilog.context, event_source.name.

If we take this approach, it might make sense to mark CategoryName obsolete.

Though, we still would need to decide what to do if an end user wrote their own extension and made use of CategoryName. Should we map it to a generic attribute? e.g., dotnet.otel.category_name

I'm personally leaning towards option 3. There is an open issue asking to reintroduce the name field to the log data model. If this does happen in the future, it might make sense to keep our CategoryName field obsolete and instead introduce a new field.

reyang commented 2 years ago

@alanwest would you elaborate the importance and urgency of this change?

I can imagine with "Event" API effort, we might reconsider if LogLevel should be made optional. Also consider Scopes - that might be specific to ILogger, too.

A simple approach would be - not deprecating anything at this moment, until we have good understanding of what's the final shape. Or maybe this can be marked as "changes that should happen in major version 2 in the future".

alanwest commented 2 years ago

would you elaborate the importance and urgency of this change?

I do not feel strongly that any change is immediately urgent, but I do think the conversation regarding our stance on using CategoryName in other components is relevant to have now.

This originated from a conversation with @CodeBlanch here https://github.com/open-telemetry/opentelemetry-dotnet/pull/3454/files#r927184812. Currently, the Serilog sink is using CategoryName and he questioned whether to also use it for the EventSource integration.

I figure if we decide to further our use of CategoryName it would be more difficult to change or more disruptive to obsolete it later if we thought that was the right thing to do in the future.

A simple approach would be - not deprecating anything at this moment

I think option 1 embodies this spirit. That is, things like CategoryName, Scopes, and EventId may just become understood to be exclusively for use with ILogger. I guess I was less concerned with LogLevel since I suspect it may always map nicely to the data model's notion of severity number/text - which is already optional.

CodeBlanch commented 2 years ago

Just FYI from the perspective of LogEmitter, the only thing required/guaranteed to always be there is Timestamp. Everything else is optional. I did that because I was thinking about the upcoming "Event" API.

If we're going to obsolete CategoryName, we should drop this: https://github.com/open-telemetry/opentelemetry-dotnet/blob/6f2b1a0b3412f2860d065c81b058441977927584/src/OpenTelemetry/Logs/LogRecordData.cs#L103
The default LogLevel should probably be "None" here I don't recall why I made it "Trace", probably a mistake: https://github.com/open-telemetry/opentelemetry-dotnet/blob/6f2b1a0b3412f2860d065c81b058441977927584/src/OpenTelemetry/Logs/LogRecordData.cs#L108

reyang commented 2 years ago

Just FYI from the perspective of LogEmitter, the only thing required/guaranteed to always be there is Timestamp.

💯💯💯💯💯

alanwest commented 2 years ago

The default LogLevel should probably be "None" here I don't recall why I made it "Trace"

Ah yea, None makes sense.

Everything else is optional.

Yes, understood. The optionality of fields is not really related to my question at hand. To clarify, my question is: when we do populate a field - optionality aside - how does it get handled by the components we maintain?

I'm picking specifically on CategoryName as its use has come up in a number of conversations and those conversations are buried in TODOs in code and conversations on merged PRs. I opened this issue to continue these conversation not to express any sort of urgency.

Narrowing the scope of my question, here's the specific "weird" situation we have today:

The OTLP exporter exports CategoryName as the attribute dotnet.ilogger.category. There is a TODO in the code to consider whether this name is good or not.
The new Serilog sink populates the CategoryName from Serilog's notion of SourceContext. Therefore, using the Serilog sink in conjunction with the OTLP exporter results in an attribute name dotnet.ilogger.category.
Lastly, there was a question on the EventSource PR whether or not to follow in the same suit as the Serilog extension and populate CategoryName.

I think the options I've posed are all valid ways to address this situation. There may be other options.

cijothomas commented 2 years ago

I like option1 - CategoryName, Scopes, EventId etc are ILogger specific concepts, and by OpenTelemetry .NET SDK preferentially treating ILogger, they are part of LogRecord. Since they are not a top-level fields in Log Data Model, we need to export them as attributes in OTLP. (The names of the attributes are TBD)

We should not force other things (like Serilog sinks or other LogEmitter usages) to populate the iLogger only stuff.

CodeBlanch commented 2 years ago

This conversation is pulling in a few different directions, I'll try to reply generally.

RE: removing CategoryName.

I think this is good to do on LogRecordData which will effectively remove it from LogEmitter. I don't think it is a good idea to remove it from LogRecord because then where does our ILogger implementation put it? The natural place would be StateValues but hang on that isn't necessarily anything mutable. It is IReadOnlyList! We do have the ability to buffer it, but do we want to force buffering every log message? Probably not 😄 Might just have to live with it.

CategoryName, Scopes, EventId etc are ILogger specific concepts

Agree with CategoryName. Agree with Scopes (they are already NOT available on LogRecordData \ LogEmitter). EventId, not totally sure. There is the new API coming which IIRC has event name (+ id?). Might call for tags, might need special handling, not sure. We could always remove it and put it back if needed. I am populating it in the EventSource shim thing but that could be switched to tags/attributes.

So here would be my plan of attack...

Remove CategoryName + EventId from LogRecordData.
Update Serilog sink to set serilog.context (or whatever) tag instead of CategoryName.
Update EventSource shim to set event.name and/or event.id (or whatever) tags instead of EventId.

cijothomas commented 2 years ago

^ agree with this "attack" plan :)

alanwest commented 2 years ago

Yes, agreed. I think this solves the immediate issues at hand. And big 👍 to the guardrails against using any "ILogger stuff" for non-ILogger things - maybe this is something that's circled back to later, but I think it's good to start with this stance.

cijothomas commented 2 years ago

Somewhat related/similar is - OTel .NET decided to use Activity from runtime, instead of own class. But in our exporters, we try to only export what is required by OTel spec. Activity class has probably more things, which we just ignore in our exporters. Like Activity.CustomProperty?, Baggage? are ignored by this repo completely.

CodeBlanch commented 2 years ago

I was on the log SIG this week. There was some discussion about what Java does. Apparently what the Processor gets is a read/write interface to the data (span, log, whatever). Then the exporter gets a read-only interface. Kind of interesting. Not super helpful for us because we can't add interfaces to Activity 😄 We would have to wrap it which would be another allocation (or a boxing op).

reyang commented 2 years ago

I was on the log SIG this week. There was some discussion about what Java does. Apparently what the Processor gets is a read/write interface to the data (span, log, whatever). Then the exporter gets a read-only interface. Kind of interesting. Not super helpful for us because we can't add interfaces to Activity 😄 We would have to wrap it which would be another allocation (or a boxing op).

Do you see benefit of having such difference besides preventing accidental modification from the exporter?

CodeBlanch commented 2 years ago

Do you see benefit of having such difference besides preventing accidental modification from the exporter?

We have seen users use reflection to get around things being read-only so my currently feeling is: Just make everything mutable and let the users decide if they want to follow the spec guidance or not 😄

cijothomas commented 2 years ago

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model.md#field-instrumentationscope As per this guidance, the logger name (java equivalent : https://docs.oracle.com/javase/7/docs/api/java/util/logging/Logger.html#getLogger(java.lang.String) should be part of the instrumentation scope.

cijothomas commented 1 year ago

Removing from 1.6 milestone, as it was decided to remove CategoryName completely to unblock 1.6 stable release. The issue is still open and importantly, the category name is not even going to be exported in 1.6 stable. This will be brought back in 1.7.* prerleases, probably under experimental feature, as it is not yet solidified what to do with category/exception/eventid,name.

nblumhardt commented 9 months ago

Hi folks! I think this issue is stale - category names were routed into InstrumentationScope.Name in #4941.

cijothomas commented 2 months ago

@vishweshbankwar This can be closed right? CategoryName is exported as InstrumentationScope in OTLP Exporters now.

vishweshbankwar commented 2 months ago

This was fixed in 1.7.0

open-telemetry / opentelemetry-dotnet