Closed codefromthecrypt closed 2 months ago
Pinging code owners:
processor/transform: @TylerHelmuth @kentquirk @bogdandrutu @evan-bradley
See Adding Labels via Comments if you do not have permissions to add labels yourself.
@codefromthecrypt I am very out of the loop here.
Contexts defined in pkg/ottl/contexts
all relate to a structure in pdata which is our representation of OTLP protos. I am not familiar with log event as a structure in the proto.
Are you referring to the Logs Event API? I thought the plan was for events to be represented as LogRecords. Is there going to be a new structure in pdata for us to create a context?
I'll add that OTTL currently has no way to do cross-signal transformations. It is possible a context could be crafted that has references to different data types, but there are likely some problems that would need to be solved.
@TylerHelmuth so I think many of your notes are hitting nails. Yes, this is the Logs Event API I corrected my PR desc.
It sounds first that cross signals is indeed a prereq to the use case I mention (span event -> log event). So, this may rule out ottl as a solution for this until/unless there's scope for that.
The Logs Event API is not often put to practice, yet. Indeed, the spec says it manifests as LogRecords, where the body is AnyValue and attributes must also include "event.name".
So, this is why I mentioned perhaps since this is an opinionated layer, perhaps a context that narrows to these could be helpful.
The other point is at least in the pending change to LLM, there are structured definitions, and leaves have types to them. For example, some fields in the event body (type AnyValue) are string where others are AnyValue or something else. To do something like access an ID described in the body, requires some knowledge of the types which are in the semantic conventions. So, I wonder if such a binding would make sense, or too far out for this change.
So, this is why I mentioned perhaps since this is an opinionated layer, perhaps a context that narrows to these could be helpful.
This is probably possible, but I wonder what that would look like. If the pdata LogRecord
does not contain an attribute named event.name
and is passed into ottllogevent.NewTransformContext
should it error? Or would it be better to use the ottllog
TransformContext and build your own wrapper that enforces the event API requirements?
At the moment, a ottllogevent path parser would behave exactly the same as the ottllog parser, because an otel event is just an otel log in the eyes of pdata.
To do something like access an ID described in the body, requires some knowledge of the types which are in the semantic conventions
I could see how this is the differentiator. If there are defined, enforceable body structures, then an ottllogevent path parser could enforce "you can only index the body if the event.name attribute is in some list of approved event names".
However, OTTL syntax doesn't expose that such a statement is allowed. A user, why writing a statement, would still need to understand their data to know if it is safe to write such a statement.
interesting points. I think maybe we should use ottllog
and then revisit this to see if there is really any value to be gained by an event facade.
For the genai use case of log events -> span events, we can't use ottl
anyway due to lack of cross signal. So, I won't likely be able to help contribute motivating ottl in the near term (directly or indirectly).
Should I close and re-open this with more info, or leave it open?
I think since we don't currently have any plans to move forward with this idea, closing is best.
Component(s)
processor/transform
Is your feature request related to a problem? Please describe.
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/33737 introduces the Blob Attribute Uploader Connector. One driver of that discussion was LLM (GenAI) semantics, which were initially written to have prompt/completion data recorded as span events. With this tool, a backend which uses span attributes for the same data, could convert accordingly.
A proof of concept to showcase this, included span and span event contexts.
The use case in question, LLM (GenAI) semantics, is on a path to change again, now from span events to log events, using the Logs Event API.
I asked if the uploader would adjust and was told to make new issues for these pending changes.
Along the way, I noticed that while there is an
ottllog
context, there is no ottllogevent context. This made me curious whether we intend to heuristically handle log events, or if there should be another context, which for example exposes the attributeevent.name
and a body where present isAnyValue
.Describe the solution you'd like
I would like to see an
ottlogevent
context, so that we can continue the work begun in https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/33737 towards the pending reality of switching LLM (GenAI) prompt/completion events from span to logs.Especially as there are concerns such as how to map semantic conventions (possibly weaver) to the body
AnyValue
, I believe this should be a separate package fromottlog
even if there are shared code.Describe alternatives you've considered
One way is to make a custom transform processor which doesn't use ottl.
Additional context
the ultimate goal is to help implementors who have to deal with mixed mode, where some instrumentation emit span events and others log events, or where the backend, or UI cannot yet support log events.