open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0

Event API calls should loop back to log frameworks without cycles #4189

Open jack-berg opened 3 months ago

jack-berg commented 3 months ago

Currently, there are two sources of data for the log SDK:

  1. Log appenders for log4j, logback, etc. bridge log records recorded via their respective APIs:
log framework -> log bridge API -> log SDK

These logs are probably already logged to console, files or some other local location. Most users are configuring the log bridge to export logs to a network location via OTLP in addition to their existing local logging. Fair enough. Makes sense.

  2. Event API records data directly from instrumentation.
instrumentation -> event API -> event SDK -> log SDK

As of today, it's only really easy to send these log records to a network location via OTLP. But people aren't going to want to use the event API for instrumentation if there isn't a good story for making that data available to the user via local logs. And given that the log SDK is explicitly designed not to reinvent the wheel of existing log frameworks, this means that we need a way for log records recorded via the event API to be bridged back to existing log frameworks.

If a log appender is the name we give to a bridge from a log API / framework to the opentelemetry log bridge, what do we call the thing that goes the other way?

How do we avoid loops? If a user has configured logback with the logback appender to bridge into opentelemetry log bridge, and we configure a bridge for event API log records to be bridged back to logback, we need some sort of marker to avoid infinite loops:

event API -> event SDK -> log SDK -> logback bridge -> logback framework

logback framework -> logback appender -> log bridge API -> log SDK -> logback bridge -> logback framework (cycle)
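
To make the cycle concrete, here is a minimal sketch of one possible guard. Everything in it is hypothetical: `Framework`, `Sdk`, the `Origin` enum and the `otel.bridged` marker key are stand-ins invented for illustration, not logback or OpenTelemetry APIs. The idea is that the reverse bridge tags what it hands to the framework, and the appender skips anything carrying that tag.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are real logback or OpenTelemetry APIs.
final class LoopGuardSketch {
    // Assumed marker key, set on records that the reverse bridge hands to the framework.
    static final String BRIDGED_FROM_OTEL = "otel.bridged";

    enum Origin { EVENT_API, FRAMEWORK }

    // Stand-in for a log framework such as logback with an OpenTelemetry appender installed.
    static final class Framework {
        final List<String> localOutput = new ArrayList<>();
        Sdk sdk;

        void log(String message, Map<String, String> context) {
            localOutput.add(message); // console, file, ...
            // Appender: skip records that the log SDK itself bridged in.
            if (sdk != null && !context.containsKey(BRIDGED_FROM_OTEL)) {
                sdk.emit(message, Origin.FRAMEWORK);
            }
        }
    }

    // Stand-in for the log SDK with a reverse bridge back to the framework.
    static final class Sdk {
        final List<String> exported = new ArrayList<>();
        Framework framework;

        void emit(String message, Origin origin) {
            exported.add(message); // e.g. OTLP export
            // Reverse bridge: forward only records the framework has not already seen.
            if (framework != null && origin == Origin.EVENT_API) {
                Map<String, String> context = new HashMap<>();
                context.put(BRIDGED_FROM_OTEL, "true");
                framework.log(message, context);
            }
        }
    }

    public static void main(String[] args) {
        Framework framework = new Framework();
        Sdk sdk = new Sdk();
        framework.sdk = sdk;
        sdk.framework = framework;

        sdk.emit("event from instrumentation", Origin.EVENT_API);
        framework.log("plain application log", new HashMap<>());

        System.out.println("local:    " + framework.localOutput);
        System.out.println("exported: " + sdk.exported);
    }
}
```

In this sketch either check alone stops the infinite loop, but both are needed to avoid duplicates: without the marker an event would be exported twice, and without the origin check a framework log would be written locally twice.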

I brought this up a while back here and can't find any issue with folks talking about it.

mtwo commented 3 months ago

Triage notes: seems reasonable, though we'd like community feedback about whether people need / want to send events back to logging frameworks like Log4J, in addition to their logging destinations.

jack-berg commented 3 months ago

We say:

If we don't make it easy for event API logs to participate in the rich existing ecosystem, then events are not just a particular kind of log. Instead, they are a more limited type of log which only has good tooling to export to network locations via OTLP.

I don't think libraries will want to use the event API for instrumentation if we don't have good tooling for the data to end up in their users' existing log ecosystem. Why would a library adopt the event API instead of slf4j?

codefromthecrypt commented 3 months ago

@jack-berg thanks for raising this, as it came out of some of my feedback. I think one reason this hasn't bitten people yet is the lack of adoption of the event API.

I agree that what we are talking about is pretty much guaranteed to happen; the choice is between solving it now and waiting for more folks to run into it. I suspect there is some prior art, even in existing log libraries, for cases where they can accidentally create cycles. Have you seen anything?

It would certainly be annoying if a user tried to use their local logback bridge and then had to wait for a specification to be written and implemented before it worked as expected. Definitely appreciate you thinking ahead.

lmolkova commented 3 months ago

What if events could be expressed through the logging API?

Instead of looping events through the log facade, should we have an API convention for reporting events through logging facades directly (where possible)?

E.g. with slf4j I can write something like

logger.atInfo()
 .addKeyValue("event.name", "com.foo.my-event-id")
 .addKeyValue("otel.log.body", myEventBody) // `otel.log.body` property is used as log record body by bridge api
 .log("something important"); // see https://github.com/open-telemetry/semantic-conventions/issues/1076
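
A rough sketch of how a bridge might interpret this convention follows. It is not the actual OpenTelemetry appender code: the `Record` type and `toRecord` helper are hypothetical, and falling back to the formatted message when `otel.log.body` is absent is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a bridge-side mapping; not the actual OpenTelemetry appender code.
final class KeyValueEventMapping {
    // Simplified stand-in for a log record handed to the log bridge API.
    static final class Record {
        final String eventName;               // null when the log call is not an event
        final Object body;
        final Map<String, Object> attributes; // key-value pairs left after extraction

        Record(String eventName, Object body, Map<String, Object> attributes) {
            this.eventName = eventName;
            this.body = body;
            this.attributes = attributes;
        }
    }

    // Pulls the two reserved keys out of the key-value pairs of a log call.
    static Record toRecord(String message, Map<String, Object> keyValues) {
        Map<String, Object> attributes = new LinkedHashMap<>(keyValues);
        Object eventName = attributes.remove("event.name");
        Object body = attributes.remove("otel.log.body");
        return new Record(
                eventName == null ? null : eventName.toString(),
                body == null ? message : body, // assumption: fall back to the message
                attributes);
    }

    public static void main(String[] args) {
        Map<String, Object> keyValues = new LinkedHashMap<>();
        keyValues.put("event.name", "com.foo.my-event-id");
        keyValues.put("otel.log.body", "structured body");
        keyValues.put("user.id", 42);

        Record record = toRecord("something important", keyValues);
        System.out.println(record.eventName + " / " + record.body + " / " + record.attributes);
    }
}
```

A non-OpenTelemetry appender would ignore the reserved keys and print the message as usual, which is what keeps the same call useful for local logs.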

Or I can imagine


class MyEvent implements io.otel.events.Event {
   @Override 
   public String getEventName() { return "com.foo.my-event-name"; }

   @Override 
   public String toString() {...} // for non-otel logging providers

   @Override 
   public AnyValue getBody() {...} // for otel 
}

logger.atInfo()
 .addArgument(new MyEvent(...))
 .log(); // no message template; log(null) would be ambiguous between the String and Supplier overloads
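
To illustrate the split between the two audiences, here is a self-contained sketch. The `Event` interface below is a local stand-in for the imagined `io.otel.events.Event`, `Object` replaces `AnyValue`, and `OrderShipped` with its event name is an invented example; none of these exist in OpenTelemetry today.

```java
import java.util.Collections;

// Hypothetical sketch: Event is a local stand-in for the imagined io.otel.events.Event.
final class EventArgumentSketch {
    interface Event {
        String getEventName();
        Object getBody(); // Object instead of AnyValue keeps the sketch self-contained
    }

    // Invented example event.
    static final class OrderShipped implements Event {
        private final String orderId;

        OrderShipped(String orderId) { this.orderId = orderId; }

        @Override
        public String getEventName() { return "com.foo.order-shipped"; }

        @Override
        public Object getBody() { return Collections.singletonMap("order.id", orderId); }

        @Override
        public String toString() { return "order " + orderId + " shipped"; } // for non-otel providers
    }

    // What a plain-text appender sees: it only knows toString().
    static String renderForConsole(Object argument) {
        return String.valueOf(argument);
    }

    // What an OpenTelemetry-aware appender could extract from the same argument.
    static Object bodyForOtel(Object argument) {
        return argument instanceof Event ? ((Event) argument).getBody() : String.valueOf(argument);
    }

    public static void main(String[] args) {
        Object argument = new OrderShipped("42");
        System.out.println(renderForConsole(argument));
        System.out.println(bodyForOtel(argument));
    }
}
```

The same object serves both paths, so the library author writes one log call and the installed appenders decide how much structure to keep.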

I know it changes everything about how we do logs today.

Related: