qos-ch / slf4j

Simple Logging Facade for Java
http://www.slf4j.org
MIT License
2.34k stars 990 forks source link

MDC + Context Propagation Feature Request #399

Open mosesn opened 9 months ago

mosesn commented 9 months ago

Problem

MDC is a useful way to inject pertinent context into log messages. But the current design makes it challenging to work with intra-service propagation solutions, like OpenTelemetry's Context mechanism. Typically these mechanisms are basically a Map from aString-shaped key to an Object. MDC provides a clear API for what we would want to propagate, which would be MDC.getCopyOfContextMap, but when we switch threads, we would have access to that context again, but we wouldn't have a way to use that context. We would need to write code to call MDC.setContextMap, and consistently do that every time we switch threads. Instead, it would be much easier if we could have SLF4J (or its implementations) call directly into the context propagation mechanism and read the context from there. In practice, they are often doing the same thing under the hood as a logging implementation with ThreadLocal, but they also have a mechanism for propagating the context across threads (for example with OTel, they have an auto-instrumentation java agent that can add java byte code to move the context automatically across when you fork, or schedule new work).

With the current API, folks go to extreme lengths to work around the current limitations of the SLF4J API.

Proposed Solution (Option 1)

Add a public API for overriding the MDCAdapter. Today, there's no way to wrap-and-modify or replace the MDCAdapter that's provided by a logging implementation. This means that if you want to change it, you need to take your logging implementation, rip out the suggested SLF4JServiceProvider implementation, and provide your own. Or else override MDC#mdcAdapter. However, there is currently a scary message that this is not the only place where you must override MDC.mdcAdapter. Instead, it would be great for MDC to have an MDC.overrideMDCAdapter(MDCAdapter) method that would also be compatible with the SLF4JServiceProvider implementations, perhaps adding a SLF4JServiceProvider#overrideMDCAdapter(MDCAdapter) method.

This has the advantage that it provides maximum flexibility. However, it may end up being a bit of a footgun if it provides too much flexibility and eventually becomes something that customers misuse.

Proposed Solution (Option 2)

Provide a way to register an additional way to fetch MDC Context. This could be an MDCExtension interface that exposes a method MDCExtension#getContext() and MDCExtension#get(key). This would let customers provide extra info when it’s time to read MDCContext. To implement this, we could add a method to MDCAdapter to take into account this extra MDCExtension.

Result

SLF4J users who do intra-service context propagation wouldn’t need to jump through hoops to integrate with MDC.

Note

I'm happy to implement this feature once we've come to a consensus on what it should look like.

ceki commented 9 months ago

@mosesn Thank you for your detailed comments. I have added a link to this issue from Changes under consideration for SLF4J version 2.1.0

I was thinking of something similar to your proposition 2 as an extension mechanism for MDC.

Question: How does proposition 2 help in propagating the MDC context to child threads? Answer: The MDC extension has its own means to extract its own data as appropriate to the current thread, e.g. extreme lengths mentioned earlier.

Thus, if I understand correctly, the MDC extension mechanism allows the user to plugin an external MDC (mapped diagnostic context) into SLF4J's MDC. This achieves the goal of context propagation indirectly by delegating the problem to the extension but without providing a propagation mechanism in SLF4J.

ceki commented 9 months ago

@mosesn As for the scary message, developers who wish to override mdcAdapter that they might also need to call LoggerContext.setMDCAdapter. Up until 1.5.1, LoggerContext.setMDCAdapter did not allow overriding the mdcAdapter. Commit 097da9dd54c8 changes this.

mosesn commented 9 months ago

Question: How does proposition 2 help in propagating the MDC context to child threads? Answer: The MDC extension has its own means to extract its own data as appropriate to the current thread, e.g. extreme lengths mentioned earlier.

Yes, that's exactly right. I'm picturing something like MDC.extend(MDCExtension) that would let a customer register an "external MDC".

@mosesn As for the scary message, developers who wish to override mdcAdapter that they might also need to call LoggerContext.setMDCAdapter. Up until 1.5.1, LoggerContext.setMDCAdapter did not allow overriding the mdcAdapter. Commit 097da9dd54c8 changes this.

Thank you!! I'll definitely use this once it's released.

ceki commented 9 months ago

@mosesn Would like to try to implement?

mosesn commented 9 months ago

Yep, sure! Is there a branch I should do the work in? And is it acceptable to add a method to Slf4jServiceProvider or MdcAdapter, or would you want me to figure out a way to avoid doing that? (I'm thinking about how to communicate to logback that LoggingContext#mdcAdapter should take the MDCExtension into account)

ceki commented 9 months ago

I would suggest "branch_2.1.x".

It is OK to add methods to the MDCAdapter interface as long as they have default implementations do something reasonable or in the worst case don't do anything.

CharlieReitzel commented 8 months ago

Been looking at this exact question recently. We wrote custom encoders for Logback and Log4j2 to provide standard logging contexts for all applications in our division (> 100 apps). It's been working well except for those few developers using reactive APIs.

My thought for this feature is to update the Log4j API to provide logging overloads that include the reactive context. An equivalent to the MDC pattern, e.g. %Context{foo}, could be used to include context variables in the logging output. I think it would be up to developers to populate that context and pass it into Slf4j calls. This relieves Slf4j from having to invent a very difficult wheel.

One problem, is think, is that the RxJava context is not a standard item and is not uniformly supported by all applications. It seems the standard projects, e.g. RxJava, cannot see a good way forward, with a couple attempts at solving the problem abandoned over some years.

- 2015: https://github.com/ReactiveX/RxJava/issues/2885
- 2019: https://github.com/ReactiveX/RxJava/issues/6484

There is this specific Open Telemetry extension for Logback, which is doing the necessary context copying: https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/logback/logback-mdc-1.0/library/src/main/java/io/opentelemetry/instrumentation/logback/mdc/v1_0/OpenTelemetryAppender.java#L88-L102

I haven't used RxJava/Reactor much. But I have done a mess of asynchronous/non-blocking code in C/C++ over the years. I don't see much hope for standardizing "context local" propagation. Thus, the best way to go for the foreseeable future is to make the logging API explicit wrt reactive context.

LvargaDS commented 7 months ago

There is support for context also in reactive/non-blocking approach of processing. At least, project-reactor does have it: https://projectreactor.io/docs/core/release/reference/#context

CharlieReitzel commented 7 months ago

There is support for context also in reactive/non-blocking approach of processing. At least, project-reactor does have it: https://projectreactor.io/docs/core/release/reference/#context

Exactly. What is missing is a logging API that will include attributes from that context similarly to how existing APIs use MDC. A reactive diagnostic context, if you will.

I'm thinking the context is the 1st parameter to a set of overloads, e.g.

logger.debug(ctx, "foo - bar: {}, biff: {}", bar, biff);