Supported way of accumulating metadata down the callstack

apple / swift-log

A Logging API for Swift

https://apple.github.io/swift-log/

Apache License 2.0

3.54k stars 294 forks source link

Supported way of accumulating metadata down the callstack #261

Open weissi opened 1 year ago

weissi commented 1 year ago

Description

I need a way where multiple, independent libraries can accumulate metadata down the callstack. So for a call stack looking like

LibA.someFunc
  -> LibB.anotherFunc
     -> LibC.yetAnotherOne
        -> LibD.loggingFunction

I want LibA.someFunc, LibB.anotherFunc and LibC.yetAnotherOne to be able to accumulate metadata keys that are attached to messages logged by LibD.loggingFunction.

For example if LibD.loggingFunction calls logger.info("hello world"), I would expect the following log message:

DATETIME INFO hello world ["LibA-Key": "A", "LibB-Key": "B", "LibC-Key": "C"]

assuming that LibA.someFunc added LibA-Key and so on.

Given that we now have Baggage and TaskLocals, I would expect this to work without having to pass Logger down the stack manually.

Concrete ask

I propose to create a standard way where all programs & libraries can retrieve and modify a Logger from "the environment" (deliberately vague here). For example, that could be a Logger in a TaskLocal; or a Baggage in a TaskLocal where the Baggage has a well-known baggage.logger key; or something entirely different.

The important bottom line is that each function can add/modify metadata values as well as other properties (such as logLevel) of the Logger.

Why are MetadataProviders not enough?

Metadata providers require (at the MetadataProvider creation time) to know all the metadata that needs to be coalesced. That cannot in general because it's impossible to determine which libraries might get called just to fish out their own MetadataProviders.

Current, working solution (manually passing logger)

This all works totally fine today in this example program

func foo(logger: Logger) async throws {
    var logger = logger
    logger[metadataKey: "foo"] = "\(UUID())"

    for f in 0..<10 {
        try await bar(logger: logger)
    }
}

func bar(logger: Logger) async throws {
    var logger = logger
    logger[metadataKey: "bar"] = "\(UUID())"
    logger.logLevel = .trace

    for f in 0..<10 {
        try await httpClient.get("https://foo.bar/buz", logger: logger)
    }
}

// main
let logger = Logger(label: "my-service")
try await foo(logger: logger)

But please note that I have to manually thread through the Logger.

FranzBusch commented 1 year ago

What do you think about adding just the metadata to the task local but still threading the logger itself around. This still reduces a bunch of logger passing to individual methods where you can just pass a logger to the struct or class`.

We could then add a build in metadata provider that pulls the metadata from this task local.

I am a bit unsure if stuffing the logger itself into the task local is the right thing to do because it then suddenly becomes like a dependency injection system. Furthermore, what do you do if the task local is not set because you jumped in and out of structured concurrency?

IMO, I would love to explore the metadata in task local first and see how that evolves before we stuff the whole logger in there.

Side note: If we do logger inside the task local we can also start providing static log methods that just pull it out and log with the one in the task local.

weissi commented 1 year ago

What do you think about adding just the metadata to the task local but still threading the logger itself around. This still reduces a bunch of logger passing to individual methods where you can just pass a logger to the struct or class`.

Better than nothing but still incomplete IMHO. That won't allow me to change the log level for known problematic areas of the code. It's often useful to say lower the log-level for a given HTTP route or maybe log every 10,000th request in debug/trace.

suddenly becomes like a dependency injection system.

TaskLocals are a dependency injection system. They allow you to essentially pass invisible parameters down the call stack.

Side note: If we do logger inside the task local we can also start providing static log methods that just pull it out and log with the one in the task local.

Yes, that's actually a really good idea. Most logging sites don't need to change or know anything about the logger, they just want to log something, having static methods makes this rather nice.

FranzBusch commented 1 year ago

Better than nothing but still incomplete IMHO. That won't allow me to change the log level for known problematic areas of the code. It's often useful to say lower the log-level for a given HTTP route or maybe log every 10,000th request in debug/trace.

I have been looking at this as well for one of our services. We solved it with a log handler that allows to set the log level dynamically during runtime on a per file basis. I quite like the approach of log handlers doing this instead of doing this at the places of logging.

TaskLocals are a dependency injection system. They allow you to essentially pass invisible parameters down the call stack.

Sure, but it is a bit of a fragile system if you step into unstructured concurrency etc. What would you do if there is no logger in the task local? fatalError or create a new one?

weissi commented 1 year ago

I have been looking at this as well for one of our services. We solved it with a log handler that allows to set the log level dynamically during runtime on a per file basis. I quite like the approach of log handlers doing this instead of doing this at the places of logging.

That's orthogonal. That sets everything to a different level, I want just want to change it for everything down from the current call stack. Not sure what you mean on a per file basis. But in general, LogHandler instances are shared by multiple loggers so you can't really do anything there.

Sure, but it is a bit of a fragile system if you step into unstructured concurrency etc. What would you do if there is no logger in the task local? fatalError or create a new one?

Yes, of course, all implicit systems are fragile (but handy; mind you totally valid opinion to say that you don't like implicit systems and avoid them). But to your question: I think the answer to this one is pretty straightforward: You fall back to a logger that doesn't have any contextual information attached to it. The only context is that we don't have the context. The user is either fine with that in which case this works or the user isn't in which case they should fix their code (either stay in structured concurrency or transport the Logger over).

I'd say a global

private let fallbackLogger = {
   var logger = Logger(label: "fallback-no-context"_
   logger[metadataKey: "context-information-unvailable"] = "true"
   return logger
}()

or something similar will do the job just fine.

FranzBusch commented 1 year ago

That's orthogonal. That sets everything to a different level, I want just want to change it for everything down from the current call stack. Not sure what you mean on a per file basis. But in general, LogHandler instances are shared by multiple loggers so you can't really do anything there.

To expand on this a bit. We have a log handler that has a dynamic log level configuration for each file. The log handler parses the file parameter of the log method and sets the log level appropriately. This allows fine grained control which can be changed during runtime.

Yes, of course, all implicit systems are fragile (but handy; mind you totally valid opinion to say that you don't like implicit systems and avoid them). But to your question: I think the answer to this one is pretty straightforward: You fall back to a logger that doesn't have any contextual information attached to it. The only context is that we don't have the context. The user is either fine with that in which case this works or the user isn't in which case they should fix their code (either stay in structured concurrency or transport the Logger over).

I do like implicit dependency injection systems when they come with compile safety. This often requires code analysis in a build phase, so I think this is out of scope here. The reason why I think having the metadata in the task local is the correct thing is that if you lose that it is not a big problem. However, putting the logger in there means that you also implicitly have the log level. So if we do a fallback to a default logger that one always has the default log level.

However, it looks like this is the thing you want to achieve here the most. Being able to update the log level dynamically from some point on downwards. I am wondering if this might be better solved with something like the above or how log4j handles this. You most likely want to change the log level for debugging crashes/weird behaviours. That is done on an application level mostly so if we can provide a configuration to the top level logger that is injected all the way down we can achieve this as well. Especially thinking about applications that depend on things like gRPC or AHC. If they want to change the log levels of some of the logs inside these libraries the task local logger approach won't help them at all because they have no control over the libs.

weissi commented 1 year ago

To expand on this a bit. We have a log handler that has a dynamic log level configuration for each file. The log handler parses the file parameter of the log method and sets the log level appropriately. This allows fine grained control which can be changed during runtime.

Right, this is orthogonal. I want everything from a certain point onwards down the call stack, even if in a completely different library. For example, assume you have these call stacks:

MyServer
  |
  +-- handleRequest
          |
          +--- LibA.foo --> AsyncHTTPClient.execute
          |
          +--- LibB.bar --> AsyncHTTPClient.execute

in handleRequest, just before the call the LibB.bar I might want to do logger.logLevel = .trace to get all of LibB and AHC's logs as trace. But everything else in the default log level.

Similarly if for example I want to say (in handleRequest): Set the logLevel to trace on all requests where the request UUID starts with AAAA which (assuming perfectly random UUIDs) should log every 64k'th request in trace in full.

weissi commented 1 year ago

how log4j handles this

In SLF4J and Log4j 1.x this is called MDC (Mapped Diagnostics Context), I think in Log4j 2.x it's called ThreadContext. Most Java libraries assume that everything is synchronous and if you have a primitive that crosses threads, then I think you're meant to detach and re-attach the MDC context accordingly. But from a logging perspective, Java's threads locals are what Swift Concurrency's task locals are to us.

MDC's context is more like our metadata but because the Java logging systems pull much more into the API than SwiftLog, you can achieve what I want to do there using the XML config files. Quickest example I could find is this: https://ravthiru.medium.com/dynamically-set-per-request-log-level-using-logback-filters-d1009e3e11fb

I think that's a hack because it requires the application/library and the configuration to cooperate. When we designed SwiftLog, we deliberately made what is called the Logger a ~value type that holds a reference to the LogHandler. [I say a ~value type because we do obviously allow for global log level overrides (say on a signal but that's all irrelevant detail here; it's meant for special use cases outside the core design which is IMHO fine because logging should be semantics neutral, so just because some logging metadata is missing, your program shouldn't suddenly start to misbehave)].

But really Logger is just a bag of metadata, log level and the LogHandler which then eventually holds an (immutable) reference to the underlying logging system that'll them emit the logs into a file/console/.... That way you can express this totally without hacks -- assuming you pass the Logger around, either explicitly or through TaskLocals/Baggage.

FranzBusch commented 1 year ago

Thinking about this some. Overall, I agree we should just lean into making the logger be a task local. This is overall good for the ecosystem since we force people to use structured concurrency.

The problem that I was describing upthread is not solved by this but I think it is orthogonal anyhow. We want to be able to dynamically update the log level of all loggers in a running process without restarting. Furthermore, we actually want to do this on a per file basis. We can make this work right now by just having a custom log handler and setting the root logger's log level to trace. In our custom log handler we can do decide dynamically what log level to use. I would love if this would be supported out of the box from swift-log but I understand that is also a no-goal since it costs performance.

weissi commented 1 year ago

Sounds all good. Yes, what you describe also needs to be supported and it is.

I think it's okay that it's not part of literally swift-log but I think it's bad that it's not part of the SwiftLog ecosystem. We're badly lacking a really good logging "middle end" which should support what you describe and much more. Logging destinations/colouring/... should just be plugged into this "middle end". That way we don't need every LogHandler to reimplement everything and have it nice and configurable without paying the cost in all of SwiftLog. (Unrelated but secretly I'm still dreaming to get https://forums.swift.org/t/zero-cost-service-provider-interface-signature-packages/6935 at some point :) )