Consider replacing custom Actor based logging with SLF4J

RichardBradley commented 8 years ago

Currently, Akka has a custom logging framework based on Actor messages which has its own logging API which looks very similar to SLF4J.

This can be bridged to SLF4J / Logback as described in the docs: http://doc.akka.io/docs/akka/snapshot/java/logging.html#SLF4J

This reimplementation of SLF4J causes a few problems:

API confusion and edge case bugs arising from the API being not-quite the standard, detailed at https://github.com/akka/akka/issues/16745
Edge case bugs arising from trying to log info about an ActorSystem while it is shutting down using Actors that belong to that system, detailed at https://github.com/akka/akka/issues/17010
- (a workaround is already in place to log messages that would otherwise be lost to this bug to stdout)
Extra complexity in the Akka codebase that seems irrelevant to Akka's goals

I think the best fix would be to remove the Actor-based logging and replace it with direct calls to Logback / SLF4J, with Logback configured to be non-blocking (Logback has solid support for async logging, see akka docs). This would solve the bugs linked above and reduce Akka complexity (there's no benefit to having Akka declare its own set of LogLevel enums when SLF4J already does so). See the comments on https://github.com/akka/akka/issues/16745.

I think https://xkcd.com/927/ applies -- Akka doesn't need it's own logging API; it should use SLF4J.

Of course this would be a massive breaking change :|

Thoughts? One for Akka 3 / Akka 2.next? One to write off as "that ship has sailed, sorry"?

guidomedina commented 8 years ago

I have a project which I have my own logging mailbox which is GC friendly by using https://gist.github.com/guidomedina/b71b81bedd69c64a7b65 and also I don't want to have Slf4j as a dependency.

Akka has plenty of OS plugins and one of them is akka-log4j which uses Log4j2, I don't like the idea of having LMAX disruptor also for asynchronous logging for Log4j2 so I use my own dispatcher, my own mailbox and the nice buffered file appenders, logging dispatcher can be configured to use any dispatcher or a pinned-dispatcher if you want.

As you can see in that Gist, STD out is used when draining such mailbox and it is GC friendly, Log4j2 has gone the GC friendly path because one of the main authors uses it in the financial industry so it is completely GC friendly, then I see no benefit of just adding proxy calls when I don't need it.

Akka also has akka-slf4j plugin...

rkuhn commented 8 years ago

Thanks for bringing this up! This precise question has been considered in some depth back in the 1.3 timeframe. Without wanting to answer it I can give the historical perspective.

There were two main reasons for creating a custom logging API:

logging must be fully asynchronous lest it destroy the nice properties of an actor system; at the time the options were quite limited in terms of existing asynchronous logging frameworks
akka-actor was (and still is) developed as a foundational library, one that other library authors can use to build interesting functionality on; this means that it must not have more dependencies than absolutely necessary, especially none that may constitute a choice that restricts the end-users’ freedom

The first reason has become weaker over time while the second reason is decided by how Akka is intended to be used. Becoming more opinionated (i.e. framework-like) would be a strategic decision. This would be unproblematic only in the case where there is exactly one uncontroversial choice of logging framework.

patriknw commented 7 years ago

Thanks for bringing this up. It has been raised a number of times. It's a pitty that there is not a good logging API in the standard libray that everyone is using. That is the gap that slf4j tries to bridge. Does slf4j have widespread use in all other important libraries?

I don't see that we can change this before Akka 3.0, but then it should absolutely be considered.

hseeberger commented 7 years ago

As far as I can tell, Log4J development and community are much more vibrant than SLF4J. I'd rather consider using Log4j directly for Akka 3.

He-Pin commented 7 years ago

+1 for log4j2

patriknw commented 7 years ago

We can't pick one implementation and only support that, because:

there will always be users and libraries that want to use another implementation
in 3 years there is a high risk that another implementation is more vibrant

Therefore we need the separation of the api and the pluggable backend implementation.

If slf4j is not that api then Akka must define the api. That is what Akka already does with LoggingAdapter. The missing piece is that the actual implementation of the LoggingAdapter is not pluggable. Making that pluggable would make it possible to use async logging with log4j2 or logback without the actor based backend. This can be done in 2.4.x

One can ask the question if Akka need a logging api (LoggingAdapter) at all or if it would be enough with a pluggable backend used by the internal logging.

For end user applications the Akka logging api is not important. They can use whatever logging api directly (e.g. Log4j2 api).

The reason Akka needs to define a logging api is because libraries built on top of Akka need the same pluggable backend as Akka internals, i.e. Akka must define an api for it.

guidomedina commented 7 years ago

Though I wouldn't eliminate the possibility of asynchronous logging using actors, why? Log4j2 requires LMAX disruptor dependency for asynchronous logging, while it is very efficient, it is yet another dependency and TBH, I'm already using a Log4j2 high performance queue with log GC if logging frequency is high enough:

https://gist.github.com/guidomedina/b71b81bedd69c64a7b65

All I needed to do was to define my own logger mailbox which such queue like in the example below, that's something that Akka does well so plug-ability would be awesome, but I wouldn't want the plugins to lose the ability of using actor based queues.

Kreinoee commented 7 years ago

I would say that slf4j has widespread use in libraries and other java/scala projects. This blog: http://blog.takipi.com/githubs-10000-most-popular-java-projects-here-are-the-top-libraries-they-use/ is indicating that slf4j maybe is the most used java library at all (or at least in 2013 where the post was written).

In my projects I use slf4 directly, because it has turned out to be much simpler in the long run. As a concrete example of where I have seen problems is this:

In many of my applications i have severel MDC values, like request-id, calling-user and so on. I would like these values to be part of every logging message emitted in the processing of that request. In an none async application MDC works perfectly for this. For making it work with in my async application, I have created a RequestLogger class that takes a map of mdc values on initialization. The class the have same methods as the slf4j Logger class, but each methods has an extra implicit Logger param. When called, it sets the mdc values on the Logger, logs the messages, and removes the mdc values again.

This works pretty well for me, and it works in standard async future bared scala apps, actor based apps, and even stream, as long as I get the RequestLogger object parsed arround. But if my akka based system used akka logging, this would not work anymore. Natuarlly I could just create yet another RequestLogger class that supported Akka logging, but that would not work outside of Akka, and in case of a project using both akka, and standard scala code based where an Akka LogginAdapter is hard to get, we would need to convert between the two very often.

I now this is not a huge problem, and its easy to work arround by just using slf4j directly, but then again, why should Akka then have its own logging facade, if its actually a better option to use the more widespread used alternative.

rkuhn commented 7 years ago

@Kreinoee The reasoning on this question has not changed since my post from last October: adding a dependency to akka-actor restricts all users of this library, hence doing it is a strategic change. Slf4j may be popular, but the more problematic point is that Akka would fix a version, which will inevitably collide with existing projects’ use of Slf4j. As far as I can see there is no strong reason for doing this change because users are free to employ Slf4j already today, including MDC values (I remember reviewing the implementation many years ago).

hseeberger commented 7 years ago

Also, SLF4J (and Log4j) both can be configured such that logging happens asynchronous and non-blocking, but that's just an option. Therefore I consider switching to such an implementation very dangerous.

Kreinoee commented 7 years ago

It's without doubt problematic to use slf4j directly, and then configure log4j or logback in a none async way, but this is also a danger when using Akka logging. Its pretty easy to make a project that otherwise use Akkas logging, implicit use slf4j directly, by using code from libraries inside your actors. A typically project today is using a lot of libraries, and there a big chance that some of these libraries is logging via slf4j. In such a situation it's just as dangerous to not configure the logging backend to be async, but the need of doing so is far less obvious.

He-Pin commented 7 years ago

but I still like it to be pluggable, the only thing I don't like is the SPI is based on actor

I more prefer one based on trait and with the default implemented by actors.

hseeberger commented 7 years ago

@Kreinoee from the perspective of someone who uses Akka to build some system, you are right: Akka can't solve all problems for you. But as a library Akka itself – as stated elsewhere – must avoid/minimize several risks and therefore neither can directly use a potentially blocking logging solution nor lock into a single non-standard implementation.

rolandjohann commented 6 years ago

In a distributed environment where there are log collectors relying on specific log formats (e.g. ELK) it is problematic to emit potentially critical log entries with invalid format. I faced this especially with Akka Stream Kafka Sources, where there are thread dumps on Actor System shutdown which won't be encoded as json log entry having multiple hundreds of lines. Disabling stdout-logging completely is not an option either.

Being able to specify the implementation of StdOutLogger or at least how it actually behaves on print would solve the issue. Then there can be custom implementations that can rely on whatever logging framework and the core lib is still independent.

Is this issue/repo the right place to discuss that or should that be moved elsewhere?

johanandren commented 6 years ago

I think that is different from this topic, please open an issue over in the akka/akka issue tracker about that @rolandjohann

wsargent commented 5 years ago

My thoughts:

SLF4J is not inherently a blocking API.
It is by far the most widely used API.
SLF4J API contains the Marker API, which is awkward to use in Akka (must use Logging.withMarker, etc) https://doc.akka.io/docs/akka/current/logging.html?language=scala#using-markers
Using SLF4J API with the async appenders from logstash-logback-appender is just as fast as Log4J2, but even there Log4J2 can be used as the engine.

I think it would make things much easier for people to have akka-slf4j be the configured default, and provide a configured Logback / Log4J2 with async appender as available examples.

hseeberger commented 5 years ago

@wsargent the point is that SLF4J/Logstash like other logging libraries are not non-blocking, you have to configure it correctly (and most defaults are blocking as far as I know).

RichardBradley commented 5 years ago

I think you are conflating the SLF4J logging API with its backend implementations.

The SLF4j API is neither blocking nor non-blocking, it is just an entry point. The backend logger implementation is completely pluggable. Many common backends of SLF4J are blocking, but there are good quality non-blocking backends easily available (and maintained outside of AKKA, which would remove a maintenance burden from AKKA if it were able to use that instead of implementing its own with all the bugs and edge cases I listed at the top of this thread).

Moreover, many libraries are compiled against the SLF4J logging API (or compatible APIs that can be redirected into SLF4J, such as log4j, log4j2 and java.util.logging). For AKKA to help its users avoid accidental blocking I/O via logging, it ought to capture all logging and redirect it to its own non-blocking backend. This is easier if an AKKA-backed project has taken over the SLF4J API than if it uses its own custom logging API, as it does at present. (This is the same point that @Kreinoee describes above)

wsargent commented 5 years ago

@wsargent the point is that SLF4J/Logstash like other logging libraries are not non-blocking, you have to configure it correctly (and most defaults are blocking as far as I know).

@hseeberger SLF4J is an API, it's neither blocking nor non-blocking. There is no default logback configuration: it's all custom to individual workloads, just as Akka's dispatchers must be configured and tuned for individual workloads -- and a copy/pasted logback.xml is just as misconfigured as having a single global dispatcher for Akka as a default would be. @RichardBradley is correct on all points.

guidomedina commented 5 years ago

@wsargent Akka default logging facility logs via another actor which makes it non-blocking, and then there is the akka-slf4j and akka-log4j plugins which are extensions of such actor logging functionality so I honestly don't see the point of having Akka to directly use slf4j

The current implementation allows me to plugin low GC footprint logging mailboxes, here is an example with log4j v2, I don't have to worry and configure my logging API to be async: https://gist.github.com/guidomedina/b71b81bedd69c64a7b65

hseeberger commented 5 years ago

@RichardBradley that's exactly the point: "The SLF4j API is neither blocking nor non-blocking". Like Seq neither being immutable nor mutable in Scala. Just wrong.

dwijnand commented 5 years ago

Does Akka using a not non-blocking API like SLF4J affect its correctness?

RichardBradley commented 5 years ago

guidomedina: I honestly don't see the point of having Akka to directly use slf4j

I outlined the benefits of this at the top of this thread. If you do not agree, that is fine, but stating multiple times that you do not need it does not advance this discussion.

To recap, the benefits would be:

Avoiding logging bugs: having the logging happen inside the actor system is a circular dependency that makes logging startup and shutdown difficult and has led to bugs in the past
Allows AKKA users to use a familiar, standard, well tested and well documented logging API, rather than having an AKKA-specific logging API that is nearly but not quite the same
Makes working with third party code that does its own logging much easier and much more consistent. This is discussed at length above.
Less complexity in AKKA: no need to declare and maintain an independent logging API

hseeberger: that's exactly the point: "The SLF4j API is neither blocking nor non-blocking". Like Seq neither being immutable nor mutable in Scala. Just wrong.

This is nonsensical and does not advance the discussion. Neither AKKA's logging api, e.g.

log.error(reason, "Restarting due to [{}] when processing [{}]",
  reason.getMessage(), message.isPresent() ? message.get() : "");

(example from here) nor SLF4j's API, e.g.

logger.debug("Temperature set to {}. Old temperature was {}.", t, oldT);

(example from here) have any "blocking" or "non-blocking" attributes at the API level. It is also true that the Seq API is neither immutable nor mutable, but that seems a rather strange thing to say and completely beside the point. Please let's not digress into a discussion about the Scala collections API.

guidomedina commented 5 years ago

Well, it has been almost 3 years, reasons have been given in favor and against, the fact that it is nice for some doesn't mean it doesn't have drawbacks for others, Akka by nature is and encourages non-blocking APIs, also; there is principle of minimal dependencies stated already by Akka team members.

It is true that slf4j is just a bridge, it is neither blocking nor non-blocking, but the target logging facility is blocking by default which violates one of the main points of Akka actors approach which is non-blocking.

Also, slf4j doesn't take advantage of optimizations done at log4j v2, I wouldn't mention log4j v1 because is dead, truly, today you could only consider logback and log4j v2 as logging APIs, there are many improvements made to log4j v2 which are better off being used by akka-log4j directly, one of these main benefit is a huge which is string optimization when using place holders, I don't think you have that optimization present when using slf4j.

Hopefully opposing your request because it can affect others counts as "advancing the discussion"...

wsargent commented 5 years ago

I can add something more substantive to this discussion: I wrote a logging API that sat on top of SLF4J and provided implicit context, so things like correlation ID would be added automatically. This worked fine through 99% of the codebase, but required multiple fixes and a patch to Akka to get it to work correctly. This continues to be a problem.

Providing a direct SLF4J mapping would make this simpler.

wsargent commented 5 years ago

Also, slf4j doesn't take advantage of optimizations done at log4j v2,

This doesn't mean that akka-log4j can take advantage of optimizations done at log4j2 v2 though, since it's going through the Akka Logger API. Unless you use the log4j2 API directly, you'll still be using an adapter in between, and at that point I believe the performance should be entirely dependent on the appender.

I don't see any benchmarks for akka-log4j with an async appender vs akka-slf4j using logstack-logback-encoder with an async-appender, but I'm willing to bet it's a wash.

guidomedina commented 5 years ago

@wsargent the last Akka application I developed I didn't use akka-slf4j but akka-log4j plugin which uses log4j2 directly, I also added the slf4j-to-log4j2 jar so that 3rd party APIs would use log4j as well, though the logging of these were minimal.

Log4j provides also a very optimized file appender which combined with non-blocking queues is the perfect formula for applications that require low-ish latency, again; this might not be the case for all applications but it is a real thing for the financial industry which one of the main authors of log4j works for hence his emphasis on this "minor" detail.

Don't take my word for it, you can read about their benchmark at their own website: https://logging.apache.org/log4j/2.x/performance.html

jrudolph commented 5 years ago

Thanks for having this discussion.

We should probably keep these two things apart:

How Akka itself logs (and how to fix ActorSystem startup/shutdown logging problems)
What API users of Akka should use

Right now the recommendation is to use Akka's logging infrastructure for both.

In https://github.com/akka/akka-meta/issues/31#issuecomment-253976525, @patriknw gave a good overview why Akka itself needs some abstraction for logging (there's precedent for that: netty also provides abstractions around a set of logging APIs for its internal logging needs).

As we can see in this ticket, there's some debate what to recommend to users to use. I personally mostly use Akka's logging facilities for application logging for convenience but I definitely understand its downsides. I'm not even completely convinced that a non-blocking logging API is mandatory. A blocking synchronous logging API can slow down execution and lead to bad CPU utilization. However, that depends a lot on the implementation. A bigger problem is that there's the risk of contention between different logging threads which is usually solved in a better way for asynchronous logging. For small logging loads it probably doesn't matter. For high logging loads, there's a problem anyway because as soon as you log more than the logging backend can process, you will have to either throw away log messages or backpressure the loggers. As we know, synchronous APIs deal well with backpressure while asynchronous ones need to be explicitly aware of these issues.

In case of logging more than the backend can process, asynchronous loggers can run out of memory if they are not well configured to drop messages in that case. I've seen Akka applications fail fatally with OOM because of log messages being buffered indefinitely.

With that in mind, you could say that Akka's default logging API is non-blocking by default (good) but not necessarily the best implementation of it (can fail with OOM).

Regarding performance: Log4j2 has some impressive numbers but think about whether they really matter for your use case. Even if the API can process 18,000,000 msg/s, you still need a backend that can process those. Still, less CPU needed is a good thing.

I wonder, if an Akka logging backend for Log4j2 can actually realize those speeds though, as the hard thing in an asynchronous logger is managing the contention between logging threads. Putting the event bus in front of the API means that you use Akka's "synchronization" facilities (logger mailbox) instead of Log4j2's which will see no contention because it's a single actor that will push all messages though Log4j2's API.

We haven't come to a conclusion whether we want to change the recommendation in the future. I personally would say, that an application should choose whatever logging API they see fit.

guidomedina commented 5 years ago

@jrudolph that's a very good reasoning, I agree that without any back pressure things can go horribly wrong, that's why I sent a PR long time ago which is already part of Akka that allows the user to change the logging actor mailbox to whatever you please, for the financial industry application I developed I used this to change to a non-blocking bounded queue which will drop messages if over 10K, that combined with akka-log4j and the optimized random file appender log4j2 offers, it gave me very good results, but that's my experience which is bias.

I believe the low level + freedom to do whatever you need or your application requires is a fundamental principle here and should probably be the main driver for any decision to be made.

wsargent commented 5 years ago

Don't take my word for it, you can read about their benchmark at their own website: logging.apache.org/log4j/2.x/performance.html

@guidomedina those benchmarks are against logback's in-house async appender. I'm talking about benchmarking against the LMAX disruptor based async appender.

Regardless of whether you're using akka-log4j or akka-slf4j, you're still going through the Akka Logger API itself, and so any logs written through that API are not going to get ParameterizedMessage and the like.

EDIT: ran a series of JMH benchmarks to show there's no difference between Log4J 2 and Logback here -- https://github.com/wsargent/slf4j-benchmark/blob/master/README.md

wsargent commented 5 years ago

How Akka itself logs (and how to fix ActorSystem startup/shutdown logging problems)

What API users of Akka should use

Right, I should be more clear.

I don't personally care what Akka uses under the hood. I find it awkward that the only way to get decent logging out of Akka right now is to set logging both in application.conf, especially setting loglevel to DEBUG:

akka {
  loglevel = DEBUG
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  logging-filter = "akka.event.slf4j.Slf4jLoggingFilter"
}

https://doc.akka.io/docs/akka/current/logging.html?language=scala#slf4j

That's unnecessary. Adding the relevant loggers and filters should be managed automatically by adding akka-slf4j to the classpath.

It's more understandable that you have to specifically set up loggers in logback.xml:

<!-- Set logging for all Akka library classes to INFO -->
<logger name="akka" level="INFO" />
<!-- Set a specific actor to DEBUG -->
<logger name="actors.MyActor" level="DEBUG" />

There's an issue in that Logback defines both logging level and logger appenders at the same layer, but that's fixable. Setting up an async appender is not the default, but you can have an akka-logback that references an include directive that will contain async appenders, and so provide useful defaults out of the box.

But those are all nice to haves.

What is more central to me as an SLF4J end user is that there's no out of the box way to represent a LoggingAdapter into an instance of org.slf4j.Logger, and the logging library in Akka and the directives in akka-http do not go out of their way to help. With an SLF4J logger in scope, even if it were simply a bridge on top of LoggingAdapter, that would solve 90% of the issues for me personally.

wsargent commented 5 years ago

Wrote up a project showing use of a "default" logback with SLF4J structured logging: https://github.com/tersesystems/terse-logback

Note the ProxyContextLogger in https://github.com/tersesystems/terse-logback#markers -- without an instance of org.slf4j.Logger, I can't capture the additional context. So the LoggingAdapter needs an SLF4J proxy to match APIs that work on top, like slf4j-fluent.

johanandren commented 5 years ago

Based on these discussions we (the akka team) are discussing/looking into if we can leverage SLF4J in Akka Typed. There are quite a few factors to take into account though (not postponing a final release of the typed APIs for one). The untyped logging APIs will very likely not change.

Two paths we have discussed:

have the typed logger (available from ActorContext.log) extend the SLF4J logging API but back it with the current logging infrastructure - this would likely not fulfill all wishes in this discussion, but some of them and it would allow us to keep the current testing tools etc.
rely completely on SLF4J for the logging in typed actors - this means we would have to find ways to rewrite the testing tools (EventFilter and the log queue in BehaviorTestKit) to capture SLF4J log events

Both seems doable, but the question is how much work it is. Help from the community exploring the two options is welcome.

RichardBradley commented 5 years ago

we would have to find ways to rewrite the testing tools (EventFilter and the log queue in BehaviorTestKit) to capture SLF4J log events

There is good support for this already. Several options are discussed at https://stackoverflow.com/questions/29076981/how-to-intercept-slf4j-logging-via-a-junit-test

raboof commented 5 years ago

we would have to find ways to rewrite the testing tools (EventFilter and the log queue in BehaviorTestKit) to capture SLF4J log events

There is good support for this already. Several options are discussed at https://stackoverflow.com/questions/29076981/how-to-intercept-slf4j-logging-via-a-junit-test

Yes - these are great building blocks and I'm fairly confident our testing tools could be implemented on top of those - but it will take some work ;).

Falmarri commented 5 years ago

One of the big benefits of using a logging API (or modifying akka's) is to not pre-format strings. For example, logging logger.error("some {}", "error") gets converted to "some error" before it gets to the logging framework. That means we can't use custom appenders to handle the log messages, specifically to understand the difference between templates. For example, sentry.io groups events based on the log template https://docs.sentry.io/data-management/rollups/?platform=java#grouping-priorities which means that all errors coming from within akka will never be grouped

wsargent commented 5 years ago

Added https://github.com/wsargent/slf4j-benchmark/blob/master/README.md to show the difference in latency when using string formatting, and comparing Logback vs Log4J 2.

The JMH test covers the 99.9% average assuming normal distributions, but doesn't worry about spikes or throughput. In short, we assume that the application doesn't log at a rate of 128K messages/second, is not I/O or GC bound, and doesn't have a backed up CPU work queue, i.e. CPU utilization is less than 70%.

For Logback, when logging is disabled:

Using a conditional statement such as if (logger.isDebugEnabled) { ... } takes 1.6 nanoseconds.
A raw statement such as logger.debug("hello world") takes 1.8 nanoseconds.
A statement that uses string interpolation or string logger.debug("hello " + name) takes 60 nanoseconds.

For Logback, when logging is enabled, CPU time depends heavily on the appender:

With a no-op appender, logging takes between 24 and 84 nanoseconds.
With a disruptor based async appender logging to no-op, between 150 and 350 nanoseconds.
With a straight file appender with no immediate flush, between 636 and 850 nanoseconds.

For Log4J 2, CPU time also depends on the appender:

With a no-op appender, logging takes between 135 and 244 nanoseconds.
With a disruptor based async appender logging to no-op, between 860 and 1047 nanoseconds.
With a straight file appender with buffered IO and no immediate flush, between 307 and 405 nanoseconds.

There's no huge difference between Log4J 2 and Logback. 1000 nanoseconds is 0.001 millisecond. A decent HTTP response takes around 70 - 100 milliseconds, and a decent HTTP framework will process around 10K requests a second on an AWS c4 instance. If you're using event based logging, then you'll generate 10K logging events per second, per instance, and then you'll also have a smattering of errors and warnings on top of that in production.

patriknw commented 5 years ago

For everybody interested in this topic I can announce that we have made the switch to SLF4J in Akka Typed version 2.6.0-M7. Give it a try and let me know if you have questions or feedback. If you are interested in the details they can be found in PR https://github.com/akka/akka/pull/27552

We were able to pull off some nice testing utilities:

LoggingEventFilter for testing of logging events matching some criteria are emitted
LogCapturing to silence logging output until there is a failure and then emit buffered events

Reference docs are still lacking, but will be added shortly.

ennru commented 4 years ago

The docs are to be found at https://doc.akka.io/docs/akka/current/typed/logging.html

akka / akka-meta

Consider replacing custom Actor based logging with SLF4J #31