akkadotnet / akka.net

Canonical actor model implementation for .NET with local + distributed actors in C# and F#.
http://getakka.net

Check logging system for memory leaks #1734

Closed Aaronontheweb closed 8 years ago

Aaronontheweb commented 8 years ago

Related to #1724. I've seen lots of historical reports indicating OOM issues caused by logging over long periods of time. It smells a lot like messages aren't being properly freed for GC after being logged.

Might be worth writing an NBench test to detect if memory allocation increases when pushing messages onto the EventStream.
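Such a spec might look roughly like the following sketch, using NBench's attribute-based API. The class name, iteration count, and the 64 MB allocation budget are all hypothetical placeholders, not values from this issue:

```csharp
using Akka.Actor;
using Akka.Event;
using NBench;

// Hypothetical spec: publish a burst of log events onto the EventStream
// and assert that total allocations stay within a fixed budget.
public class EventStreamMemorySpec
{
    private ActorSystem _system;

    [PerfSetup]
    public void Setup(BenchmarkContext context)
    {
        _system = ActorSystem.Create("EventStreamMemorySpec");
    }

    [PerfBenchmark(Description = "Memory should not grow unbounded while logging",
        RunMode = RunMode.Iterations, NumberOfIterations = 3, TestMode = TestMode.Test)]
    [MemoryAssertion(MemoryMetric.TotalBytesAllocated, MustBe.LessThanOrEqualTo, 64L * 1024 * 1024)]
    public void EventStream_should_not_leak(BenchmarkContext context)
    {
        for (var i = 0; i < 100000; i++)
        {
            _system.EventStream.Publish(
                new Debug("source", typeof(EventStreamMemorySpec), "message " + i));
        }
    }

    [PerfCleanup]
    public void Cleanup()
    {
        _system.Terminate().Wait();
    }
}
```

If log events were retained after processing, `TotalBytesAllocated` would keep climbing across iterations and the assertion would fail.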

object commented 8 years ago

I managed to reproduce the problem (described in #1724) in a tiny app. Here's the repo: https://github.com/object/AkkaSerilogTest

On our development machines, once I start that console app, CPU usage stays at 30-40% and memory consumption quickly goes up until the program runs out of memory.

object commented 8 years ago

I can try to locate the error (will have some time in a few days). I would appreciate pointers on how to create NBench tests.

Danthar commented 8 years ago

I built a small testbed which verifies that we have an issue: https://github.com/Danthar/AkkaLoggingMemLeak

It looks like we are holding onto the log-event instances. More research is needed, but so far I have ruled out any specific logging plugin, since it happens with the default logger as well.

I'm going to further reduce the testbed until I have a minimal viable test case.

Danthar commented 8 years ago

OK, so after https://github.com/akkadotnet/akka.net/pull/2086 was merged, this issue no longer occurs.

Basically, what was happening is that under high-pressure scenarios, log events were somehow being retained by the system even after they were processed by the logger. By high-pressure scenarios I mean: I simulated a scenario where 100k messages are being logged every second.
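A load generator for that kind of scenario can be sketched roughly like this (a minimal console app; names are hypothetical and this is not the actual testbed code from the linked repo):

```csharp
using System;
using System.Threading;
using Akka.Actor;
using Akka.Event;

class LoggingPressureSimulation
{
    static void Main()
    {
        var system = ActorSystem.Create("pressure-test");
        var log = Logging.GetLogger(system, "simulation");

        // Burst 100k log messages once per second; observe process
        // memory externally (e.g. with a profiler or Task Manager).
        while (true)
        {
            for (var i = 0; i < 100000; i++)
                log.Info("pressure message {0}", i);

            Thread.Sleep(TimeSpan.FromSeconds(1));
        }
    }
}
```

With a leak of the kind described here, stopping the bursts would not bring memory back down even after the logger drains its queue.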

Testing with the Serilog, NLog, slf4net and default loggers would show this problem (although the default logger would show it a lot less).

This is heavy duty for any logger, especially when logging to the local file system, so some memory bloat is not unexpected. However, in the 1.0.8 release, after stopping the simulation and waiting for the logger to finish processing, memory usage would not drop down, and the messages would seemingly not be freed from the internal ConcurrentQueue. I say seemingly because the ConcurrentQueue was quickly ruled out as the culprit by way of elimination testing on my part, some internal discussions with the core members, and the fact that we have a large number of specs which would otherwise be having all sorts of issues.

With the merging of the aforementioned PR, memory usage now properly drops down after the loggers have caught up and processed all their items.

The important change in that PR is that all the Mailbox implementations were consolidated into a single implementation, where only the message queue implementation (by way of IMessageQueue) can vary.
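That design can be illustrated with a simplified sketch. This is not Akka.NET's actual source; the interfaces and classes below are reduced to the essentials to show one Mailbox with a pluggable queue:

```csharp
using System;
using System.Collections.Concurrent;

// Simplified illustration: one Mailbox class, with the queueing
// strategy swapped out via an IMessageQueue abstraction.
public interface IMessageQueue
{
    void Enqueue(object envelope);
    bool TryDequeue(out object envelope);
}

public sealed class UnboundedMessageQueue : IMessageQueue
{
    private readonly ConcurrentQueue<object> _queue = new ConcurrentQueue<object>();

    public void Enqueue(object envelope) => _queue.Enqueue(envelope);
    public bool TryDequeue(out object envelope) => _queue.TryDequeue(out envelope);
}

public sealed class Mailbox
{
    private readonly IMessageQueue _messageQueue;

    public Mailbox(IMessageQueue messageQueue) => _messageQueue = messageQueue;

    public void Post(object envelope) => _messageQueue.Enqueue(envelope);

    public bool TryProcessNext(Action<object> handler)
    {
        if (!_messageQueue.TryDequeue(out var envelope))
            return false;
        // Once handled, the mailbox holds no further reference to the
        // envelope, so it becomes eligible for GC.
        handler(envelope);
        return true;
    }
}
```

With a single Mailbox implementation, there is exactly one code path that can retain (or fail to release) processed envelopes, which makes leaks like this one much easier to rule out.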