serilog-contrib / serilog-sinks-grafana-loki

A Serilog sink sending log events to Grafana Loki
MIT License
199 stars 30 forks source link

Serilog.Sinks.Grafana.Loki causes memory leak on .NET 8 #257

Closed andreiZi closed 6 months ago

andreiZi commented 6 months ago

Which version of Serilog.Sinks.Grafana.Loki are you using?

v8.0.2

Which version of .NET are you using?

net8.0

Describe the bug

Hello,

I recently upgraded my services to .NET 8 and encountered a significant memory leak that caused all my services to crash, with RAM usage spiking to 2-3 GB.

Here are the relevant discussion threads:

Despite resolutions in all reported issues, the problem persisted, and strangely, I seemed to be the only one still experiencing it.

Upon analyzing the dump and comparing it with my .NET 6 applications, I discovered that the symbol:

System.Collections.Generic.Dictionary<System.String, Serilog.Events.LogEventPropertyValue>+Entry[]

was newly introduced and was causing the memory leak. After removing the associated package, all memory leaks ceased across all systems.

I've posted very detailed explanations in the threads linked above, so I recommend going through them.

My hypothesis is that this package attempts to cache logs when Loki is unreachable (we haven’t yet set up Loki in the system). This is similar to the behavior expected with tools like Seq, where if Seq is not operational, it doesn't impact the system—however, this was just an assumption. Interestingly, I also experienced the issue with Loki running, noting that the memory consumption rose slowly but was still present.

To Reproduce

  1. Start a service with the specified package enabled.
  2. Ensure that no Grafana instance is running (note: if Grafana is running, expect the memory leak to grow more slowly).
  3. If running in Kubernetes, set the memory limit high enough to observe the abnormal behavior.

Expected behavior

No memory leak

Log/SelfLog output or exception with stacktrace

Output dumpheap -stat

.NET 6

7fc38b6d5270 41,696  1,000,704 Serilog.Events.ScalarValue
7fc38862e830 22,357  1,073,136 System.Text.StringBuilder
7fc3898ee320 38,939  1,246,048 Serilog.Events.LogEventProperty
7fc38beb6460  7,854  1,958,448 System.Collections.Generic.Dictionary<System.String, Serilog.Events.LogEventPropertyValue>+Entry[]
7fc387cc9190 10,529  3,262,729 System.Byte[]
7fc3872f67d8 22,508 11,251,310 System.Char[]
7fc3872a1038 40,739 19,927,984 System.String
Total 461,985 objects, 57,327,493 bytes

.NET 8

7fca8bca63b8     312    455,082 System.Char[]
7fca906bdb58  22,620    723,840 Serilog.Events.StructureValue
7fca8e28b438  23,462    750,784 Serilog.Events.LogEventProperty
7fca8e53bf28  22,623    904,920 Serilog.Sinks.Grafana.Loki.Models.LokiLogEvent
7fca8ef08948  23,187    932,320 Serilog.Events.LogEventProperty[]
7fca906b7718  22,623  1,809,840 Serilog.Events.LogEvent
7fca906d8bb0  22,623  1,809,840 System.Collections.Generic.Dictionary<System.String, Serilog.Events.LogEventPropertyValue>
7fca8bbd89e0  23,440  1,967,548 System.Int32[]
7fca9024e6b0 191,221  4,589,304 Serilog.Events.ScalarValue
7fca906d9a30  22,905  8,306,688 System.Collections.Generic.Dictionary<System.String, Serilog.Events.LogEventPropertyValue>+Entry[]
7fca8bbdd7c8 134,245 26,716,442 System.String
7fca8c769800   4,299 50,249,261 System.Byte[]
560b529c1790  13,560 75,716,040 Free

Application or code sample, which could be used to reproduce a bug

No response

Additional context

No response

I have read the documentation

mishamyte commented 6 months ago

Hi @andreiZi,

Could I ask you to share your configuration for that sink?

Caz for me it looks like #141.

And looks like it's not related to .NET version (probably)

andreiZi commented 6 months ago

Sure

 var loggerConfiguration = new LoggerConfiguration()
            .MinimumLevel.Is(isDev ? Serilog.Events.LogEventLevel.Verbose : Serilog.Events.LogEventLevel.Information)
            .Enrich.FromLogContext()
            .WriteTo.GrafanaLoki(serilogLokiUrl)
            .Enrich.WithProperty("AppName", appName)
            .ReadFrom.Configuration(configuration);
andreiZi commented 6 months ago

Referring to issue #141, it seems related to the problem I'm experiencing. It took some time and debugging to trace the cause, but I'm glad to have found it. 😄

Is there a workaround available, or must I ensure that my Grafana is always operational to prevent this issue?

mishamyte commented 6 months ago

I would propose to set the reasonable limit during the log configuration. It could be done by queueLimit parameter all parameters are described there.

The only reason why it was not done in v8 because it's a breaking change of the default behavior and there is a way to configure it. But in v9 it would be for sure changed

andreiZi commented 6 months ago

Thank you I will do this, you can close this issue as it seems to be resolved and is a duplicate of #141. If i notice something is still off I will document my findings here. :)

Thank you for the help & fast response tho!

github-actions[bot] commented 6 months ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue.