serilog / serilog-sinks-async

An asynchronous wrapper for Serilog sinks that logs on a background thread
Apache License 2.0
218 stars 31 forks source link

Async sink lock contention #86

Open JanEggers opened 1 year ago

JanEggers commented 1 year ago

im currently investigating lock contention issues in our app due to seeing "high" latency in our app.

(I borrowed the Blocking detector from ben adams)

one thing that pops up is the async sink:

 at System.Threading.SynchronizationContext.InvokeWaitMethodHelper(SynchronizationContext syncContext, IntPtr[] waitHandles, Boolean waitAll, Int32 millisecondsTimeout)
   at System.Threading.Monitor.ReliableEnter(Object obj, Boolean& lockTaken)
   at System.Threading.SemaphoreSlim.Release(Int32 releaseCount)
   at System.Threading.SemaphoreSlim.Release()
   at System.Collections.Concurrent.BlockingCollection`1.TryAddWithNoTimeValidation(T item, Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at Serilog.Sinks.Async.BackgroundWorkerSink.Emit(LogEvent logEvent)
   at Serilog.Core.Sinks.SafeAggregateSink.Emit(LogEvent logEvent)
   at Serilog.Extensions.Logging.SerilogLogger.Write[TState](LogEventLevel level, EventId eventId, TState state, Exception exception, Func`3 formatter)
   at Serilog.Extensions.Logging.SerilogLogger.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)
   at Microsoft.Extensions.Logging.Logger.<Log>g__LoggerLog|13_0[TState](LogLevel logLevel, EventId eventId, ILogger logger, Exception exception, Func`3 formatter, List`1& exceptions, TState& state)
   at Microsoft.Extensions.Logging.Logger.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)
   at Microsoft.Extensions.Logging.LoggerExtensions.Log(ILogger logger, LogLevel logLevel, EventId eventId, Exception exception, String message, Object[] args)

did you consider using immutable collections to avoid locking?

the lock is taken for up to 20ms which is not ideal

bartelink commented 1 year ago

See https://github.com/serilog/serilog-sinks-async/issues/82 and #63 (and maybe https://stackoverflow.com/a/76579110/11635)

The current impl is terse, legible and has clear tests. IME it also has good throughput - likely not hindered by the fact that the consumer end of that lock permanently occupies a thread (and likely warm L1/L2 cache space) via GetConsumingEnumerable.

If it was being written today, Channels might be the goto solution. The cited Channels based Sink may be worth a try to validate how much better this can be. Either way, the constraints are that fundamentally a) multiple threads need to feed writes into the same channel for a consumer to route them. b) in some cases, for better or for worse, people might actually want the blocking behavior

Based on the above, I'd say the best path to answering this is to fork the sink (or try the other one), validate it in your context, and report back the potential improvement. If a significant improvement is possible, it may be interesting. But this Sink is definitely an example of something you don't randomly 'refactor' without plenty proof via extensive benchmark.net tests backed by use in prod before such a change would make it into the dev branch, much less the master.

Also, this 20ms of which you speak... Yes that's a long time, but: what version of the FW/netcore, what is the average, does that repro in the small with Benchmark.net and what is the perf of some proposed alternative.

JanEggers commented 1 year ago

@bartelink i saw that issues but they point (at least for me) to 404. ill create implementations both on channel and immutable collection so we can discuss if the gains are worth it

bartelink commented 1 year ago

Ah, pity about the 404. Maybe @ the author in 82 and someone (I have permissions but would prefer not to do it) can come and close it if there is no response. Not sure how Immutable Collections will work as it needs to be a queue that has multi writers and a single reader that can drain it (and be monitored and/or block if someone wants blocking behavior), but you can answer that in code I guess.

JanEggers commented 1 year ago

im not able to run the provided latency benchmarks

SelfLog triggered ExitCode != 0

any advice?

bartelink commented 1 year ago

Sorry, unfortunately none: I've not personally run/looked at them (and can't even recall if they existed when I did some monitoring stuff in this repo ca 2017)

JanEggers commented 1 year ago

ill try and debug...

JanEggers commented 1 year ago

I implemented prototypes for both versions i envisioned one with channels and one with immutable list

https://github.com/JanEggers/serilog-sinks-async/tree/prototypes

channels look promising as they seem to improve the throughput by 4x.

but in regards to latency they are all slower than the current version and i had to drop support for net 4.5 and netstandard 1.1 to even be able to compile.

i still need to create some kind of benchmark that takes multithreading into account to generate the lock contention im seeing in my real world app

timstokman commented 5 months ago

Just adding some data, when I add Async around the console sync:

logConfiguration.WriteTo.Async(l => l.Console());

Instead of:

logConfiguration.WriteTo.Console();

A full 14.8% of the CPU in my application is used by the BlockingCollection in BackgroundWorkerSink. If I remove the Async sink, I don't see serilog anywhere significantly in my profiler trace.

image

Makes the Async sink unusable performance-wise outside extreme cases for my applications. This trace is for when I set the minimum log level to warning, so that I'm only logging a few lines here and there.

nblumhardt commented 5 months ago

Follow-up on the post above, CPU isn't actually being consumed: https://github.com/serilog/serilog-sinks-async/issues/63#issuecomment-1902746705