dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.18k stars 4.72k forks source link

Lock contention in LoggerFactory.CreateLogger #83609

Closed davidfowl closed 1 year ago

davidfowl commented 1 year ago

Description

There's lock contention when trying to create a logger using LoggerFactory.CreateLogger. The global lock protects mutating filters and logger providers. This shows up under heavy load.

Will fill in more details later, filing this as a reminder.

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-extensions-logging See info in area-owners.md if you want to be subscribed.

Issue Details
### Description There's lock contention when trying to create a logger using LoggerFactory.CreateLogger. The global lock protects mutating filters and logger providers. This shows up under heavy load. Will fill in more details later, filing this as a reminder.
Author: davidfowl
Assignees: -
Labels: `tenet-performance`, `untriaged`, `area-Extensions-Logging`
Milestone: -
brantburnett commented 1 year ago

I was curious about this and did a bit of playing around (in advance of the details). ConcurrentDictionary doesn't seem to work because of the other reasons for locking the collection.

I tried ReaderWriterLockSlim since, in theory, the locks would be almost entirely reader locks. While I believe this eliminated a lot of the contention (untested) in the single-threaded read path it increased time by about 40%.

Method Job Mean Error StdDev Median Min Max Ratio RatioSD
CreateLogger Monitor 31.21 ns 0.784 ns 0.872 ns 31.47 ns 28.91 ns 32.17 ns 1.00 0.00
CreateLogger RWLockSlim 43.63 ns 1.171 ns 1.349 ns 43.35 ns 42.16 ns 46.33 ns 1.40 0.05

Then I tried this approach using a delayed build of a FrozenDictionary: https://github.com/dotnet/runtime/commit/05d5c63266f51e91c45ec1138dfbd3b82d4d8c75

The (unproven) theory here is:

Because it uses the (more expensive) path to build a read-optimized FrozenDictionary, this actually improves the speed of dictionary reads after the threshold is passed, even before considering the elimination of lock contention. If the assumptions above hold this would mean an amortized overall improvement in exchange for a bit more up-front computation building the optimized dictionary.

Method Job NumLoggers Mean Error StdDev Median Min Max Ratio RatioSD
CreateLogger Monitor 128 33.30 ns 0.841 ns 0.864 ns 33.54 ns 31.37 ns 34.63 ns 1.00 0.00
CreateLogger FrozenDict 128 17.82 ns 0.486 ns 0.559 ns 17.91 ns 16.70 ns 18.84 ns 0.54 0.02
CreateLogger Monitor 1024 33.14 ns 0.997 ns 1.148 ns 33.30 ns 30.79 ns 34.73 ns 1.00 0.00
CreateLogger FrozenDict 1024 19.02 ns 0.466 ns 0.499 ns 19.05 ns 18.22 ns 20.04 ns 0.58 0.03

I guess we'll see if this makes sense once the problem is explained fully.