dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.24k stars 4.73k forks source link

IOptionsSnapshot is very slow #53793

Open NinoFloris opened 3 years ago

NinoFloris commented 3 years ago

This came up after a perf investigation on a recently refactored endpoint. Moving us back from IOptionsSnapshot to IOptionsMonitor knocked off ~100us, nothing else changed.

This option type of ours is fairly costly to create and looking into OptionsManager<TOptions> showed an obvious clue why we saw issues. OptionsManager does not just cache the instance for the duration of the DI scope but it also creates from scratch an instance through the factory per scope.

Looking through the docs this is vaguely alluded to (though there are contra-indicators) but after doing a search on SO and blogs on the subject most users seem to focus entirely on the 'you get a constant value during a scope/request' aspect of it.

As nobody (famous last words) seems to care whether it is entirely recreated my question would be: Why doesn't OptionsManager<TOptions> cache the result of an IOptionsMonitor.Get call? (leaving out the backwards compatibility part for the sake of argument)

ghost commented 3 years ago

Tagging subscribers to this area: @maryamariyan See info in area-owners.md if you want to be subscribed.

Issue Details
This came up after a perf investigation on a recently refactored endpoint. Moving us back from IOptionsSnapshot to IOptionsMonitor knocked off ~100us, nothing else changed. This option type of ours is fairly costly to create, and looking into `OptionsManager` showed an obvious clue why we saw issues. `OptionsManager` does not just cache the instance for the duration of the DI scope but it also *creates* from scratch an instance through the factory *per request*. Looking through the docs this is vaguely alluded to (though there are contra-indicators) but after doing a search on SO and blogs on the subject most users seem to focus entirely on the 'you get a constant value during a scope/request' aspect of it. As nobody (famous last words) seems to care whether it is entirely recreated my question would be: Why doesn't `OptionsManager` cache the result of an `IOptionsMonitor.Get` call? (leaving out the backwards compatibility part for the sake of argument)
Author: NinoFloris
Assignees: -
Labels: `area-Extensions-Options`, `tenet-performance`, `untriaged`
Milestone: -
davidfowl commented 3 years ago

Yes this is a known issue https://github.com/dotnet/runtime/issues/42222. IOptionsSnapshot is currently a performance trap.

As nobody (famous last words) seems to care whether it is entirely recreated my question would be: Why doesn't OptionsManager cache the result of an IOptionsMonitor.Get call? (leaving out the backwards compatibility part for the sake of argument)

Agreed that this implementation would have been more efficient. My only guess as to why it was done this way was because we had unified IOptions and IOptionsSnapsot implementations via the OptionsManager (though that's now changed in .NET 6).

The sad truth is that it would have been fixed if our own components were using it.

I'd be willing to take this change and see if there was any real impact not re-running factories per request. There are some very subtle differences but maybe worth the break.

cc @HaoK to see if he remembers why it was done this way.

HaoK commented 3 years ago

The original idea was: IOptions is singleton and cached forever IOptionsSnapshot is supposed to be scoped and recomputed per request (which is fine for some things, that don't involve config) (mostly exists for backcompat) IOptionsMonitor is supposed to combine the ability to have options updates with caching semantics, change notifications and cache eviction

HaoK commented 3 years ago

If we are talking specifically about IOptionsSnapshot, that was intended to be really the no-caching you get a really new fresh instance on every scope, so it literally was intended to recompute everything on every request

davidfowl commented 3 years ago

Right. We should consider changing the implementation so that it caches the currently cached value so scoped services are a consistent view of the data but there's no need to recompute if nothing changes

NinoFloris commented 3 years ago

@davidfowl I see it's been placed in 7.0 is a PR for 6.0 still welcome?

davidfowl commented 3 years ago

Yes, if you send a PR, it can be done without adding public API (I believe).

NinoFloris commented 3 years ago

I agree, alright will be cooking it up.

NinoFloris commented 3 years ago

@davidfowl started work on it just now, practically the change is extremely simple if OptionsManager can take IOptionsMonitor instead of IOptionsFactory. This does mean an api compat error for the constructor, what's the policy on those? Alternatively I could duplicate the OptionsMonitor code into OptionsManager as that's as good as it gets with access to just IOptionsFactory.

https://github.com/dotnet/runtime/blob/57bfe474518ab5b7cfe6bf7424a79ce3af9d6657/src/libraries/Microsoft.Extensions.Options/src/OptionsManager.cs#L17-L27

Ideally I'd also inline that silly cache (which is only used in OptionsManager) but it's a public type so I guess that's a no ;)

davidfowl commented 3 years ago

Or just make a new internal type so we don't need to change the public surface here. Call it OptionsSnapshot\<T>

pentp commented 3 years ago

This should be re-opened since it was reverted in #57570?

eerhardt commented 3 years ago

Yes - good call, I had forgotten.

davidfowl commented 3 years ago

So the performance issue is by design but the thing that can be cached here is the parsing of configuration. We should be able to only read that once until it changes, VS reading it once per request. The scoped IConfigureOptions<T> need to run per call, that's by design and can use arbitrary scoped services to change the options value.

NinoFloris commented 3 years ago

Without doing some hacky diffing how would you propose to request just the IConfigureOptions that have a scoped registration?

NinoFloris commented 3 years ago

@davidfowl that was an honest question, I would like to fix it up but suggestions are welcome.

dmitriyse commented 3 years ago

IMHO the root problem of the interface IOptionsSnapshot<T> that it's suitable for multiple scenarios: 1) It allows to fix configuration state at the beginning of request and be consistent among scoped services. 2) It allows to write clean code, if it enough for us to get options only time per lifetime scope and don't track changes during the scope lifetime, we can prefer IOptionsShapshot<T> over IOptionsMonitor<T> to express it. 3) It allows to recalculate options always in each scope. Only this scenario creates the performance problem as we unable to cache the value between scopes.

Documentation states that IOptionsSnapshot<T> at once suitable for all scenarios, We can keep all this without changes.

if developer know that each subsequent recalculation will give the same result (in between of configuration change events), then he can enable caching and it will be an optimization.

We can at the time of services registration explicitly allow caching of the IOptionsSnapshot<T> among scopes.

Implementation idea: The registration logic could be extended.

        public static IServiceCollection Configure<TOptions>(this IServiceCollection services, string name, IConfiguration config, Action<BinderOptions> configureBinder, 
           bool allowCaching // New argument
)
            where TOptions : class
        {
            if (services == null)
            {
                throw new ArgumentNullException(nameof(services));
            }

            if (config == null)
            {
                throw new ArgumentNullException(nameof(config));
            }

            services.AddOptions()
            services.AddSingleton<IOptionsChangeTokenSource<TOptions>>(new ConfigurationChangeTokenSource<TOptions>(name, config));

            if (allowCaching)
            {
                 // Hope DI supports overriding of open generics by specific registrations.
                 services.AddScoped<IOptionsSnapshot<T>, OptionsSnapshotCached<T>>();
            }
            return services.AddSingleton<IConfigureOptions<TOptions>>(new NamedConfigureFromConfigurationOptions<TOptions>(name, config, configureBinder));
        }

    // Just an idea, final code should be well implemented with various optimizations
    public class OptionsSnapshotCached<T>: IOptionsSnapshot<T>
    {
        private readonly ConcurrentDictionary<string, T> _scopeCache = new (); 
        private readonly IOptionsMonitor<T> _optionsMonitor;
        public OptionsSnapshotCached(IOptionsMonitor<T> optionsMonitor)
        {
             _optionsMonitor = optionsMonitor;
        }
        public T Value => _scopeCache.GetOrAdd(string.Empty, _ =>_optionsMonitor.Value);
        public T Get(string name) => _scopeCache.GetOrAdd(name, key => _optionsMonitor.Get(key));
    }
tompazourek commented 3 years ago

So the performance issue is by design but the thing that can be cached here is the parsing of configuration. We should be able to only read that once until it changes, VS reading it once per request.

However, there's a possible issue. Parsing of the configuration applies the configuration values to an existing TOptions instance. And it's possible that the existing TOptions instance is different each time depending on which scoped IConfigureOptions services were executed before the configuration-binding IConfigureOptions. I'm not sure which part can actually be cached given how the configuration binding currently works. Maybe if the entire configuration binding would be rewritten (something like #36130) so it's blazing fast and has some sort of caching inside of it, then this whole issue with IOptionsSnapshot being slow would disappear. But as the configuration binding is currently written, I'm not sure what's there to cache...

The scoped IConfigureOptions need to run per call, that's by design and can use arbitrary scoped services to change the options value.

To be honest, I don't like that IOptionsSnapshot<T> has this hidden by-design difference from IOptionsMonitor<T> very much. I think it would be much simpler to understand if it was just a more limited version of IOptionsMonitor<T>. Just something that you use if you don't need the full monitoring features, not something that sometimes applies changes that even the supposedly 'more powerful" IOptionsMonitor<T> doesn't detect. But on the other hand, I cannot tell if you would be willing to accept a breaking change like that (based on the reverted PR, probably not?).

If it's really by design, it should at least be documented (maybe we can contribute with a documentation update PR?) as this behavior is not obvious.

deeprobin commented 2 years ago

Have you profiled where the performance bottle-neck is?

deeprobin commented 2 years ago

Some micro-optimization/workaround would be not using anonymous methods, because they are currently not optimized by jit [^1]. https://github.com/dotnet/runtime/blob/57bfe474518ab5b7cfe6bf7424a79ce3af9d6657/src/libraries/Microsoft.Extensions.Options/src/OptionsManager.cs#L46

[^1]: Also some case not using delegates: https://github.com/dotnet/runtime/issues/61086

deeprobin commented 2 years ago

OptionsManager.Get seems fine...

image

The bottleneck is the ConcurrentDictionary

tompazourek commented 2 years ago

@deeprobin I think the bottleneck is this: #33954. And if we use IOptionsSnapshot, it runs on every request. If we use IOptionsMonitor, it runs once, and then it's cached until the configuration is changed (and reloaded). I encourage you to try this with IConfiguration that contains several thousand key-value pairs (and nesting, etc.)

davidfowl commented 2 years ago

@tompazourek is right. The problem is the rebinding per request.

Blackclaws commented 2 years ago

There is also a difference in behavior if your code doesn't properly track what causes configuration to change.

You can write something like this:

class TestOptionsConfigure : IConfigureOptions<TestOptions>
{
    private IConfiguration _root;
    private ILogger<TestOptionsConfigure> _logger;
    public TestOptionsConfigure(IConfiguration root, ILogger<TestOptionsConfigure> logger)
    {
        _root = root;
        _logger = logger;
        _logger.LogError("Constructed");
    }

    public void Configure(TestOptions options)
    {
        var sec = _root.GetSection("Test");
        if (sec != null)
        {
            options.Val = sec["Value"];
        }
        _logger.LogError("Configured");
    }
}

If you add this to the service collection, it will get called each time for IOptionsSnapshot but only once for IOptionsMonitor, even if you change the configuration underneath.

IOptionsMonitor only gets notified of changes if you actually bind configuration on adding options, not if you just use IConfigureOptions classes.

profet23 commented 1 year ago

Has there been any movement on this?

This is a significant pain point.

My naïve take is that whatever is building the IOptionSnapshots should just use IOptionsMonitor internally and only re-bind when a change event occurs. So at least then MOST requests wouldn't have to rebind.

dmitriyse commented 1 year ago

@profet23, IOptionsSnapshot also is used for the scenario when each request (scope) can have different version (state) of the options. That why it was decided to be always re-populated from IConfiguration on each request. So the mapping logic can respect the context of the request (scope).

dmitriyse commented 1 year ago

Yes it would be great to have the official support of both: per-context state + caching (probably with custom key calculated from the request). And yest the cache should be invalidated once the configuration changes.

davidfowl commented 1 year ago

IOptionsSnapshot can't use IOptionsMonitor. In fact, we tried to do that in .NET 7 and it is a major breaking change. IOptionsSnapshot supports injecting scoped IConfigureOptions<T>. That's the problem.

The combination of IOptionsSnapshot with uncached configuration binding is a performance pit of failure worth investigating. Specifically, the problem is that calling:

davidfowl commented 1 year ago

To do anything useful here we'd need something like the API I proposed in https://github.com/dotnet/runtime/issues/36130#issuecomment-717910325. This is essentially what the new configuration source generator does today so it's possible that is a reasonable mitigation for this.

tompazourek commented 1 year ago

To do anything useful here we'd need something like the API I proposed in #36130 (comment). This is essentially what the new configuration source generator does today so it's possible that is a reasonable mitigation for this.

This is a similar solution that I implemented as a workaround for issue in our code. I ended up writing an extension to provide a custom binder, and wrote some binders by hand that were written for the exact types used. It performed very well.

A source generator that will generate similar code as the one I wrote by hand sounds like a very good solution to me.

profet23 commented 1 year ago

@davidfowl I understand that this isn't a one size all solution and that this does not behave 1:1 with the current implementation of IOptionsSnapshot, but I am curious how much would break with this:

    public static IServiceCollection AddFastOptions(this IServiceCollection services)
    {
        var descriptor = services.Single(descriptor => descriptor.ServiceType == typeof(IOptionsSnapshot<>));
        services.Remove(descriptor);

        services.AddScoped(typeof(IOptionsSnapshot<>), typeof(FastOptionsSnapshot<>));

        return services;
    }

    public class FastOptionsSnapshot<TOptions> : IOptionsSnapshot<TOptions> where TOptions : class
    {
        private readonly IOptionsMonitor<TOptions> monitor;
        private readonly ConcurrentDictionary<string, TOptions> namedValuesDictionary = new ConcurrentDictionary<string, TOptions>();

        public FastOptionsSnapshot(IOptionsMonitor<TOptions> monitor)
        {
            this.monitor = monitor;
            this.Value = monitor.CurrentValue;
        }

        public TOptions Value { get; }

            public TOptions Get(string name)
            {
                name ??= Options.DefaultName;

                namedValuesDictionary.TryAdd(name, monitor.Get(name));

                return namedValuesDictionary[name];
            }
    }

Again, understanding that this maintains the same TOptions for each scope lifetime. But does not maintain that the entire configuration is the same for each scope.

That said, this is a significant performance boost. Between 200 and 600ms per request (for my use case).

EDIT: Updated the non relevant Dictionary snippet because people keep talking about it, and not the actual issue.

deeprobin commented 1 year ago

@profet23 Your FastOptionsSnapshot uses Dictionary instead of ConcurrentDictionary. However, you are locking the dictionary and therefore I don't think there should be a problem from a concurrency point of view.

I just wonder if the tide turns with the total number of options. This should be analyzed.

davidfowl commented 1 year ago

https://github.com/dotnet/runtime/pull/56271

KalleOlaviNiemitalo commented 1 year ago

@profet23, your code lets a thread read the dictionary without locking, while another thread is writing. That combination is not safe for System.Collections.Generic.Dictionary\<TKey, TValue>, although it is safe for System.Collections.Hashtable.

profet23 commented 1 year ago

56271

@davidfowl Are there any linkable examples of the following?

I think we're going to have to revert this PR. It breaks scoped services contributing to the value of options snapshots (I now see the tests were changed here to account for those changes).

davidfowl commented 1 year ago

This deleted line needs to continue working.

profet23 commented 1 year ago

This isn't a fix, but more of a workaround for the current "performance pit of failure". This implementation is aware that using IOptionsMonitor is not possible for scoped IConfiguredOptions.

public static IServiceCollection AddFastOptions(this IServiceCollection services)
{
    var descriptor = services.Single(descriptor => descriptor.ServiceType == typeof(IOptionsSnapshot<>));
    services.Remove(descriptor); 
    services.Add(new ServiceDescriptor(descriptor.ImplementationType!, descriptor.ImplementationType!, descriptor.Lifetime));

    services.AddScoped(typeof(IOptionsSnapshot<>), typeof(FastOptionsSnapshot<>));

    return services;
}

public class FastOptionsSnapshot<TOptions> : IOptionsSnapshot<TOptions> where TOptions : class
{
    private readonly IServiceProvider serviceProvider;
    private readonly IOptionsMonitor<TOptions>? monitor;
    private readonly ConcurrentDictionary<string, TOptions> namedValuesDictionary = new ConcurrentDictionary<string, TOptions>();

    public FastOptionsSnapshot(IServiceProvider serviceProvider)
    {
        this.serviceProvider = serviceProvider;

        try
        {
            monitor = serviceProvider.GetService(typeof(IOptionsMonitor<TOptions>)) as IOptionsMonitor<TOptions>;
        }
        catch (InvalidOperationException)
        {
            // Swallow the exception and continue without the monitor.
            // This means that the type contains at least one scoped option and we'll need to fall back to OptionsManager (slow) later.
        }
    }

    public TOptions Value => Get(null);

    public TOptions Get(string? name)
    {
        name ??= Options.DefaultName;

        var value =  monitor?.Get(name) ?? ((OptionsManager<TOptions>) serviceProvider.GetRequiredService(typeof(OptionsManager<TOptions>))).Get(name);

        namedValuesDictionary.TryAdd(name, value);

        return namedValuesDictionary[name];
    }
}

Idea being, use the IOptionsMonitor value when possible, but fallback to the existing OptionsManager (slow) when a scoped option exists.

Again, this is a bit hacky, but the performance of the default OptionsManager is such that I think it is worth the added complexity.

davidfowl commented 1 year ago

Yes it’s a workaround that we can’t ship but you can use in your apps if they don’t hit this scenario

profet23 commented 1 year ago

Yes it’s a workaround that we can’t ship but you can use in your apps if they don’t hit this scenario

Understood.

But if others find it useful, I've published my workaround:

https://github.com/sermo/FastOptions

deeprobin commented 1 year ago

Is there anything we can optimize in ConcurrentDictionary?

The code looks quite complex. But probably complex enough to support access from multiple threads.

davidfowl commented 1 year ago

@deeprobin what is it that you’re attempting to fix?

renrutsirhc commented 6 months ago

@NinoFloris @davidfowl How slow is very slow? I'm currently troubleshooting a performance issue in an application that seems to point to IOptionsSnapshot being passed into the InvokeAsync method in a custom middleware component. We are seeing Waiting times of 30 seconds or more for Framework/Library Lazy`1.CreateValue in application insights profiler on some requests while others sail through just fine. Could this performance issue be attributed to this issue with IOptionsSnapshot or is this orders of magnitude slower than what has been discussed here previously?

davidfowl commented 5 months ago

This came up again today and to move this forward without breaking changes is to Iintroduce IOptionsMonitorSnapshot\<T> that was what the original fix by @NinoFloris had. This new type would be scoped and would read the current value from IOptionsMonitor<T>. This solves the performance issues with IOptionsSnapshot because it doesn't do anything but cache the last value (which itself is cached and only changes when config changes).

The biggest performance challenge is the mixing IOptionSnapshot<T> with any expensive operation in IConfigureOptions<T>. The most common problem is running configuration binding per scope (usually per request in asp.net core). We want to cache the configuration that was bound until it changes, applying those cached values to the object. This is a difficult to do efficiently and would require codegen to make it fast (basically generating left hand right handcode to copy values):

class CachedConfigureFromConfiguration<T>(IConfiguration configuration) : IConfigureOptions<T> where T : class
{
    private T? _cached;

    public void Configure(T options)
    {
        // Read the cached value
        var cache = _cached;

        if (cache is null)
        {
            // If it's null, bind the configuration to a new instance and cache it
            _cached = Activator.CreateInstance<T>();

            ConfigurationBinder.Bind(configuration, _cached);

            cache = _cached;

            // Register a change callback to clear the cache
            configuration.GetReloadToken().RegisterChangeCallback(static t =>
            {
                // Clear the cache when the configuration changes
                ((CachedConfigureFromConfiguration<T>)t!)._cached = null;
            }, 
            this);
        }

        ApplyFromCached(cache, options);
    }

    private void ApplyFromCached(T cache, T options)
    {
        // Copy the cached options to the provided options instance
        // this would need to match the rules of what configuration binding would have done
    }
}
WAcry commented 2 months ago

Performance Comparison Between IOptionsSnapshot and IOptionsMonitor with Deep Cloning

We've obviously had a long discussion here. If I understand correctly, ensuring that each scope's configurations remain independent—allowing modifications in different scopes—makes it challenging to have caching. However, I want to raise something new here.

Issue:

In our distributed .NET application, using IOptionsSnapshot to handle complex configurations for each user (for A/B tests, so each request has a isolated settings and we overwrite them with different experiment config overwrites) has led to performance bottlenecks. Profiling revealed that configuration binding was consuming approximately 16% of the total CPU workload.

Key Observation:

Despite our assumption that IOptionsSnapshot's performance should be comparable to deserializing the configuration from a string for each request, this was not the case in fact. Simplified benchmark test results showed a 10x gap in cpu and memory allocations (see code at the bottom), and in our production environment with more complex configurations, the gap widened to over 100x.

Specifically, we benchmarked IOptionsSnapshot against a custom solution that used IOptionsMonitor and deep cloning of the configuration for each request (using serialization and deserialization, which is clearly not optimal and only for testing purposes):

Method Mean Error StdDev Gen0 Allocated
UseOptionsSnapshot 223.15 us 4.399 us 4.115 us 10.0098 123.78 KB
CloneOptionsMonitor 23.23 us 0.453 us 0.484 us 0.9155 11.49 KB

Solution:

We switched to using IOptionsMonitor, and performed deep cloning with a third-party library (DeepCloner). This reduced real CPU workload by 16% and memory allocation by 12%. The configuration-related CPU and memory allocation overheads were reduced by hundreds of times. The one line code change was as follows:

services.AddScoped<IServiceContext>(sp => new ServiceContext()
{
    Logger = sp.GetRequiredService<IRequestLevelLogger>(),
-   Settings = sp.GetRequiredService<IOptionsSnapshot<Settings>>().Value,
+   Settings = sp.GetRequiredService<IOptionsMonitor<Settings>>()?.CurrentValue.DeepClone(),
});

I believe these two lines are actually behave the same way, cloning and create a isolated scope level configuration which can be changed and will only have impacts inside the scope. I understand the runtime lacks a generic deep clone implementation, but isn't the performance gap of IOptionsSnapshot so extreme? I believe there must be some room for improvement. If it's impossible, maybe we should just never use IOptionsSnapshot in any cases. The configuration builder bind methods are too expensive. Any thoughts on this?

Benchmark code:

using System.Text.Json;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;

namespace ConfigBenchmarks;

public class ComplexConfig
{
    public string Name { get; set; }
    public int Value { get; set; }
    public List<SubConfig> SubConfigs { get; set; }
    public Dictionary<string, NestedConfig> NestedConfigs { get; set; }
    public List<ExtraNestedConfig> ExtraNestedConfigs { get; set; }

    public ComplexConfig? Clone()
    {
        var json = JsonSerializer.Serialize(this);
        return JsonSerializer.Deserialize<ComplexConfig>(json);
    }
}

public class SubConfig
{
    public string Key { get; set; }
    public double Amount { get; set; }

    public SubConfig? Clone()
    {
        var json = JsonSerializer.Serialize(this);
        return JsonSerializer.Deserialize<SubConfig>(json);
    }
}

public class NestedConfig
{
    public Guid Id { get; set; }
    public DateTime Timestamp { get; set; }
    public List<string> Tags { get; set; }
    public Dictionary<string, int> AdditionalData { get; set; }

    public NestedConfig? Clone()
    {
        var json = JsonSerializer.Serialize(this);
        return JsonSerializer.Deserialize<NestedConfig>(json);
    }
}

public class ExtraNestedConfig
{
    public string Category { get; set; }
    public List<NestedConfig> NestedItems { get; set; }
    public SubConfig DefaultSubConfig { get; set; }

    public ExtraNestedConfig? Clone()
    {
        var json = JsonSerializer.Serialize(this);
        return JsonSerializer.Deserialize<ExtraNestedConfig>(json);
    }
}

[MemoryDiagnoser]
public class ConfigPerformanceBenchmark
{
    private IServiceProvider _serviceProvider;
    private IConfiguration _configuration;
    private IOptionsMonitor<ComplexConfig> _optionsMonitor;

    [GlobalSetup]
    public void Setup()
    {
        var configDictionary = new Dictionary<string, string>
        {
            ["Name"] = "TestConfig",
            ["Value"] = "42",
            ["SubConfigs:0:Key"] = "SubKey1",
            ["SubConfigs:0:Amount"] = "10.5",
            ["SubConfigs:1:Key"] = "SubKey2",
            ["SubConfigs:1:Amount"] = "20.7",
            ["NestedConfigs:Nested1:Id"] = Guid.NewGuid().ToString(),
            ["NestedConfigs:Nested1:Timestamp"] = DateTime.Now.ToString("o"),
            ["NestedConfigs:Nested1:Tags:0"] = "Tag1",
            ["NestedConfigs:Nested1:Tags:1"] = "Tag2",
            ["NestedConfigs:Nested2:Id"] = Guid.NewGuid().ToString(),
            ["NestedConfigs:Nested2:Timestamp"] = DateTime.Now.AddDays(1).ToString("o"),
            ["NestedConfigs:Nested2:Tags:0"] = "Tag3",
            ["NestedConfigs:Nested2:Tags:1"] = "Tag4",

            ["SubConfigs:2:Key"] = "SubKey3",
            ["SubConfigs:2:Amount"] = "30.9",
            ["SubConfigs:2:Key"] = "SubKey3",
            ["SubConfigs:2:Amount"] = "30.9",
            ["SubConfigs:3:Key"] = "SubKey4",
            ["SubConfigs:3:Amount"] = "40.1",

            ["NestedConfigs:Nested3:Id"] = Guid.NewGuid().ToString(),
            ["NestedConfigs:Nested3:Timestamp"] = DateTime.Now.AddDays(2).ToString("o"),
            ["NestedConfigs:Nested3:Tags:0"] = "Tag5",
            ["NestedConfigs:Nested3:Tags:1"] = "Tag6",
            ["NestedConfigs:Nested3:AdditionalData:Key1"] = "100",
            ["NestedConfigs:Nested3:AdditionalData:Key2"] = "200",
            ["NestedConfigs:Nested4:Id"] = Guid.NewGuid().ToString(),
            ["NestedConfigs:Nested4:Timestamp"] = DateTime.Now.AddDays(3).ToString("o"),
            ["NestedConfigs:Nested4:Tags:0"] = "Tag7",
            ["NestedConfigs:Nested4:Tags:1"] = "Tag8",
            ["NestedConfigs:Nested4:AdditionalData:Key1"] = "300",
            ["NestedConfigs:Nested4:AdditionalData:Key2"] = "400",

            ["ExtraNestedConfigs:0:Category"] = "CategoryA",
            ["ExtraNestedConfigs:0:NestedItems:0:Id"] = Guid.NewGuid().ToString(),
            ["ExtraNestedConfigs:0:NestedItems:0:Timestamp"] = DateTime.Now.ToString("o"),
            ["ExtraNestedConfigs:0:NestedItems:0:Tags:0"] = "Tag1",
            ["ExtraNestedConfigs:0:NestedItems:0:Tags:1"] = "Tag2",
            ["ExtraNestedConfigs:0:NestedItems:0:AdditionalData:Key1"] = "100",
            ["ExtraNestedConfigs:0:NestedItems:0:AdditionalData:Key2"] = "200",
            ["ExtraNestedConfigs:0:DefaultSubConfig:Key"] = "DefaultKey1",
            ["ExtraNestedConfigs:0:DefaultSubConfig:Amount"] = "10.5",

            ["ExtraNestedConfigs:1:Category"] = "CategoryB",
            ["ExtraNestedConfigs:1:NestedItems:0:Id"] = Guid.NewGuid().ToString(),
            ["ExtraNestedConfigs:1:NestedItems:0:Timestamp"] = DateTime.Now.AddDays(1).ToString("o"),
            ["ExtraNestedConfigs:1:NestedItems:0:Tags:0"] = "Tag3",
            ["ExtraNestedConfigs:1:NestedItems:0:Tags:1"] = "Tag4",
            ["ExtraNestedConfigs:1:NestedItems:0:AdditionalData:Key1"] = "300",
            ["ExtraNestedConfigs:1:NestedItems:0:AdditionalData:Key2"] = "400",
            ["ExtraNestedConfigs:1:DefaultSubConfig:Key"] = "DefaultKey2",
            ["ExtraNestedConfigs:1:DefaultSubConfig:Amount"] = "20.7",

            ["ExtraNestedConfigs:2:Category"] = "CategoryC",
            ["ExtraNestedConfigs:2:NestedItems:0:Id"] = Guid.NewGuid().ToString(),
            ["ExtraNestedConfigs:2:NestedItems:0:Timestamp"] = DateTime.Now.AddDays(2).ToString("o"),
            ["ExtraNestedConfigs:2:NestedItems:0:Tags:0"] = "Tag5",
            ["ExtraNestedConfigs:2:NestedItems:0:Tags:1"] = "Tag6",
            ["ExtraNestedConfigs:2:NestedItems:0:AdditionalData:Key1"] = "500",
            ["ExtraNestedConfigs:2:NestedItems:0:AdditionalData:Key2"] = "600",
            ["ExtraNestedConfigs:2:DefaultSubConfig:Key"] = "DefaultKey3",
            ["ExtraNestedConfigs:2:DefaultSubConfig:Amount"] = "30.9",

            ["ExtraNestedConfigs:3:Category"] = "CategoryD",
            ["ExtraNestedConfigs:3:NestedItems:0:Id"] = Guid.NewGuid().ToString(),
            ["ExtraNestedConfigs:3:NestedItems:0:Timestamp"] = DateTime.Now.AddDays(3).ToString("o"),
            ["ExtraNestedConfigs:3:NestedItems:0:Tags:0"] = "Tag7",
            ["ExtraNestedConfigs:3:NestedItems:0:Tags:1"] = "Tag8",
            ["ExtraNestedConfigs:3:NestedItems:0:AdditionalData:Key1"] = "700",
            ["ExtraNestedConfigs:3:NestedItems:0:AdditionalData:Key2"] = "800",
            ["ExtraNestedConfigs:3:DefaultSubConfig:Key"] = "DefaultKey4",
            ["ExtraNestedConfigs:3:DefaultSubConfig:Amount"] = "40.1"
        };

        _configuration = new ConfigurationBuilder()
            .AddInMemoryCollection(configDictionary)
            .Build();

        var services = new ServiceCollection();
        services.AddOptions();
        services.Configure<ComplexConfig>(config => { _configuration.Bind(config); });
        _serviceProvider = services.BuildServiceProvider();
        _optionsMonitor = _serviceProvider.GetRequiredService<IOptionsMonitor<ComplexConfig>>();
    }

    [Benchmark]
    public int UseOptionsSnapshot()
    {
        using var scope = _serviceProvider.CreateScope();
        var snapshot = scope.ServiceProvider.GetRequiredService<IOptionsSnapshot<ComplexConfig>>();
        return snapshot.Value.GetHashCode();
    }

    [Benchmark]
    public int CloneOptionsMonitor()
    {
        using var scope = _serviceProvider.CreateScope();
        return _optionsMonitor.CurrentValue.Clone()!.GetHashCode() + scope.GetHashCode();
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        var summary = BenchmarkRunner.Run<ConfigPerformanceBenchmark>();
    }
}
eerhardt commented 2 months ago

Profiling revealed that configuration binding was consuming approximately 16% of the total CPU workload.

I'm curious if you tried using the configuration binder source generator? What impact would it have on your scenario? It should make configuration binding much faster since it doesn't use reflection, and instead sets the object's properties directly via generated code.

WAcry commented 2 months ago

Profiling revealed that configuration binding was consuming approximately 16% of the total CPU workload.

I'm curious if you tried using the configuration binder source generator? What impact would it have on your scenario? It should make configuration binding much faster since it doesn't use reflection, and instead sets the object's properties directly via generated code.

The ConfigurationBinderSourceGenerator seems really cool! Unfortunately, this service is still using .NET 6 due to some issues that arose during our recent upgrade to .NET 8, so I'm unable to test the actual difference in a production environment. But I can see from the benchmark above that it consumes only half the CPU compared to the previous approach. While it's not yet the default in .NET 8, I think that code gen sounds one right direction for optimization.

Method Mean Error StdDev Gen0 Allocated
UseOptionsSnapshot 216.80 us 2.850 us 5.757 us 0.2441 121.05 KB
UseOptionsSnapshotWithCodeGen 112.15 us 1.426 us 3.131 us - 69.35 KB
CloneOptionsMonitor 22.83 us 0.334 us 0.279 us 0.0305 11.34 KB

However, when the configs become more complex [example], the performance difference consistently remains around 2x, regardless of the complexity (may not be true for different config structure.) As mentioned earlier, even cloning using JsonSerializer for serialization and deserialization provides dozens of times the performance improvement. As the configuration complexity increases, the performance gap widens. In other words, current code gen still cannot address the performance disparity, which can be hundreds of times greater.

julealgon commented 1 month ago

@WAcry

However, when the configs become more complex [example], the performance difference consistently remains around 2x, regardless of the complexity.

That example you used for the benchmark seems quite extreme in terms of complexity. Do you mind me asking what exactly are you modelling in the real application that requires such deep configuration structure with that many elements? I know this doesn't solve the slowdown or anything, just curious as I don't think I've ever needed such elaborate options object graph before. I wonder if your design is flawed here to be honest.

profet23 commented 1 month ago

@WAcry

However, when the configs become more complex [example], the performance difference consistently remains around 2x, regardless of the complexity.

That example you used for the benchmark seems quite extreme in terms of complexity. Do you mind me asking what exactly are you modelling in the real application that requires such deep configuration structure with that many elements? I know this doesn't solve the slowdown or anything, just curious as I don't think I've ever needed such elaborate options object graph before. I wonder if your design is flawed here to be honest.

I don't think it's out of the ordinary to have some pretty complicated configuration files. It's common to utilize several configuration sources. And Microsoft encourages usage of configuration for more than just IOptions.

For instance Microsoft feature management (feature flags) relies on it: https://learn.microsoft.com/en-us/azure/azure-app-configuration/feature-management-dotnet-reference

WAcry commented 1 month ago

In our production environment, the configuration isn’t as large as in the example above. The significant optimization in production (> 100x) is due to the absence of Config Code Gen by default, the fact that the service still uses .NET 6, and our use of DeepCloner instead of serialization and deserialization (which can be much faster in many common cases). Btw, the example above has thousands of lines, which may be rare, but it is not "deep" (only 3-4 levels).

I can provide a simpler example to show the issue (with enabled config code gen). Consider a configuration containing only int properties, without any custom types. Below are the benchmark results for configurations with 1,000 int properties and 100 int properties.

Method Mean Error StdDev Gen0 Allocated
SimplePropsConfig_UseOptionsSnapshotCodeGen1000 398,597.8 ns 7,931.24 ns 18,381.86 ns - 86480 B
SimplePropsConfig_CloneOptionsMonitorJson1000 231,247.1 ns 4,322.07 ns 4,042.87 ns 0.2441 71204 B
SimplePropsConfig_CloneOptionsMonitorDeepClone1000 555.4 ns 11.03 ns 22.77 ns 0.0172 4280 B
SimplePropsConfig_UseOptionsSnapshotCodeGen100 24,259.2 ns 474.20 ns 805.23 ns 0.0305 10872 B
SimplePropsConfig_CloneOptionsMonitorJson100 12,226.4 ns 62.98 ns 58.91 ns 0.0153 4576 B
SimplePropsConfig_CloneOptionsMonitorDeepClone100 179.6 ns 3.57 ns 4.25 ns 0.0026 680 B

In this case, the performance difference between configuration binding with code gen and json serialization isn't significant. However, json serialization is not well-suited for deep cloning. Using real deep cloning shows a performance improvement of hundreds of times. I think maybe we can use code gen to do the deep clone for us.