Closed SteveSandersonMS closed 2 years ago
Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis See info in area-owners.md if you want to be subscribed.
Author: SteveSandersonMS
Assignees: -
Labels: `area-System.Text.Json`
Milestone: -
Thanks @SteveSandersonMS, your analysis is accurate. At the time the cache was implemented, we debated whether a background timer should run evictions, but ultimately found that approach unsavory. In hindsight, I had not anticipated that user configuration in JsonSerializerOptions would realistically capture a large amount of memory, but I guess I should have known better :-)
That being said, it is almost certain that the application in question is creating single-use JsonSerializerOptions instances, a known anti-pattern that would have caused performance problems even in .NET 6. The author of the original issue could almost certainly address the problem by creating singleton options instances. We're planning on adding an analyzer in the future which should help users avoid that particular anti-pattern.
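To illustrate the anti-pattern being described, here is a minimal, hypothetical sketch (not code from the thread): constructing a fresh JsonSerializerOptions per call versus reusing one cached instance. The `JsonHelper` name and methods are illustrative only.

```csharp
using System.Text.Json;

public static class JsonHelper
{
    // Anti-pattern: a new options instance per call means the serializer's
    // per-options metadata has to be re-resolved, and (in .NET 7) a new
    // entry may be added to the global caching-context cache.
    public static string SerializeSlow<T>(T value) =>
        JsonSerializer.Serialize(value, new JsonSerializerOptions { WriteIndented = true });

    // Recommended: a single, effectively-singleton instance reused everywhere,
    // so metadata is resolved once and cached against this one instance.
    private static readonly JsonSerializerOptions s_options =
        new JsonSerializerOptions { WriteIndented = true };

    public static string SerializeFast<T>(T value) =>
        JsonSerializer.Serialize(value, s_options);
}
```

Both methods produce identical output; only the allocation and caching behavior differ.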
Also, if this caching behavior was introduced in 7.0, then it could be argued that it's a regression. (Subjective, I know.)
I think we could definitely improve the cache based on this feedback but I don't necessarily think this requires a .NET 7 fix. I think we should wait on more user feedback in case it arrives.
Is the actual problem the shortcut described in https://github.com/dotnet/runtime/blob/main/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializerOptions.Caching.cs#L165-L167? Should this code rather use ConditionalWeakTable to avoid unnecessarily keeping the keys around?
Original prototypes did use ConditionalWeakTable, effectively performing a linear search over all JsonSerializerOptions instances in the process:
But IIRC enumeration performance for the particular type was bad enough that it almost invalidated any performance benefits gained from caching.
That being said, it is almost certain that the application in question is creating single-use JsonSerializerOptions instances, a known anti-pattern that would have caused performance problems even in .NET 6 (https://github.com/dotnet/runtime/issues/40072). The author of the original issue could almost certainly address the problem by creating singleton options instances.
I think the original author was using an unmodified (except for the ability to manually trigger GC) Blazor template to demonstrate the problem.
That being said, it is almost certain that the application in question is creating single-use JsonSerializerOptions instances, a known anti-pattern that would have caused performance problems even in .NET 6. The author of the original issue could almost certainly address the problem by creating singleton options instances. We're planning on adding an analyzer in the future which should help users avoid that particular anti-pattern.
Hey @eiriktsarpalis Just in case you couldn't read the complete original issue (I'm the author) and just for clarification, the application in question which you are referring to is the Blazor framework itself, using the default Blazor Server template. So, if there are problems (like the anti-patterns you mentioned), they are in .NET 7 RC1 itself and its default templates, not user code.
Regards!
Regarding what @SteveSandersonMS mentioned about a timer, you can configure the retention period of circuits for Blazor, for example:
builder.Services.AddServerSideBlazor(options =>
{
    options.DisconnectedCircuitMaxRetained = 5;
    options.DisconnectedCircuitRetentionPeriod = TimeSpan.FromSeconds(10);
});
So maybe this should be tied somehow to the serializer cache clearing call; I guess it wouldn't make much sense to configure different amounts of time for each, since one depends on the other to be useful. I mean, the Blazor app could forcibly disconnect all the users and release their circuits, but if the instances are being held by the serializer cache, memory won't be released, which is the intention.
Regards!
Hi @chrdlx, that's interesting. If there are indeed keys in the cache that require eviction, that would suggest to me that Blazor is using temporary options instances, which should be avoided under most circumstances. I'm not too familiar with Blazor internals, however; is there anything in its configuration that could prompt single-use instances to be created? Perhaps some options instance only used for initialization purposes that then gets garbage collected? cc @SteveSandersonMS
Hi @eiriktsarpalis ! I guess Steve is the right one for this kind of stuff. I use Blazor to develop Apps but sadly I don't have that deep knowledge about its inner workings to give you a proper answer on how to solve this. Regards!
But IIRC enumeration performance for the ConditionalWeakTable was bad enough that it almost invalidated any performance benefits gained from caching.
For context, here are some benchmarks comparing the current caching strategy with one performing linear traversal on a ConditionalWeakTable:
public static CachingContext GetOrCreate(JsonSerializerOptions options)
{
    Debug.Assert(options.IsReadOnly, "Cannot create caching contexts for mutable JsonSerializerOptions instances");
    Debug.Assert(options._typeInfoResolver != null);

    OptionsEqualityComparer comparer = s_equalityComparer;

    foreach (KeyValuePair<JsonSerializerOptions, object?> entry in TrackedOptionsInstances.All)
    {
        JsonSerializerOptions otherOptions = entry.Key;
        if (otherOptions._cachingContext != null && comparer.Equals(options, otherOptions))
        {
            return otherOptions._cachingContext;
        }
    }

    return new CachingContext(options);
}
| Method | Job | Strategy | Mean | Error | StdDev | Median | Min | Max | Ratio | MannWhitney(3%) | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NewCustomizedOptions | Job-IEKUUC | main | 1.128 us | 0.0618 us | 0.0712 us | 1.117 us | 1.025 us | 1.315 us | 1.00 | Base | 0.00 | 0.0449 | 0.0045 | 0.0045 | 486 B | 1.00 |
| NewCustomizedOptions | Job-ODPMMK | ConditionalWeakTable | 39.336 us | 0.7793 us | 0.8974 us | 39.420 us | 37.909 us | 40.930 us | 35.01 | Slower | 2.35 | - | - | - | 544 B | 1.12 |
| NewCustomizedOptions | Job-TIQXHW | No Caching | 19.421 us | 0.7321 us | 0.8431 us | 19.189 us | 18.3138 us | 21.025 us | 18.06 | Slower | 1.22 | 0.7392 | 0.0821 | - | 7966 B | 16.39 |
that would suggest to me that Blazor is using temporary options instances, which should be avoided under most circumstances
@eiriktsarpalis Blazor has a JsonSerializerOptions instance per circuit (user connection), and has done for over 3 years, at least since this commit. It does this because the serialized representation of certain types (DotNetObjectReference, in this case) is a per-circuit ID, and the act of serializing it for a JS interop call is what causes the instance to become tracked within the circuit. So within the per-circuit options, there is a JsonConverterFactory that holds a reference to the circuit.
I don't think we were aware that, at a certain point, it became expensive to create JsonSerializerOptions instances (or was this always the case?).
If you have a recommendation for a different way that a JsonConverter can access contextual objects specific to a particular serialization process, please let us know.
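The per-circuit pattern described above can be sketched roughly as follows. This is a hedged illustration, not Blazor's actual code: `CircuitContext`, `ObjectReference`, and the converter names are hypothetical stand-ins for the circuit state and DotNetObjectReference machinery.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical per-connection state; stands in for a Blazor circuit.
public sealed class CircuitContext
{
    private int _nextId;
    public int Track(object target) => ++_nextId; // assign a per-context ID
}

// Hypothetical stand-in for DotNetObjectReference.
public sealed class ObjectReference
{
    public ObjectReference(object target) => Target = target;
    public object Target { get; }
}

public sealed class ObjectReferenceConverterFactory : JsonConverterFactory
{
    // This reference is the crux of the retention issue: the factory lives in
    // a per-circuit JsonSerializerOptions and roots the circuit state.
    private readonly CircuitContext _context;

    public ObjectReferenceConverterFactory(CircuitContext context) => _context = context;

    public override bool CanConvert(Type typeToConvert) =>
        typeToConvert == typeof(ObjectReference);

    public override JsonConverter CreateConverter(Type typeToConvert, JsonSerializerOptions options) =>
        new Converter(_context);

    private sealed class Converter : JsonConverter<ObjectReference>
    {
        private readonly CircuitContext _context;
        public Converter(CircuitContext context) => _context = context;

        public override ObjectReference Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) =>
            throw new NotSupportedException();

        // Serializing is what causes the instance to become tracked in the
        // context; the JSON payload is just the per-context ID.
        public override void Write(Utf8JsonWriter writer, ObjectReference value, JsonSerializerOptions options) =>
            writer.WriteNumberValue(_context.Track(value.Target));
    }
}
```

Because the factory must be registered on a JsonSerializerOptions, each circuit ends up needing its own options instance, which is what collides with the 7.0 options-keyed cache.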
I don't think we were aware that, at a certain point, it became expensive to create JsonSerializerOptions instances (or was this always the case?).
Stephen Toub's 7.0 performance behemoth had some stuff about this:
https://devblogs.microsoft.com/dotnet/performance_improvements_in_net_7/#json
It wouldn't have been intuitive to me that an "Options" class would be massively expensive.
Obviously that article contains the somewhat optimistic promise "... with appropriate removal when no longer used in order to avoid unbounded leaks" which overlooks the old saw about cache eviction being the hardest problem in computer science.
I don't think we were aware that, at a certain point, it became expensive to create JsonSerializerOptions instances (or was this always the case?).
It wouldn't have been intuitive to me that an "Options" class would be massively expensive.
It's been like this since the introduction of the type -- it's essentially the same issue as HttpClient, but more insidious since the name "options" doesn't suggest that the type defines its own reflection metadata caching context.
It's been like this since the introduction of the type
What do you recommend as a solution? It does seem like the 7.0 caching technique change has led to an issue for non-server scenarios. Is there (or should there be) some way to signal that a particular JsonSerializerOptions instance can be cleared up? It's not disposable. Should it become disposable?
To be honest, I don't think we'd see this as being urgent enough to fix for 7.0 within Blazor since we only have evidence that it affects one person, and they already have a workaround. What we could do in Blazor in 8.0 though is add some logic to our circuit disposal code that makes it null out the reference from the JsonConverterFactory to the circuit, so the circuits at least can be collected even if there are some JsonSerializerOptions/JsonConverterFactory instances left stuck in the cache until more calls happen.
Since this could affect other (non-Blazor) desktop applications that use STJ, it might be worth having a more general solution such as making JsonSerializerOptions disposable, or clearing the cache on a timer, or going back to ConditionalWeakTable. If you think any of those are a possibility please let us know so we could make use of that in 8.0.
I'm curious, what prompted requiring separate JsonSerializerOptions instances per connection? Presumably it's related to some form of dependency injection on custom converters? Would it be possible to have your converter instances remain fixed per-type application-wide and inject connection-specific state differently?
I know that System.Text.Json doesn't provide a good mechanism for scoping DI on a per-operation basis, here's a user story that attempts to cover this and related requirements: #63795.
I'm curious, what prompted requiring separate JsonSerializerOptions instances per connection?
This is what I mentioned above:
It does this because the serialized representation of certain types (DotNetObjectReference, in this case) is a per-circuit ID, and the act of serializing it for a JS interop call is what causes the instance to become tracked within the circuit. So within the per-circuit options, there is a JsonConverterFactory that holds a reference to the circuit.
Apologies if it's still unclear, but if you can point out which part is unclear, I'll try to explain more!
Would it be possible to have your converter instances remain fixed per-type application-wide and inject connection-specific state differently?
That would be great, but I don't know of a way 😄 It's what I was asking with "If you have a recommendation for a different way that a JsonConverter can access contextual objects specific to a particular serialization process, please let us know." #63795 sounds interesting, but async conversion isn't an aspect of it for us; we just have to be able to perform a JsonSerializer.Serialize call within some context that allows the converters to access objects within that context (which we achieve today by having a separate JsonConverterFactory for each context).
If you have a recommendation for a different way that a JsonConverter can access contextual objects specific to a particular seralization process, please let us know.
For the case of serialization, state could be encapsulated by the serialized value. I'm not aware of a good way to achieve the same effect in deserialization, unfortunately.
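The suggestion to encapsulate state in the serialized value could look something like the following hedged sketch (all names hypothetical, not from the thread): the payload carries its own context, so a stateless converter can be registered on one application-wide options instance.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical per-connection state.
public sealed class SessionState
{
    private int _nextId;
    public int Track() => ++_nextId;
}

// The value being serialized carries the context it needs.
public sealed class TrackedValue
{
    public TrackedValue(SessionState session) => Session = session;
    public SessionState Session { get; }
}

public sealed class TrackedValueConverter : JsonConverter<TrackedValue>
{
    // The converter holds no state, so one instance can be registered on a
    // single application-wide JsonSerializerOptions; context comes from the value.
    public override TrackedValue Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) =>
        throw new NotSupportedException("On deserialization there is no value to carry the context.");

    public override void Write(Utf8JsonWriter writer, TrackedValue value, JsonSerializerOptions options) =>
        writer.WriteNumberValue(value.Session.Track());
}
```

As the comment notes, this only helps serialization: during deserialization there is no incoming value to carry the context, which is exactly the gap being acknowledged.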
the somewhat optimistic promise "... with appropriate removal when no longer used in order to avoid unbounded leaks"
It does avoid unbounded leaks... unbounded being the key word.
Hi. This behaviour in Blazor Server is really something one would not expect. You can easily ramp up memory usage to several gigs, and god knows when it gets evicted :) For one, it's hard to reason about: is there a memory leak in the app? And what will happen with smaller apps, where you could have many running on one server, each easily consuming some gigs of RAM...
I guess this behaviour cannot be avoided in 7.0 release?
I guess this behaviour cannot be avoided in 7.0 release?
We might consider backporting https://github.com/dotnet/runtime/pull/76607 in servicing, assuming there is substantial impact in blazor apps.
First, it's totally possible I'm misinterpreting what STJ does here. We got this report saying that a Blazor Server app was holding hundreds of MB in memory for longer than expected, and using the VS memory inspection tools, it looks like at least one (possibly the only) thing pinning this data is rooted in JsonSerializerOptions's `s_cache`. The cache uses `WeakReference` for values, but the problem is with the keys. It looks like the only thing that clears dangling entries is subsequent calls into the JSON serializer. This is likely sufficient in server scenarios where the server will never stop processing requests for any long duration, and even if it does, people probably don't care about freeing any memory until the server experiences memory pressure. However, for desktop scenarios, an app may stop doing things for a long time, and during that period, people really want it to release memory that's no longer required.
Suggestion
Instead of triggering `TryEvictDanglingEntries` only when there's a call to `GetCachingContext`, consider also having some guarantee like "it will always clear dangling entries within 10s even if there are no such incoming calls" or similar. This isn't blocking anything in Blazor directly, but would be a good solution for people using STJ directly in desktop apps, or Blazor Server as an implementation detail of a desktop app (like this person).