Open eiriktsarpalis opened 2 years ago
| Author: | eiriktsarpalis |
|---|---|
| Assignees: | - |
| Labels: | `area-System.Text.Json`, `User Story`, `Priority:2`, `Cost:M`, `Team:Libraries` |
| Milestone: | 7.0.0 |
Tagging @steveharter who might be interested in this.
I'm updating this feature's milestone to Future as it is not likely to make it into .NET 7.
I've updated the issue description and examples to broaden the scope to stateful converters. A sketch of the API proposal and a basic example have been included. PTAL.
There are scenarios where a custom converter might want metadata info about the type (including its properties) or property it is processing, e.g. https://github.com/dotnet/runtime/issues/35240#issuecomment-617453603. I understand that with `JsonSerializerOptions.GetTypeInfo(Type)`, it is now possible to retrieve type metadata within converters, but should there be a first class mechanism, e.g. adding relevant type/property metadata to the state object passed to converters?
I believe I've also come across scenarios where a converter might want to know where in the object graph it is being invoked, i.e. the root object vs property values. Is that also state that should be passed?
Yes that's plausible. Effectively we should investigate what parts of ReadStack/WriteStack could/should be exposed to the user. I'm not sure if we could meaningfully expose what amounts to WriteStack.Current, since that only gets updated consistently when the internal converters are being called. Prototyping is certainly required to validate feasibility.
To be honest, with the proposed design, this feature would only be moderately useful, because it requires passing state explicitly to the Serialize/Deserialize methods. This doesn't help when you don't have control over these callsites, as is the case in ASP.NET Core JSON input/output formatters.
I realize what I'm talking about isn't really "state", at least not callsite-specific state, but it's what was requested in https://github.com/dotnet/runtime/issues/71718, which has been closed in favor of this issue... I don't think the proposed design addresses the requirements of https://github.com/dotnet/runtime/issues/71718.
Yes, this proposal specifically concerns state scoped to the operation rather than the options instance. The latter is achievable if you really need it whereas the former is outright impossible.
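For readers following along, the "achievable" options-scoped case can be sketched roughly as below: the converter receives its dependency through its constructor, so the state lives exactly as long as the `JsonSerializerOptions` instance that owns the converter. `ITenantFormatter`, `PrefixFormatter`, `AccountId`, `AccountIdConverter` and `OptionsScopedState` are hypothetical names invented for this illustration; only `JsonConverter<T>`, `JsonSerializerOptions` and `JsonSerializer` come from System.Text.Json.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical service that the converter depends on.
public interface ITenantFormatter
{
    string Format(string value);
}

public sealed class PrefixFormatter : ITenantFormatter
{
    private readonly string _prefix;
    public PrefixFormatter(string prefix) => _prefix = prefix;
    public string Format(string value) => _prefix + value;
}

// Hypothetical value type handled by the converter.
public sealed record AccountId(string Value);

// The dependency is captured at construction time, so it is scoped to the
// options instance that holds this converter, not to a single operation.
public sealed class AccountIdConverter : JsonConverter<AccountId>
{
    private readonly ITenantFormatter _formatter;
    public AccountIdConverter(ITenantFormatter formatter) => _formatter = formatter;

    public override AccountId Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
        => new AccountId(reader.GetString() ?? string.Empty);

    public override void Write(Utf8JsonWriter writer, AccountId value, JsonSerializerOptions options)
        => writer.WriteStringValue(_formatter.Format(value.Value));
}

public static class OptionsScopedState
{
    // One options instance per distinct piece of state; callers should cache
    // and reuse the result rather than rebuilding it for every call.
    public static JsonSerializerOptions CreateOptions(ITenantFormatter formatter)
    {
        var options = new JsonSerializerOptions();
        options.Converters.Add(new AccountIdConverter(formatter));
        return options;
    }
}
```

A call such as `JsonSerializer.Serialize(new AccountId("42"), OptionsScopedState.CreateOptions(new PrefixFormatter("eu-")))` would then apply the formatter that was bound to that options instance. Nothing in this shape lets a single call site vary the state without creating another options instance, which is the gap the proposal is about.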
@eiriktsarpalis I must admit that I'd not read this through and assumed it solved the suggestion from my original proposal. I agree that this solves a different problem (which I've not personally run into). It would not help at all with the scenario I have when building a library. Is there any reason not to re-open #71718 to complement this? I think both are valid scenarios that should be possible to achieve with less ceremony.
Agree, I've reopened the issue based on your feedback.
"We might want to consider the viability of attaching the state values as properties on Utf8JsonWriter and Utf8JsonReader" I see that when deserializing from a Stream, a new Utf8JsonReader instance is created on every iteration of a while loop, so that does not seem like a good fit.
Was just looking for something like this. I'd like to be able to extract some value from the object during serialization and make it available after the serialization is done. It seems like currently the only way to achieve this is to instantiate a new converter for each serialization, which is not great for perf.
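The workaround described in this comment might look roughly like the sketch below. `Order`, `CapturingOrderConverter` and `CaptureDuringSerialization` are hypothetical names made up for the illustration; the System.Text.Json calls are the existing public ones.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical model type used only for this illustration.
public sealed record Order(int Id, string Customer);

// Records a value seen during serialization so the caller can read it afterwards.
public sealed class CapturingOrderConverter : JsonConverter<Order>
{
    public int? CapturedId { get; private set; }

    public override Order Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
        => throw new NotSupportedException("This sketch only covers serialization.");

    public override void Write(Utf8JsonWriter writer, Order value, JsonSerializerOptions options)
    {
        CapturedId = value.Id; // per-operation state stored on the converter itself
        writer.WriteStartObject();
        writer.WriteNumber("id", value.Id);
        writer.WriteString("customer", value.Customer);
        writer.WriteEndObject();
    }
}

public static class CaptureDuringSerialization
{
    public static (string Json, int? CapturedId) Serialize(Order order)
    {
        // A fresh converter and a fresh options instance for every call keep the
        // captured value isolated, at the cost of rebuilding serializer metadata.
        var converter = new CapturingOrderConverter();
        var options = new JsonSerializerOptions();
        options.Converters.Add(converter);
        string json = JsonSerializer.Serialize(order, options);
        return (json, converter.CapturedId);
    }
}
```

Sharing one converter instance across calls would avoid the repeated setup, but then concurrent serializations would overwrite each other's `CapturedId`, which is why the comment describes the per-call instantiation as the only safe option today.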
Is this still planned to be implemented at some point?
Personally, I'm more interested in the performance benefits of the resuming bits of this proposal. While I realize they're somewhat related, I wonder if the lack of progress here could be partially related to the scope of including both resuming and user state in the same proposal. Should it be separated so the more valuable one (whichever that is) could be done independently, so long as the design has a path forward to the other?
"Should it be separated so the more valuable one (whichever that is) could be done independently, so long as the design has a path forward to the other?"
I think both are equally valuable. At the same time, we should be designing the serialization state types in a way that satisfies the requirements for both use cases.
What is the purpose of the `JsonReadState` and `JsonWriteState` parameters being passed in as ref?
The dictionary is already mutable so it's probably not that...
The types already exist as an internal implementation detail and hold additional state which wouldn't be visible to users.
Given that #64182 was closed in deference to this issue, I don't know why the solution needs to involve async converters when DeserializeAsyncEnumerable does not and yet provides significant memory savings over DeserializeAsync (whenever streaming deserialization is appropriate). It seems the ability to call DeserializeAsyncEnumerable at any arbitrary point of an incoming stream, and leaving the stream open at the appropriate byte location upon exit, would be immensely useful. While the solutions discussed above are optimal from a resource perspective, forcing all developers to fully support async converters seems like a bigger lift than necessary, simply to support non-root array deserialization. If an overload can be implemented to achieve this, then once async converters become available, developers can opt in to implement them as appropriate.
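For context, the existing API this comment refers to can be used as in the sketch below, which streams the elements of a root-level JSON array without materializing the whole collection. `StreamingRead` and `SumAsync` are hypothetical names for the illustration; `JsonSerializer.DeserializeAsyncEnumerable<T>` is the existing public method.

```csharp
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;

public static class StreamingRead
{
    // Sums the elements of a root-level JSON array, consuming them one at a
    // time as the stream is read instead of buffering the whole array.
    public static async Task<int> SumAsync(Stream utf8Json)
    {
        int sum = 0;
        await foreach (int item in JsonSerializer.DeserializeAsyncEnumerable<int>(utf8Json))
        {
            sum += item;
        }
        return sum;
    }
}
```

This only works when the array is the root JSON value; the overload requested in the comment, one that can start at an arbitrary position in the stream, is not shown here because it does not exist.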
Maybe it's not important, but issue #77018 (that tracks this one) is currently closed. I don't see it being tracked by an equivalent issue for .NET 9, but since it's days (hours?) away from release, it probably needs to be tracked by the equivalent for .NET 10.
Background and Motivation
The current model for authoring custom converters in System.Text.Json is general-purpose and powerful enough to address most serialization customization requirements. Where it falls short currently is in the ability to accept user-provided state scoped to the current serialization operation; this effectively blocks a few relatively common scenaria:
Custom converters requiring dependency injection scoped to the current serialization. A lack of a reliable state passing mechanism can prompt users to rebuild the converter cache every time a serialization operation is performed.
Custom converters do not currently support streaming serialization. Built-in converters avail of the internal "resumable converter" abstraction, a pattern which allows partial serialization and deserialization by marshalling the serialization state/stack into a state object that gets passed along converters. It lets converters suspend and resume serialization as soon as the need to flush the buffer or read more data arises. This pattern is implemented using the internal `JsonConverter<T>.TryWrite` and `JsonConverter<T>.TryRead` methods.
Since resumable converters are an internal implementation detail, custom converters cannot support resumable serialization. This can create performance problems in both serialization and deserialization:
We should consider exposing a variant of that abstraction to advanced custom converter authors.
Custom converters are not capable of passing internal serialization state, often resulting in functional bugs when custom converters are encountered in an object graph (cf. https://github.com/dotnet/runtime/issues/51715, #67403, https://github.com/dotnet/runtime/issues/77345)
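The suspend-and-resume idea behind resumable converters described above can be illustrated with a toy, self-contained sketch. It does not use or reproduce the internal System.Text.Json types: `ToyWriteState`, `ToyResumableWriter`, `TryWrite` and `WriteAll` are all hypothetical, and a `StringBuilder` with a character limit stands in for the output buffer.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical state object: remembers how far a suspended write got.
public struct ToyWriteState
{
    public int NextIndex;
}

public static class ToyResumableWriter
{
    // Appends items to 'buffer' until adding another would exceed 'capacity'.
    // Returns false when it had to stop early; 'state' records where to resume.
    // The state is a struct passed by ref so the caller observes the update.
    public static bool TryWrite(IReadOnlyList<string> items, StringBuilder buffer, int capacity, ref ToyWriteState state)
    {
        while (state.NextIndex < items.Count)
        {
            string item = items[state.NextIndex];
            if (buffer.Length > 0 && buffer.Length + item.Length > capacity)
            {
                return false; // suspend: the caller should flush and call again
            }
            buffer.Append(item);
            state.NextIndex++;
        }
        return true; // everything has been written
    }

    // Driver loop: flushes the buffer each time the write suspends, then resumes.
    public static List<string> WriteAll(IReadOnlyList<string> items, int capacity)
    {
        var flushed = new List<string>();
        var buffer = new StringBuilder();
        var state = new ToyWriteState();
        bool done;
        do
        {
            done = TryWrite(items, buffer, capacity, ref state);
            if (buffer.Length > 0)
            {
                flushed.Add(buffer.ToString()); // stands in for flushing to a Stream
                buffer.Clear();
            }
        } while (!done);
        return flushed;
    }
}
```

A converter that can only expose an all-or-nothing `Write` has no way to return control at the `return false` point, so the surrounding serializer cannot flush until the converter finishes.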
Proposal
Here is a rough sketch of what this API could look like:
Users should be able to author custom converters that can take full advantage of async serialization, and compose correctly with the contextual serialization state. This is particularly important in the case of library authors, who might want to extend async serialization support for custom sets of types. It could also be used to author top-level async serialization methods that target other data sources (e.g. using System.IO.Pipelines cf. https://github.com/dotnet/runtime/issues/29902)
Usage Examples
Alternative designs
We might want to consider the viability of attaching the state values as properties on `Utf8JsonWriter` and `Utf8JsonReader`. It would avoid the need to introduce certain overloads, but on the flip side it could break scenaria where the writer/reader objects are being passed to nested serialization operations.
Goals
`Stream` (à la https://github.com/dotnet/runtime/issues/29902).
Progress