Expose instance methods in JsonSerializer

eiriktsarpalis commented 2 years ago

Background & Motivation

The current design of the JsonSerializer class exposes all serialization functionality as static methods accepting configuration parametrically (and optionally). In hindsight, this design has contributed to a few ergonomic problems when using this API:

Forgetting to pass JsonSerializerOptions to the serialization methods:
```
var options = new JsonSerializerOptions { Converters = { new JsonStringEnumConverter() }};
JsonSerializer.Serialize(MyEnum.Value);
```
This is probably the most common issue -- I've personally fallen for this too many times.
Creating a new JsonSerializerOptions instance on each serialization:
```
foreach (MyPoco value in values)
{
     var options = new JsonSerializerOptions  { Converters = { new JsonStringEnumConverter() }};
     JsonSerializer.Serialize(value, options); // recalculates JSON contracts from scratch on each iteration
}
```
Even though this anti-pattern has been explicitly documented as a potential performance bug, users keep falling for it. We've made attempts to mitigate the problem by implementing shared metadata caches but ultimately the underlying issue still affects users. See also #65396 for plans an adding an analyzer in this space.

Pervasive RequiresUnreferenceCode/RequiresDynamicCode annotations: all serialization methods accepting JsonSerializerOptions have been annotated as linker/AOT-unsafe even though legitimate scenaria exist where this is not the case:

 var options = new JsonSerializerOptions { TypeInfoResolver = MySourceGeneratedContext.Default };
 JsonSerializer.Serialize(value, options); // warning: JSON serialization and deserialization might require types that cannot be statically analyzed.

Bifurcation of API surface between reflection and source generation: users need to call into distinct methods depending on whether they use sourcegen or reflection. The distinction between JsonSerializerOptions and JsonSerializerContext has always been tenuous and has been rendered obsolete with the infrastructural changes introduced by https://github.com/dotnet/runtime/issues/63686. Arguably, the source of JSON contracts is a configuration detail that should be dealt with at the composition root and not concern any serialization methods.

API Proposal

This issue proposes we expose instance methods in JsonSerializer (or some different class):

namespace System.Text.Json;

public partial class JsonSerializer // Changed from static class
{
    public JsonSerializerOptions Options { get; }

    public JsonSerializer(JsonSerializerOptions options); // linker-safe constructor, throws if options.TypeInfoResolver == null;
                                                          // we might consider adding a linker-unsafe factory that populates TypeInfoResolver with the reflection resolver, like the existing serialization APIs do.

    [RequiresUnreferencedCode]
    public static JsonSerializer Default { get; } // serializer wrapping JsonSerializerOptions.Default

    /* Serialization APIs */

    public string Serialize<TValue>(TValue value);
    public byte[] SerializeToUtf8Bytes<TValue>(TValue value);
    public void Serialize<TValue>(Utf8JsonWriter utf8Json, TValue value);
    public void Serialize<TValue>(Stream utf8Json, TValue value);
    public void SerializeAsync<TValue>(Stream utf8Json, TValue value);
    public JsonDocument SerializeToDocument<TValue>(TValue value);
    public JsonElement SerializeToElement<TValue>(TValue value);
    public JsonNode SerializeToNode<TValue>(TValue value);

    public string Serialize(object? value, Type inputType);
    public byte[] SerializeToUtf8Bytes(object? value, Type inputType);
    public void Serialize(Utf8JsonWriter writer, object? value, Type inputType);
    public void Serialize(Stream utf8Json, object? value, Type inputType);
    public void SerializeAsync(Stream utf8Json, object? value, Type inputType);
    public JsonDocument SerializeToDocument(object? value, Type inputType);
    public JsonElement SerializeToElement(object? value, Type inputType);
    public JsonNode SerializeToNode(object? value, Type inputType);

    /* Deserialization APIs */

    public TValue? Deserialize<TValue>(string json);
    public TValue? Deserialize<TValue>(ReadOnlySpan<char> json);
    public TValue? Deserialize<TValue>(ReadOnlySpan<byte> utf8Json);
    public TValue? Deserialize<TValue>(ref Utf8JsonReader reader);
    public TValue? Deserialize<TValue>(Stream utf8Json);
    public ValueTask<TValue?> DeserializeAsync<TValue>(Stream utf8Json);
    public IAsyncEnumerable<TValue?> DeserializeAsyncEnumerable<TValue>(Stream utf8Json);
    public TValue? Deserialize<TValue>(JsonDocument document);
    public TValue? Deserialize<TValue>(JsonElement element);
    public TValue? Deserialize<TValue>(JsonNode node);

    public object? Deserialize(ReadOnlySpan<byte> utf8Json, Type returnType);
    public object? Deserialize(ReadOnlySpan<char> json, Type returnType);
    public object? Deserialize(string json, Type returnType);
    public object? Deserialize(JsonDocument document, Type returnType);
    public object? Deserialize(JsonElement element, Type returnType);
    public object? Deserialize(JsonNode node, Type returnType);
    public object? Deserialize(ref Utf8JsonReader reader, Type returnType);
    public object? Deserialize(Stream utf8Json, Type returnType);
    public ValueTask<object?> DeserializeAsync(Stream utf8Json, Type returnType);
}

namespace System.Text.Json.Serialization;

public partial class JsonSerializerContext
{
     public JsonSerializer Serializer { get; }
}

Usage Examples

Using the reflection-based serializer

/* composition root */
var options = new JsonSerializerOptions 
{ 
    Converters = new JsonStringEnumConverter(), 
    TypeInfoResolver = new DefaultJsonTypeInfoResolver() 
};

var serializer = new JsonSerializer(options);

/* usage */
serializer.Serialize(value);

Using the source generator

JsonSerializer serializer = MyContext.Default.Serializer;
serializer.Serialize(new MyPoco()); // Serializes using the source generator 

[JsonSerializable(typeof(MyPoco))]
public partial class MyContext : JsonSerializerContext
{}

Open Questions

Should we reuse the existing JsonSerializer class or introduce a new type?
Should we have an obsoletion plan for static APIs accepting JsonSerializerOptions?

Related to #65396, #31094, #64646

cc @eerhardt @davidfowl @ericstj

ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis See info in area-owners.md if you want to be subscribed.

Issue Details

## Background & Motivation The current design of the [`JsonSerializer` class](https://docs.microsoft.com/en-us/dotnet/api/system.text.json.jsonserializer?view=net-6.0) exposes all serialization functionality as static methods accepting configuration parametrically (and optionally). In hindsight, this design has contributed to a few ergonomic problems when using this API: 1. Forgetting to pass `JsonSerializerOptions` to the serialization methods: ```C# var options = new JsonSerializerOptions { Converters = { new JsonStringEnumConverter() }}; JsonSerializer.Serialize(MyEnum.Value); ``` This is probably the most common issue -- I've personally fallen for this too many times. 2. Creating a new `JsonSerializerOptions` instance on each serialization: ```C# foreach (MyPoco value in values) { var options = new JsonSerializerOptions { Converters = { new JsonStringEnumConverter() }}; JsonSerializer.Serialize(value, options); // recalculates JSON contracts from scratch on each iteration } ``` Even though this anti-pattern has been [explicitly documented](https://docs.microsoft.com/en-us/dotnet/standard/serialization/system-text-json-configure-options?pivots=dotnet-6-0#reuse-jsonserializeroptions-instances) as a potential performance bug, users keep falling for it. We've made attempts to mitigate the problem by implementing [shared metadata caches](https://github.com/dotnet/runtime/pull/64646) but ultimately the underlying issue still affects users. See also #65396 for plans an adding an analyzer in this space. 3. Pervasive `RequiresUnreferenceCode`/`RequiresDynamicCode` annotations: all serialization methods accepting `JsonSerializerOptions` have been annotated as linker/AOT-unsafe even though legitimate scenaria exist where this is not the case: ```C# var options = new JsonSerializerOptions { TypeInfoResolver = MySourceGeneratedContext.Default }; JsonSerializer.Serialize(value, options); // warning: JSON serialization and deserialization might require types that cannot be statically analyzed. ``` ## API Proposal This issue proposes we expose instance methods in `JsonSerializer` (or some different class): ```C# namespace System.Text.Json; public partial class JsonSerializer // Changed from static class { public JsonSerializerOptions Options { get; } public JsonSerializer(JsonSerializerOptions options); // linker-safe constructor, throws if options.TypeInfoResolver == null; // we might consider adding a linker-unsafe factory that populates TypeInfoResolver with the reflection resolver, like the existing serialization APIs do. [RequiresUnreferencedCode] public static JsonSerializer Default { get; } // serializer wrapping JsonSerializerOptions.Default /* Serialization APIs */ public string Serialize(TValue value); public byte[] SerializeToUtf8Bytes(TValue value); public void Serialize(Utf8JsonWriter utf8Json, TValue value); public void Serialize(Stream utf8Json, TValue value); public void SerializeAsync(Stream utf8Json, TValue value); public JsonDocument SerializeToDocument(TValue value); public JsonElement SerializeToElement(TValue value); public JsonNode SerializeToNode(TValue value); public string Serialize(object? value, Type inputType); public byte[] SerializeToUtf8Bytes(object? value, Type inputType); public void Serialize(Utf8JsonWriter writer, object? value, Type inputType); public void Serialize(Stream utf8Json, object? value, Type inputType); public void SerializeAsync(Stream utf8Json, object? value, Type inputType); public JsonDocument SerializeToDocument(object? value, Type inputType); public JsonElement SerializeToElement(object? value, Type inputType); public JsonNode SerializeToNode(object? value, Type inputType); /* Deserialization APIs */ public TValue? Deserialize(string json); public TValue? Deserialize(ReadOnlySpan json); public TValue? Deserialize(ReadOnlySpan utf8Json); public TValue? Deserialize(ref Utf8JsonReader reader); public TValue? Deserialize(Stream utf8Json); public ValueTask DeserializeAsync(Stream utf8Json); public IAsyncEnumerable DeserializeAsyncEnumerable(Stream utf8Json); public TValue? Deserialize(JsonDocument document); public TValue? Deserialize(JsonElement element); public TValue? Deserialize(JsonNode node); public object? Deserialize(ReadOnlySpan utf8Json, Type returnType); public object? Deserialize(ReadOnlySpan json, Type returnType); public object? Deserialize(string json, Type returnType); public object? Deserialize(JsonDocument document, Type returnType); public object? Deserialize(JsonElement element, Type returnType); public object? Deserialize(JsonNode node, Type returnType); public object? Deserialize(ref Utf8JsonReader reader, Type returnType); public object? Deserialize(Stream utf8Json, Type returnType); public ValueTask DeserializeAsync(Stream utf8Json, Type returnType); } namespace System.Text.Json.Serialization; public partial class JsonSerializerContext { public JsonSerializer Serializer { get; } } ``` ## Usage Examples Using the reflection-based serializer ```C# /* composition root */ var options = new JsonSerializerOptions { Converters = new JsonStringEnumConverter(), TypeInfoResolver = new DefaultJsonTypeInfoResolver() }; var serializer = new JsonSerializer(options); /* usage */ serializer.Serialize(value); ``` Using the source generator ```C# JsonSerializer serializer = MyContext.Default.Serializer; serializer.Serialize(new MyPoco()); // Serializes using the source generator [JsonSerializable(typeof(MyPoco))] public partial class MyContext : JsonSerializerContext {} ``` ### Open Questions * Should we reuse the existing `JsonSerializer` class or introduce a new type? * Should we have an obsoletion plan for static APIs accepting `JsonSerializerOptions`? Related to #65396, #31094, #64646 cc @eerhardt @davidfowl @ericstj

Author:	eiriktsarpalis
Assignees:	-
Labels:	`area-System.Text.Json`
Milestone:	-

krwq commented 2 years ago

IMO while it could be a good idea when we first created this project I'm not so convinced to do it now and creating second set of the same APIs elsewhere just feels weird. I think it might make more sense to create code analyzer to hint people they're using it wrong. I'm also tempted to say if anything maybe put those methods on the options directly but then it would "weird really read": options.Serialize(...)

eerhardt commented 2 years ago

Should we have an obsoletion plan for static APIs accepting JsonSerializerOptions?

Why would we obsolete this?

eiriktsarpalis commented 2 years ago

Should we have an obsoletion plan for static APIs accepting JsonSerializerOptions?

Why would we obsolete this?

We don't need to, strictly speaking, but it seems to be too much of a pit of failure at the moment. Obsoletion could be one way to guide users to the new APIs.

davidfowl commented 2 years ago

Or an analyzer?

layomia commented 2 years ago

Pervasive RequiresUnreferenceCode/RequiresDynamicCode annotations: all serialization methods accepting JsonSerializerOptions have been annotated as linker/AOT-unsafe even though legitimate scenaria exist where this is not the case

This is pretty compelling, but if we're not going to deprecate the existing static APIs, then it seems moot given that there are existing patterns to avoid the linker warnings. New APIs would add to an increasingly high concept count for using the serializer. New fixers to help users to help users detect and switch bad patterns would perhaps be sufficient to solve these problems.

gregsdennis commented 2 years ago

@eiriktsarpalis can you show how your problem examples would change? This might help explain how exposing these methods will solve the problems those scenarios present.

Also, if we've already documented that people shouldn't do these things, how will these changes alter users' behavior?

eiriktsarpalis commented 2 years ago

it seems moot given that there are existing patterns to avoid the linker warnings.

Unfortunately, that is not the case when it comes to using customized contract resolvers. These can only be consumed via the serialization methods that accept the optional JsonSerializerOptions argument which must be marked as RUC/RDC because of their default behavior. A user could write their own source generated contract resolver (and in fact it is conceivable that future iterations of the inbox source generator could target IJsonTypeInfoResolver directly instead of proxying via JsonSerializerContext/JsonMetadataServices). The current APIs however provide no good way of achieving this without producing false-positive linker warnings (and no amount of analyzers/fixers can address that particular problem).

can you show how your problem examples would change? This might help explain how exposing these methods will solve the problems those scenarios present.

I think the "Usage Examples" section might clarify how these could be refactored, but ultimately the idea is that all serialization configuration is encapsulated behind a materialized "JsonSerializer" instance, so passing the right configuration becomes the responsibility of the composition root, rather than the callsite's, which is more conducive to applying DI patterns:


public class MyClass
{
     // Use constructor injection to determine serialization configuration; could be reflection or sourcegen or a mix of both.
     // Currently we need to call into separate sets of APIs in order to avoid linker warnings.
     private readonly JsonSerializer _serializer;
     public MyClass(JsonSerializer serializer) => _serializer = serializer;

     public string GetJson() => _serializer.Serialize(_data); // instance methods do not accept JsonSerializerOptions and do not differentiate between "reflection" and "sourcegen" flavors. There is no need for any linker annotations here.
}

steveharter commented 2 years ago

The primary original reason for static methods was to be zero-alloc at least for simpler scenarios.

Also instantiating a "serializer" class may end up having the same problem we have today for those who just new up the "options" class on every use.

IMO adding the Serialize() methods should be done on JsonSerializerContext or a derived class.

eiriktsarpalis commented 2 years ago

The primary original reason for static methods was to be zero-alloc at least for simpler scenarios.

It's definitely useful for quickly producing JSON when configuration is not a concern, but it definitely suffers from ergonomic concerns in less trivial scenaria. Arguably this can also be achieved via the new JsonSerializerOptions.Default property.

Also instantiating a "serializer" class may end up having the same problem we have today for those who just new up the "options" class on every use.

Agree that this is still a theoretical concern, but even though an optional parameter in a static method is (for most intents and purposes) equivalent to exposing an instance method on the parameter itself, the two approaches present themselves very differently to users not familiar with the library. Arguably the latter approach communicates more clearly that we're dealing with a serialization context and primes users to think of the type in terms of DI.

IMO adding the Serialize() methods should be done on JsonSerializerContext or a derived class.

Agree in principle, but that ship has sailed when it comes to JsonSerializerContext specifically. Even though we shipped APIs that accept JsonSerializerContext as a first-class parameter, its design fundamentally makes it a configuration facet of JsonSerializerOptions (effectively rendering JsonSerializerOptions the only first-class configuration type).

The work in https://github.com/dotnet/runtime/issues/61734 doubles down on that design: starting with .NET 7 JsonSerializerContext is simply one implementation of IJsonTypeInfoResolver, which is yet another configuration setting on JsonSerializerOptions.

Regarding the relationship of JsonSerializerContext and JsonSerializerOptions, I just updated the OP with a further point:

Bifurcation of API surface between reflection and source generation: users need to call into distinct methods depending on whether they use sourcegen or reflection. The distinction between JsonSerializerOptions and JsonSerializerContext has always been tenuous and has been rendered obsolete with the infrastructural changes introduced by https://github.com/dotnet/runtime/issues/63686. Arguably, the source of JSON contracts is a configuration detail that should be dealt with at the composition root and not concern any serialization methods.

eiriktsarpalis commented 1 year ago

Here's a sketch of how one could define a serializer instance using .NET 7 APIs that is completely AOT/linker-safe:

public partial class JsonSerializerInstance
{
    private readonly JsonSerializerOptions _options;

    public JsonSerializerInstance(JsonSerializerOptions options)
    {
        if (options.TypeInfoResolver is null)
        {
            // This is a departure from default semantics in JsonSerializer methods that accept JsonSerializerOptions,
            // but it's essential for preserving AOT/linker safety in the serializer instance constructor.
            throw new ArgumentException("The options parameter must specify a TypeInfoResolver value.", nameof(options));
        }

        _options = options;
    }

    // Serialization/Deserialization methods call into JsonTypeInfo<T> overloads that are marked as linker-safe.
    public string Serialize<T>(T value) => JsonSerializer.Serialize(value, GetTypeInfo<T>());
    public T? Deserialize<T>(string json) => JsonSerializer.Deserialize(json, GetTypeInfo<T>());

    public Task SerializeAsync<T>(Stream utf8Json, T value, CancellationToken cancellationToken = default) 
        => JsonSerializer.SerializeAsync(utf8Json, value, GetTypeInfo<T>(), cancellationToken);

    public ValueTask<T?> DeserializeAsync<T>(Stream utf8Json, CancellationToken cancellationToken = default)
        => JsonSerializer.DeserializeAsync(utf8Json, GetTypeInfo<T>(), cancellationToken);

    private JsonTypeInfo<T> GetTypeInfo<T>() => (JsonTypeInfo<T>)_options.GetTypeInfo(typeof(T));

Any AOT/linker warning annotations are now exclusively the purview of factories configuring serializer instances:

public partial class JsonSerializerInstance
{
    [RequiresDynamicCode("Method uses the default reflection-based resolver which requires dynamic code.")]
    [RequiresUnreferencedCode("Method uses the default reflection-based resolver which requires unreferenced code.")]
    public static JsonSerializerInstance Default { get; } = new(JsonSerializerOptions.Default);

    [RequiresDynamicCode("Method uses the default reflection-based resolver which requires dynamic code.")]
    [RequiresUnreferencedCode("Method uses the default reflection-based resolver which requires unreferenced code.")]
    public static CreateUsingJsonSerializerSemantics(JsonSerializerOptions options)
    {
          options.TypeInfoResolver ??= new DefaultJsonTypeInfoResolver();
          return new(options);
    }

    // No annotation needed since JsonSerializerContext instances are linker safe
    public static JsonSerializerInstance Create(JsonSerializerContext jsonSerializerContext) => new(jsonSerializerContext.Options);
}

As soon as a "JsonSerializerInstance" is materialized, it can be used without concern for linker warnings and is agnostic w.r.t. whether it uses source gen or reflection serialization.

qszhuan commented 1 year ago

why can't you provide a way to set the options globally just like Newtonsoft.Json? The fact with instance method is it won't solve all the issues, as the libraries using JsonSerializer still use the default options.

For example, BinaryData has an method ToObjectFromJson, we can't pass the JsonSerializer instance to it, as it only accepts JsonSerializerOptions.

It will be annoying that if we have to take care of this kind of thing everywhere we uses it.

eiriktsarpalis commented 1 year ago

why can't you provide a way to set the options globally just like Newtonsoft.Json

That approach has its own set of problems, see https://github.com/dotnet/runtime/issues/31094#issuecomment-1235451167

agocke commented 1 year ago

I'm not sure I fully understand how this solves the problem, but I may just not understand Json source generators well enough.

Let's say I want to create an HttpClient with JsonSerialization

var client = new HttpClient(new JsonSerializer(new  MyContext.Default.Serializer));
client.WriteAsJson(new MyType1());
client.WriteAsJson(new MyType2());

How does this work? Does that MyTypeContext contain the JsonSerializer for all of the types automatically? Or do you need to specify which types it contains via the JsonSerializable attribute? If that's right, what if you need more than one context? Is the intent to allow multiple contexts to be passed?

Also, does this mean that each type may get a source-gen'd implementation generated multiple times in each assembly?

davidfowl commented 1 year ago

Or do you need to specify which types it contains via the JsonSerializable attribute?

This is the answer.

If that's right, what if you need more than one context? Is the intent to allow multiple contexts to be passed?

You can use this new API to combine contexts.

Also, does this mean that each type may get a source-gen'd implementation generated multiple times in each assembly?

I think it depends on where the JsonSerializationContext was defined. If each assembly has their own with the same type, you'd get 3 copies of the source genned context.

eiriktsarpalis commented 1 year ago

Also, does this mean that each type may get a source-gen'd implementation generated multiple times in each assembly?

I think it depends on where the JsonSerializationContext was defined. If each assembly has their own with the same type, you'd get 3 copies of the source genned context.

And that should be a fairly common scenario, given that each JsonSerializerContext generates metadata for the entire transitive closure of its type dependencies. The order of arguments in JsonTypeInfoResolver.Combine is significant, as it will return the first result in that sequence that is non-null.

agocke commented 1 year ago

OK, this makes sense then, thanks.

And that should be a fairly common scenario, given that each JsonSerializerContext generates metadata for the entire transitive closure of its type dependencies.

This is good to know. Given that code size is a metric we're tracking for Native AOT we should keep an eye on how this scales out with larger apps. No action for now, though.

eiriktsarpalis commented 1 year ago

Generally speaking it should contribute to code size increases, see https://github.com/dotnet/runtime/issues/77897 as an example.

Sergio0694 commented 1 year ago

I love this proposal so far! Have a few questions as well 😄

JsonSerializer serializer = MyContext.Default.Serializer;
serializer.Serialize(new MyPoco()); // Serializes using the source generator 

[JsonSerializable(typeof(MyPoco))]
public partial class MyContext : JsonSerializerContext
{}

I read the previous conversation here and didn't see anyone commenting on this yet - looking at that example I'm a bit worried about the fact you don't have an explicit T check anymore compared to using a JsonTypeInfo<T> instance. In the Microsoft Store, I've added a few extensions for JsonTypeInfo<T> that mostly just forward to JsonSerializer.<METHOD>(...), passing the input type info and additional arguments (as well as normalizing exceptions as a workaround for #78029 until that's fixed, but that's a separate issue). This way we can easily enforce that using the extension is the only way to serialize/deserialize, so that all callsites would just look like this:

SomeModel model = MicrosoftStoreJsonSerializerContext.Default.SomeModel.Deserialize(jsonText);

This makes it very easy to see that:

The JsonSerializerContext is always being passed when serializing/deserializing
And that the T you're using is definitely present (which is validated at compile time).

In the example above for the new API instead, there's no way to be sure just by looking at the callsite that the JsonTypeInfo<T> for that T is present. As in, one might've forgotten to add the [JsonSerializable] annotation on the context and you'd get no build error there. Are there plans to enforce this, or to maybe add an analyzer to validate callsites to account for this, or something? Additionally, are there any performance concerns in case there's tons of type infos being generated that this approach would have callers pass the whole context (which means it'd have to resolve the right type info), instead of just statically retrieving the correct JsonTypeInfo<T> member directly as a direct property read? 🤔

"Generally speaking it should contribute to code size increases, see https://github.com/dotnet/runtime/issues/77897 as an example."

Just in case there's anyone else curious about that and/or following the issue, we're working on adjusting things on our end as well to help reduce the binary size impact (as a separate effort than the planned improvements for the JSON generator), so we'll share what we've learnt and what numbers we got once we're done with that. I'm hoping we'll be able to have a noticeable binary size reduction compared to that whopping +17MB increase by just tweaking our JSON models 🙂

And of course, all the planned improvements to reuse generate stubs should also help a ton for folks in these situations.