Open eiriktsarpalis opened 2 years ago
Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis See info in area-owners.md if you want to be subscribed.
IMO while it could be a good idea when we first created this project I'm not so convinced to do it now and creating second set of the same APIs elsewhere just feels weird. I think it might make more sense to create code analyzer to hint people they're using it wrong. I'm also tempted to say if anything maybe put those methods on the options directly but then it would "weird really read": options.Serialize(...)
Should we have an obsoletion plan for static APIs accepting JsonSerializerOptions?
Why would we obsolete this?
Should we have an obsoletion plan for static APIs accepting JsonSerializerOptions?
Why would we obsolete this?
We don't need to, strictly speaking, but it seems to be too much of a pit of failure at the moment. Obsoletion could be one way to guide users to the new APIs.
Or an analyzer?
Pervasive RequiresUnreferenceCode/RequiresDynamicCode annotations: all serialization methods accepting JsonSerializerOptions have been annotated as linker/AOT-unsafe even though legitimate scenaria exist where this is not the case
This is pretty compelling, but if we're not going to deprecate the existing static APIs, then it seems moot given that there are existing patterns to avoid the linker warnings. New APIs would add to an increasingly high concept count for using the serializer. New fixers to help users to help users detect and switch bad patterns would perhaps be sufficient to solve these problems.
@eiriktsarpalis can you show how your problem examples would change? This might help explain how exposing these methods will solve the problems those scenarios present.
Also, if we've already documented that people shouldn't do these things, how will these changes alter users' behavior?
it seems moot given that there are existing patterns to avoid the linker warnings.
Unfortunately, that is not the case when it comes to using customized contract resolvers. These can only be consumed via the serialization methods that accept the optional JsonSerializerOptions
argument which must be marked as RUC/RDC because of their default behavior. A user could write their own source generated contract resolver (and in fact it is conceivable that future iterations of the inbox source generator could target IJsonTypeInfoResolver
directly instead of proxying via JsonSerializerContext
/JsonMetadataServices
). The current APIs however provide no good way of achieving this without producing false-positive linker warnings (and no amount of analyzers/fixers can address that particular problem).
can you show how your problem examples would change? This might help explain how exposing these methods will solve the problems those scenarios present.
I think the "Usage Examples" section might clarify how these could be refactored, but ultimately the idea is that all serialization configuration is encapsulated behind a materialized "JsonSerializer" instance, so passing the right configuration becomes the responsibility of the composition root, rather than the callsite's, which is more conducive to applying DI patterns:
public class MyClass
{
// Use constructor injection to determine serialization configuration; could be reflection or sourcegen or a mix of both.
// Currently we need to call into separate sets of APIs in order to avoid linker warnings.
private readonly JsonSerializer _serializer;
public MyClass(JsonSerializer serializer) => _serializer = serializer;
public string GetJson() => _serializer.Serialize(_data); // instance methods do not accept JsonSerializerOptions and do not differentiate between "reflection" and "sourcegen" flavors. There is no need for any linker annotations here.
}
The primary original reason for static methods was to be zero-alloc at least for simpler scenarios.
Also instantiating a "serializer" class may end up having the same problem we have today for those who just new up the "options" class on every use.
IMO adding the Serialize() methods should be done on JsonSerializerContext
or a derived class.
The primary original reason for static methods was to be zero-alloc at least for simpler scenarios.
It's definitely useful for quickly producing JSON when configuration is not a concern, but it definitely suffers from ergonomic concerns in less trivial scenaria. Arguably this can also be achieved via the new JsonSerializerOptions.Default
property.
Also instantiating a "serializer" class may end up having the same problem we have today for those who just new up the "options" class on every use.
Agree that this is still a theoretical concern, but even though an optional parameter in a static method is (for most intents and purposes) equivalent to exposing an instance method on the parameter itself, the two approaches present themselves very differently to users not familiar with the library. Arguably the latter approach communicates more clearly that we're dealing with a serialization context and primes users to think of the type in terms of DI.
IMO adding the Serialize() methods should be done on JsonSerializerContext or a derived class.
Agree in principle, but that ship has sailed when it comes to JsonSerializerContext
specifically. Even though we shipped APIs that accept JsonSerializerContext
as a first-class parameter, its design fundamentally makes it a configuration facet of JsonSerializerOptions
(effectively rendering JsonSerializerOptions
the only first-class configuration type).
The work in https://github.com/dotnet/runtime/issues/61734 doubles down on that design: starting with .NET 7 JsonSerializerContext
is simply one implementation of IJsonTypeInfoResolver
, which is yet another configuration setting on JsonSerializerOptions
.
Regarding the relationship of JsonSerializerContext
and JsonSerializerOptions
, I just updated the OP with a further point:
- Bifurcation of API surface between reflection and source generation: users need to call into distinct methods depending on whether they use sourcegen or reflection. The distinction between
JsonSerializerOptions
andJsonSerializerContext
has always been tenuous and has been rendered obsolete with the infrastructural changes introduced by https://github.com/dotnet/runtime/issues/63686. Arguably, the source of JSON contracts is a configuration detail that should be dealt with at the composition root and not concern any serialization methods.
Here's a sketch of how one could define a serializer instance using .NET 7 APIs that is completely AOT/linker-safe:
public partial class JsonSerializerInstance
{
private readonly JsonSerializerOptions _options;
public JsonSerializerInstance(JsonSerializerOptions options)
{
if (options.TypeInfoResolver is null)
{
// This is a departure from default semantics in JsonSerializer methods that accept JsonSerializerOptions,
// but it's essential for preserving AOT/linker safety in the serializer instance constructor.
throw new ArgumentException("The options parameter must specify a TypeInfoResolver value.", nameof(options));
}
_options = options;
}
// Serialization/Deserialization methods call into JsonTypeInfo<T> overloads that are marked as linker-safe.
public string Serialize<T>(T value) => JsonSerializer.Serialize(value, GetTypeInfo<T>());
public T? Deserialize<T>(string json) => JsonSerializer.Deserialize(json, GetTypeInfo<T>());
public Task SerializeAsync<T>(Stream utf8Json, T value, CancellationToken cancellationToken = default)
=> JsonSerializer.SerializeAsync(utf8Json, value, GetTypeInfo<T>(), cancellationToken);
public ValueTask<T?> DeserializeAsync<T>(Stream utf8Json, CancellationToken cancellationToken = default)
=> JsonSerializer.DeserializeAsync(utf8Json, GetTypeInfo<T>(), cancellationToken);
private JsonTypeInfo<T> GetTypeInfo<T>() => (JsonTypeInfo<T>)_options.GetTypeInfo(typeof(T));
Any AOT/linker warning annotations are now exclusively the purview of factories configuring serializer instances:
public partial class JsonSerializerInstance
{
[RequiresDynamicCode("Method uses the default reflection-based resolver which requires dynamic code.")]
[RequiresUnreferencedCode("Method uses the default reflection-based resolver which requires unreferenced code.")]
public static JsonSerializerInstance Default { get; } = new(JsonSerializerOptions.Default);
[RequiresDynamicCode("Method uses the default reflection-based resolver which requires dynamic code.")]
[RequiresUnreferencedCode("Method uses the default reflection-based resolver which requires unreferenced code.")]
public static CreateUsingJsonSerializerSemantics(JsonSerializerOptions options)
{
options.TypeInfoResolver ??= new DefaultJsonTypeInfoResolver();
return new(options);
}
// No annotation needed since JsonSerializerContext instances are linker safe
public static JsonSerializerInstance Create(JsonSerializerContext jsonSerializerContext) => new(jsonSerializerContext.Options);
}
As soon as a "JsonSerializerInstance" is materialized, it can be used without concern for linker warnings and is agnostic w.r.t. whether it uses source gen or reflection serialization.
why can't you provide a way to set the options globally just like Newtonsoft.Json
? The fact with instance method is it won't solve all the issues, as the libraries using JsonSerializer still use the default options.
For example, BinaryData
has an method ToObjectFromJson, we can't pass the JsonSerializer instance to it, as it only accepts JsonSerializerOptions
.
It will be annoying that if we have to take care of this kind of thing everywhere we uses it.
why can't you provide a way to set the options globally just like Newtonsoft.Json
That approach has its own set of problems, see https://github.com/dotnet/runtime/issues/31094#issuecomment-1235451167
I'm not sure I fully understand how this solves the problem, but I may just not understand Json source generators well enough.
Let's say I want to create an HttpClient with JsonSerialization
var client = new HttpClient(new JsonSerializer(new MyContext.Default.Serializer));
client.WriteAsJson(new MyType1());
client.WriteAsJson(new MyType2());
How does this work? Does that MyTypeContext contain the JsonSerializer for all of the types automatically? Or do you need to specify which types it contains via the JsonSerializable
attribute? If that's right, what if you need more than one context? Is the intent to allow multiple contexts to be passed?
Also, does this mean that each type may get a source-gen'd implementation generated multiple times in each assembly?
Or do you need to specify which types it contains via the JsonSerializable attribute?
This is the answer.
If that's right, what if you need more than one context? Is the intent to allow multiple contexts to be passed?
You can use this new API to combine contexts.
Also, does this mean that each type may get a source-gen'd implementation generated multiple times in each assembly?
I think it depends on where the JsonSerializationContext
was defined. If each assembly has their own with the same type, you'd get 3 copies of the source genned context.
Also, does this mean that each type may get a source-gen'd implementation generated multiple times in each assembly?
I think it depends on where the
JsonSerializationContext
was defined. If each assembly has their own with the same type, you'd get 3 copies of the source genned context.
And that should be a fairly common scenario, given that each JsonSerializerContext
generates metadata for the entire transitive closure of its type dependencies. The order of arguments in JsonTypeInfoResolver.Combine
is significant, as it will return the first result in that sequence that is non-null.
OK, this makes sense then, thanks.
And that should be a fairly common scenario, given that each JsonSerializerContext generates metadata for the entire transitive closure of its type dependencies.
This is good to know. Given that code size is a metric we're tracking for Native AOT we should keep an eye on how this scales out with larger apps. No action for now, though.
Generally speaking it should contribute to code size increases, see https://github.com/dotnet/runtime/issues/77897 as an example.
I love this proposal so far! Have a few questions as well π
JsonSerializer serializer = MyContext.Default.Serializer; serializer.Serialize(new MyPoco()); // Serializes using the source generator [JsonSerializable(typeof(MyPoco))] public partial class MyContext : JsonSerializerContext {}
I read the previous conversation here and didn't see anyone commenting on this yet - looking at that example I'm a bit worried about the fact you don't have an explicit T
check anymore compared to using a JsonTypeInfo<T>
instance. In the Microsoft Store, I've added a few extensions for JsonTypeInfo<T>
that mostly just forward to JsonSerializer.<METHOD>(...)
, passing the input type info and additional arguments (as well as normalizing exceptions as a workaround for #78029 until that's fixed, but that's a separate issue). This way we can easily enforce that using the extension is the only way to serialize/deserialize, so that all callsites would just look like this:
SomeModel model = MicrosoftStoreJsonSerializerContext.Default.SomeModel.Deserialize(jsonText);
This makes it very easy to see that:
JsonSerializerContext
is always being passed when serializing/deserializingT
you're using is definitely present (which is validated at compile time).In the example above for the new API instead, there's no way to be sure just by looking at the callsite that the JsonTypeInfo<T>
for that T
is present. As in, one might've forgotten to add the [JsonSerializable]
annotation on the context and you'd get no build error there. Are there plans to enforce this, or to maybe add an analyzer to validate callsites to account for this, or something? Additionally, are there any performance concerns in case there's tons of type infos being generated that this approach would have callers pass the whole context (which means it'd have to resolve the right type info), instead of just statically retrieving the correct JsonTypeInfo<T>
member directly as a direct property read? π€
"Generally speaking it should contribute to code size increases, see https://github.com/dotnet/runtime/issues/77897 as an example."
Just in case there's anyone else curious about that and/or following the issue, we're working on adjusting things on our end as well to help reduce the binary size impact (as a separate effort than the planned improvements for the JSON generator), so we'll share what we've learnt and what numbers we got once we're done with that. I'm hoping we'll be able to have a noticeable binary size reduction compared to that whopping +17MB increase by just tweaking our JSON models π
And of course, all the planned improvements to reuse generate stubs should also help a ton for folks in these situations.
Unfortunately, that is not the case when it comes to using customized contract resolvers. These can only be consumed via the serialization methods that accept the optional JsonSerializerOptions argument which must be marked as RUC/RDC because of their default behavior.
@eiriktsarpalis we could design API to attach a custom resolver to a context instance. Thought would have to into making it as clean as possible, but it should be just a wrapping.
OK, this makes sense then, thanks.
And that should be a fairly common scenario, given that each JsonSerializerContext generates metadata for the entire transitive closure of its type dependencies.
This is good to know. Given that code size is a metric we're tracking for Native AOT we should keep an eye on how this scales out with larger apps. No action for now, though.
@eiriktsarpalis @agocke @davidfowl it might be good to think about it now. In the first release of the source generator, we decided not to introduce a global cache for sharing/abstracting metadata between assemblies. I think we should revisit the decision. Some potential challenges with this approach which would be good to keep in mind are documented here.
My idea at the time (toward the end of .NET 6 dev time) was to use Assembly.GetCallingAssembly
as a mechanism to partition the cache but @jkotas pointed out that it was slow and unreliable. His thought was that the solution requires new thinking (in .NET 7 or after) across the ASP.NET, serializer(s), source generators, and IL linker.
Wanted to share this now in case it sparks some ideas.
Also cc @eerhardt - https://github.com/dotnet/runtime/issues/74492#issuecomment-1383298979.
The callsite rewriting gives you fast reliable equivalent of Assembly.GetCallingAssembly
, but I do not see how it can help to solve the versioning problems described in the blog post.
Gotcha. If we have a design to reduce the size of source generated code (https://github.com/dotnet/runtime/issues/77897) with the primary mechanism being shared caches between assemblies, do you think it would be incomplete if it doesn't solve the versioning issues? I recognize this area needs careful evaluation of the trade-offs (guided by .NET 8 goals).
I think large projects have high probability of running into versioning issues. It is very common for large projects to run into versioning issues with Newtonsoft.Json today. We do not want source generators turn into a new source of versioning problems that are impossible to solve.
If we are considering schemes that allow sharing of source generated serializers, I think we need to design it such that there is always an option to make explicit choice about when to share and when not share the source generated serializers.
The off-topic conversation was about reducing the size footprint of src-gen'd code. It was moved to the appropriate issue: https://github.com/dotnet/runtime/issues/77897#issuecomment-1383852837.
looking at that example I'm a bit worried about the fact you don't have an explicit T check anymore compared to using a JsonTypeInfo
instance.
The proposal concerns exposing functionality that supersedes the JsonSerializer
methods that accept JsonSerializerOptions
parameter. The overloads accepting JsonTypeInfo
don't have the aforementioned issues and as such the recommendation is to keep using them for statically typed serialization. I considered the possibility of proposing Serialize
and Deserialize
as extension methods on JsonTypeInfo
instances themselves, but ultimately wouldn't provide much benefit compared to the existing APIs -- as you're pointing out they're fairly easy to define if necessary.
Ah, gotcha, yeah that makes sense, thank you for clarifying! π
Somewhat related, this also makes me think whether people currently suffering from the issues described here (eg. creating options every time or forgetting to pass them) couldn't already solve the issue by just defining multiple option accessors in the context and then just statically accessing that like with the default one π€
To make an example, one thing we're doing in the Store is, we have some cases where we want to serialize/deserialize with case invariant properties (but it could be any example of "some statically known set of options different than the default ones to customize a given serialization operation"). What we did is to just add a separate property to our JSON context, like so:
public sealed partial class MicrosoftStoreJsonSerializerContext : JsonSerializerContext
{
private static MicrosoftStoreJsonSerializerContext? _camelCase;
public static MicrosoftStoreJsonSerializerContext CamelCase => _camelCase ??= new(new JsonSerializerOptions(s_defaultOptions) { PropertyNamingPolicy = JsonNamingPolicy.CamelCase });
}
This way whenever we need to use this, we simply use CamelCase
instead of Default
and keep the same pattern:
SomeModel model = MicrosoftStoreJsonSerializerContext.CamelCase.SomeModel.Deserialize(json);
I wonder whether at least for cases where the options being used only have statically known properties (eg. like the naming policy here) it wouldn't be simpler for people to use this approach instead, which also doesn't need any new APIs. I guess I'm just curious because by just looking at the examples, for cases where again you only have statically known options, it's not immediately clear to me why should one use the new instance method approach to replace using JsonSerializer
overloads taking explicit options, instead of just statically accessing the shared JSON context with the embedded custom options like above (which as mentioned also doesn't need any new API). Similarly for cases where JsonTypeInfo<T>
isn't used, this approach also lets one just pass YourJsonContext.TheCustomProperty.Options
to access the shared options to pass to JsonSerializer
, which also avoids the issue of potentially creating options for every call. To be clear, just trying to understand this proposal better π
Creating a new JsonSerializerOptions instance on each serialization
people currently suffering from the issues described here (eg. creating options every time
The other reasons may be sufficient to add such APIs, but I continue to believe adding them to avoid the pitfalls of creating options every time is misguided. We have plenty of evidence developers are just as happy to create a new instance of a serializer every time they need to serialize/deserialize something (see practically every use of BinaryFormatter), and even outside of the serializers (e.g. HttpClient), even when guidance strongly urges otherwise.
We need runtime analyzers π
Also instantiating a "serializer" class may end up having the same problem we have today for those who just new up the "options" class on every use.
I think people are a lot less likely to write new JsonSerializer().Serialize(obj)
than JsonSerializer.Serialize(obj, new JsonSerializerOptions(...))
Any update on this ?
I personally don't want to add extra attribute on every single Properties of my custom type.
@eiriktsarpalis
This design is not being pursued for the near term.
This design is not being pursued for the near term.
I may have misunderstood your words, but I have something like the following:
public class Customer
{
....
[JsonConverter(typeof(MsisdnConverter))]
public MSISDN Msisdn { get; set; }
....
}
public class SubscriberCard
{
....
[JsonConverter(typeof(MsisdnConverter))]
public MSISDN Msisdn { get; set; }
....
}
I have defined Msisdn
as a building block of my enterprise application.
I don't want to be worry about missing [JsonConverter(typeof(MsisdnConverter))]
in the rest of my project (we have several developers in our team and it's totally possible). Similar to what System.Text.Json
does with DateTime
, I want to register a JsonConverter
globally.
I believe this requirement is meaningful, and I have searched for information on it (Note: Newtonsoft.Json
has a way of handling this).
I found this Issue to be the closest one. I can provide you with a design for this feature, but I still believe it's related to this Issue.
@eiriktsarpalis
This appears to be unrelated to what is being discussed in this issue. You can apply a custom converter for your MSISDN
type globally today either by applying a JsonConverter
annotation on the type itself or by specifying it on the JsonSerializerOptions.Converters
property.
@eiriktsarpalis At https://github.com/dotnet/runtime/issues/31094#issuecomment-1166232038, you hinted that you'd prefer a design without an interface, and that is what you are proposing here as well.
I'm curious, what was the reasoning behind the team choosing not to define an abstract interface for this? Why have only the static API?
If we had an interface, then we'd at least have a standard way to encapsulate a serializer and for libraries to depend on that.
Also, that issue remains, to this day, one of the top liked issues in this repo, even though it has been locked -- thus freezing reactions -- for two years now, so there seems to be clear interest from the community. Could the team share any insights as to why this is not currently a priority? Is there anything the community can do to help?
Background & Motivation
The current design of the
JsonSerializer
class exposes all serialization functionality as static methods accepting configuration parametrically (and optionally). In hindsight, this design has contributed to a few ergonomic problems when using this API:Forgetting to pass
JsonSerializerOptions
to the serialization methods:This is probably the most common issue -- I've personally fallen for this too many times.
Creating a new
JsonSerializerOptions
instance on each serialization:Even though this anti-pattern has been explicitly documented as a potential performance bug, users keep falling for it. We've made attempts to mitigate the problem by implementing shared metadata caches but ultimately the underlying issue still affects users. See also #65396 for plans an adding an analyzer in this space.
Pervasive
RequiresUnreferenceCode
/RequiresDynamicCode
annotations: all serialization methods acceptingJsonSerializerOptions
have been annotated as linker/AOT-unsafe even though legitimate scenaria exist where this is not the case:Bifurcation of API surface between reflection and source generation: users need to call into distinct methods depending on whether they use sourcegen or reflection. The distinction between
JsonSerializerOptions
andJsonSerializerContext
has always been tenuous and has been rendered obsolete with the infrastructural changes introduced by https://github.com/dotnet/runtime/issues/63686. Arguably, the source of JSON contracts is a configuration detail that should be dealt with at the composition root and not concern any serialization methods.API Proposal
This issue proposes we expose instance methods in
JsonSerializer
(or some different class):Usage Examples
Using the reflection-based serializer
Using the source generator
Open Questions
JsonSerializer
class or introduce a new type?JsonSerializerOptions
?Related to #65396, #31094, #64646
cc @eerhardt @davidfowl @ericstj