dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.37k stars 4.75k forks source link

Size saving opportunity for System.Text.Json generators with custom converters #84015

Closed Sergio0694 closed 1 year ago

Sergio0694 commented 1 year ago

Description

Note: related to #77897 and more generally #79122

I was looking at the generated code from System.Text.Json in a few scenarios and noticed another case where we could get some small but nice size saving improvements, in a way that's proportional to how many custom converters a given user has. Consider:

[JsonConverter(typeof(MyConverter<MyEnum>))]
public enum MyEnum
{
}

If this type is directly or indirectly discovered by the generator, you'll end up with the following:

Full generated code (click to expand):
```csharp // #nullable enable annotations #nullable disable warnings // Suppress warnings about [Obsolete] member usage in generated code. #pragma warning disable CS0618 public sealed partial class MyContext { private global::System.Text.Json.Serialization.Metadata.JsonTypeInfo? _MyEnum; /// /// Defines the source generated JSON serialization contract metadata for a given type. /// public global::System.Text.Json.Serialization.Metadata.JsonTypeInfo MyEnum { get => _MyEnum ??= Create_MyEnum(Options, makeReadOnly: true); } private global::System.Text.Json.Serialization.Metadata.JsonTypeInfo Create_MyEnum(global::System.Text.Json.JsonSerializerOptions options, bool makeReadOnly) { global::System.Text.Json.Serialization.Metadata.JsonTypeInfo? jsonTypeInfo = null; global::System.Text.Json.Serialization.JsonConverter? customConverter; if (options.Converters.Count > 0 && (customConverter = GetRuntimeProvidedCustomConverter(options, typeof(global::MyEnum))) != null) { jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo(options, customConverter); } else { global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter(); global::System.Type typeToConvert = typeof(global::MyEnum); if (!converter.CanConvert(typeToConvert)) { throw new global::System.InvalidOperationException(string.Format("The converter '{0}' is not compatible with the type '{1}'.", converter.GetType(), typeToConvert)); } jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo (options, converter); } if (makeReadOnly) { jsonTypeInfo.MakeReadOnly(); } return jsonTypeInfo; } } ```

In particular, this bit:

global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter<global::MyEnum>();
global::System.Type typeToConvert = typeof(global::MyEnum);
if (!converter.CanConvert(typeToConvert))
{
    throw new global::System.InvalidOperationException(string.Format("The converter '{0}' is not compatible with the type '{1}'.", converter.GetType(), typeToConvert));
}

That extra check is done inline for every single type that has a custom converter, which means you also repeat the full codegen for that interpolation + throw expression. I'm thinking there could be 2 possible options to fix this, each with pros and cons.

Option 1

The generator could simply emit a single shared helper once if there is at least a type using a custom converter in the JSON context geing generated. Then, it'd just reuse that shared helper every time it's emitting code to create a custom type converter:

[MethodImpl(MethodImplOptions.NoInlining)]
private static void ValidateCustomConverter(JsonConverter converter, Type typeToConvert)
{
    if (!converter.CanConvert(typeToConvert))
    {
        throw new global::System.InvalidOperationException(string.Format("The converter '{0}' is not compatible with the type '{1}'.", converter.GetType(), typeToConvert));
    }
}

And then all other callsites would simply call it:

global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter<global::MyEnum>();
ValidateCustomConverter(converter, typeof(global::MyEnum));
jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo<global::MyEnum> (options, converter);

Option 2

If we can add the check to JsonMetadataServices.CreateValueInfo, this would be even simpler and likely use less size still:

global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter<global::MyEnum>();
jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo<global::MyEnum> (options, converter);

And then that CreateValueInfo call would just throw if the input converter doesn't support type T. Which I feel would make sense to test anyway in case it doesn't already? As in, why would you want a value info for a converter that's just invalid anyway?

As a bonus point, this would also simplify the generator a bit, since it wouldn't have to generate that check at all anymore.

I don't expect this to bring in massive size deltas, but still seems like an improvement that would make sense?

cc. @eiriktsarpalis @layomia @krwq

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis See info in area-owners.md if you want to be subscribed.

Issue Details
### Description > **Note**: related to #77897 and more generally #79122 I was looking at the generated code from System.Text.Json in a few scenarios and noticed another case where we could get some small but nice size saving improvements, in a way that's proportional to how many custom converters a given user has. Consider: ```csharp [JsonConverter(typeof(MyConverter))] public enum MyEnum { } ``` If this type is directly or indirectly discovered by the generator, you'll end up with the following:
Full generated code (click to expand):
```csharp // #nullable enable annotations #nullable disable warnings // Suppress warnings about [Obsolete] member usage in generated code. #pragma warning disable CS0618 public sealed partial class MyContext { private global::System.Text.Json.Serialization.Metadata.JsonTypeInfo? _MyEnum; /// /// Defines the source generated JSON serialization contract metadata for a given type. /// public global::System.Text.Json.Serialization.Metadata.JsonTypeInfo MyEnum { get => _MyEnum ??= Create_MyEnum(Options, makeReadOnly: true); } private global::System.Text.Json.Serialization.Metadata.JsonTypeInfo Create_MyEnum(global::System.Text.Json.JsonSerializerOptions options, bool makeReadOnly) { global::System.Text.Json.Serialization.Metadata.JsonTypeInfo? jsonTypeInfo = null; global::System.Text.Json.Serialization.JsonConverter? customConverter; if (options.Converters.Count > 0 && (customConverter = GetRuntimeProvidedCustomConverter(options, typeof(global::MyEnum))) != null) { jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo(options, customConverter); } else { global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter(); global::System.Type typeToConvert = typeof(global::MyEnum); if (!converter.CanConvert(typeToConvert)) { throw new global::System.InvalidOperationException(string.Format("The converter '{0}' is not compatible with the type '{1}'.", converter.GetType(), typeToConvert)); } jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo (options, converter); } if (makeReadOnly) { jsonTypeInfo.MakeReadOnly(); } return jsonTypeInfo; } } ```
In particular, this bit: ```csharp global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter(); global::System.Type typeToConvert = typeof(global::MyEnum); if (!converter.CanConvert(typeToConvert)) { throw new global::System.InvalidOperationException(string.Format("The converter '{0}' is not compatible with the type '{1}'.", converter.GetType(), typeToConvert)); } ``` That extra check is done inline for every single type that has a custom converter, which means you also repeat the full codegen for that interpolation + throw expression. I'm thinking there could be 2 possible options to fix this, each with pros and cons. ## Option 1 The generator could simply emit a single shared helper once if there is at least a type using a custom converter in the JSON context geing generated. Then, it'd just reuse that shared helper every time it's emitting code to create a custom type converter: ```csharp [MethodImpl(MethodImplOptions.NoInlining)] private static void ValidateCustomConverter(JsonConverter converter, Type typeToConvert) { if (!converter.CanConvert(typeToConvert)) { throw new global::System.InvalidOperationException(string.Format("The converter '{0}' is not compatible with the type '{1}'.", converter.GetType(), typeToConvert)); } } ``` And then all other callsites would simply call it: ```csharp global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter(); ValidateCustomConverter(converter, typeof(global::MyEnum)); jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo (options, converter); ``` ## Option 2 If we can add the check to `JsonMetadataServices.CreateValueInfo`, this would be even simpler and likely use less size still: ```csharp global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter(); jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo (options, converter); ``` And then that `CreateValueInfo` call would just throw if the input converter doesn't support type `T`. Which I feel would make sense to test anyway in case it doesn't already? As in, why would you want a value info for a converter that's just invalid anyway? I don't expect this to bring in massive size deltas, but still seems like an improvement that would make sense? cc. @eiriktsarpalis @layomia @krwq ### Reproduction Steps n/a ### Expected behavior n/a ### Actual behavior n/a ### Regression? _No response_ ### Known Workarounds _No response_ ### Configuration _No response_ ### Other information _No response_
Author: Sergio0694
Assignees: -
Labels: `area-System.Text.Json`, `untriaged`
Milestone: -
eiriktsarpalis commented 1 year ago

I definitely think we could avoid quite a bit of duplication in the converter resolution logic πŸ‘

ghost commented 1 year ago

Tagging subscribers to 'size-reduction': @eerhardt, @SamMonoRT, @marek-safar See info in area-owners.md if you want to be subscribed.

Issue Details
### Description > **Note**: related to #77897 and more generally #79122 I was looking at the generated code from System.Text.Json in a few scenarios and noticed another case where we could get some small but nice size saving improvements, in a way that's proportional to how many custom converters a given user has. Consider: ```csharp [JsonConverter(typeof(MyConverter))] public enum MyEnum { } ``` If this type is directly or indirectly discovered by the generator, you'll end up with the following:
Full generated code (click to expand):
```csharp // #nullable enable annotations #nullable disable warnings // Suppress warnings about [Obsolete] member usage in generated code. #pragma warning disable CS0618 public sealed partial class MyContext { private global::System.Text.Json.Serialization.Metadata.JsonTypeInfo? _MyEnum; /// /// Defines the source generated JSON serialization contract metadata for a given type. /// public global::System.Text.Json.Serialization.Metadata.JsonTypeInfo MyEnum { get => _MyEnum ??= Create_MyEnum(Options, makeReadOnly: true); } private global::System.Text.Json.Serialization.Metadata.JsonTypeInfo Create_MyEnum(global::System.Text.Json.JsonSerializerOptions options, bool makeReadOnly) { global::System.Text.Json.Serialization.Metadata.JsonTypeInfo? jsonTypeInfo = null; global::System.Text.Json.Serialization.JsonConverter? customConverter; if (options.Converters.Count > 0 && (customConverter = GetRuntimeProvidedCustomConverter(options, typeof(global::MyEnum))) != null) { jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo(options, customConverter); } else { global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter(); global::System.Type typeToConvert = typeof(global::MyEnum); if (!converter.CanConvert(typeToConvert)) { throw new global::System.InvalidOperationException(string.Format("The converter '{0}' is not compatible with the type '{1}'.", converter.GetType(), typeToConvert)); } jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo (options, converter); } if (makeReadOnly) { jsonTypeInfo.MakeReadOnly(); } return jsonTypeInfo; } } ```
In particular, this bit: ```csharp global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter(); global::System.Type typeToConvert = typeof(global::MyEnum); if (!converter.CanConvert(typeToConvert)) { throw new global::System.InvalidOperationException(string.Format("The converter '{0}' is not compatible with the type '{1}'.", converter.GetType(), typeToConvert)); } ``` That extra check is done inline for every single type that has a custom converter, which means you also repeat the full codegen for that interpolation + throw expression. I'm thinking there could be 2 possible options to fix this, each with pros and cons. ## Option 1 The generator could simply emit a single shared helper once if there is at least a type using a custom converter in the JSON context geing generated. Then, it'd just reuse that shared helper every time it's emitting code to create a custom type converter: ```csharp [MethodImpl(MethodImplOptions.NoInlining)] private static void ValidateCustomConverter(JsonConverter converter, Type typeToConvert) { if (!converter.CanConvert(typeToConvert)) { throw new global::System.InvalidOperationException(string.Format("The converter '{0}' is not compatible with the type '{1}'.", converter.GetType(), typeToConvert)); } } ``` And then all other callsites would simply call it: ```csharp global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter(); ValidateCustomConverter(converter, typeof(global::MyEnum)); jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo (options, converter); ``` ## Option 2 If we can add the check to `JsonMetadataServices.CreateValueInfo`, this would be even simpler and likely use less size still: ```csharp global::System.Text.Json.Serialization.JsonConverter converter = new global::MyConverter(); jsonTypeInfo = global::System.Text.Json.Serialization.Metadata.JsonMetadataServices.CreateValueInfo (options, converter); ``` And then that `CreateValueInfo` call would just throw if the input converter doesn't support type `T`. Which I feel would make sense to test anyway in case it doesn't already? As in, why would you want a value info for a converter that's just invalid anyway? As a bonus point, this would also simplify the generator a bit, since it wouldn't have to generate that check at all anymore. I don't expect this to bring in massive size deltas, but still seems like an improvement that would make sense? cc. @eiriktsarpalis @layomia @krwq
Author: Sergio0694
Assignees: -
Labels: `area-System.Text.Json`, `size-reduction`
Milestone: -
Sergio0694 commented 1 year ago

@eiriktsarpalis leaving some binary size numbers here on the Microsoft Store in case they're useful πŸ™‚

System.Text.Json version Package size x64 binary size
8.0.0-preview.5.23263.4 65.621 KB 46.926 KB
8.0.0-preview.5.23275.6 65.577 KB 46.933 KB

Deltas (mostly because of https://github.com/dotnet/runtime/pull/86526, I assume):

Not entirely sure why the x64 (and Arm64/x86) binaries seem very marginally larger, but the overall package has had a slight improvement anyway, so looks good! Those generator changes were very minimal anyway so no major deltas were expected, but figured I might share numbers just in case. While I'm at it, I can also share the current numbers compared to 7.0.0.

System.Text.Json version Package size x64 binary size
7.0.0 67.275 KB 49.671 KB
8.0.0-preview.5.23275.6 65.577 KB 46.933 KB

Deltas:

Looking great!! πŸš€

jeffhandley commented 1 year ago

Thanks for sharing those results, @Sergio0694!

eiriktsarpalis commented 1 year ago

Thanks for sharing. The diff around #86526 seems reasonable, however I'm surprised that the difference with .NET 7 is so small. When measuring size in aspnet's BasicMinimalAPI I'm seeing reductions in the order of 80%. Would you be able to share size numbers of your classes when publishing to Native AOT?

Sergio0694 commented 1 year ago

"I'm surprised that the difference with .NET 7 is so small"

Just to clarify (for others that might be reading too), all those numbers I just shared are with .NET Native, not NativeAOT. And yeah sure thing, I'll run the same tests on that minimal repro with the same JSON models (the one we also used to investigate #84922) with NativeAOT and share numbers with that as well using all packages I also used here πŸ™‚

Sergio0694 commented 1 year ago

Tested again, numbers look pretty good! Note: currently testing with our contracts package which is just .NET Standard 2.0, and with everything left to default values. Basically the same setup we're using on UWP, except that the test project here was .NET 7/8 instead. For the .NET 8 version, I've also set JsonSerializerIsReflectionEnabledByDefault to false.

Runtime System.Text.Json version x64 binary size
.NET 7 7.0.2 22.816 KB
.NET 7 8.0.0-preview.6.23279.6 n/a
.NET 8 8.0.100-preview.4.23260.5 9.800 KB
.NET 8 8.0.0-preview.6.23279.6 9.747 KB

Deltas:

Looking really really good on .NET 8! πŸš€

Would you like me to open a separate issue to track this, in case it's useful? I could periodically share numbers with our setup. Also, would it maybe be useful to add this project to the JSON testing suite for size regressions and whatnot? Is this something you might be interested in, and would it be doable even though we can't make our source public? πŸ€”


Warning: building with .NET 7 and 8.0.0-preview.6.23279.6 produces a whole lot of ILC errors about missing methods. The resulting binary was way too small (around 7 MB), I think it was likely invalid due to the linker nuking methods it thought would just always throw anyway due to the various compile errors. Is this a separate issue we should track?

ILC: Method '[MicrosoftStore.DataContracts]MicrosoftStore.DataContracts.MicrosoftStoreJsonSerializerContext.GetTypeInfo(Type)' wi
ll always throw because: Missing method 'Boolean System.Text.Json.JsonSerializerOptions.TryGetTypeInfo(System.Type, System.Text.Json.Serialization.Metadata.JsonTypeInfo ByRef)'
ILC: Method '[MicrosoftStore.DataContracts]MicrosoftStore.DataContracts.MicrosoftStoreJsonSerializerContext.Create_SomeModel(JsonSerializerOptions)' will always throw because: Missing method 'Void System.Text.Json.Serialization.Metadata.JsonTypeInfo.set_OriginatingResolver(System.Text.Json.Serialization.Metadata.IJsonTypeInfoResolver)'
ILC: Method '[MicrosoftStore.DataContracts]MicrosoftStore.DataContracts.MicrosoftStoreJsonSerializerContext.Create_SomeOtherModel(JsonSerializerOptions)' will always throw because: Missing method 'Void System.Text.Json.Serialization.Metadata.JsonTypeInfo.set_OriginatingResolver(System.Text.Json.Serialization.Metadata.IJsonTypeInfoResolver)'
...
eiriktsarpalis commented 1 year ago

building with .NET 7 and 8.0.0-preview.6.23279.6 produces a whole lot of ILC errors about missing methods.

These are all methods added in .NET 8, I suspect that even though you were using the v8 STJ package at build time, the linker somehow ended up with a v7 System.Text.Json run time dll. It could be an issue with your project configuration or a bug in the linker infrastructure.