dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.1k stars 4.7k forks source link

8.4 MB size regression updating System.Text.Json to 8.0.0 #84922

Closed Sergio0694 closed 1 year ago

Sergio0694 commented 1 year ago

Description

Now that #84895 has been fixed, I was able to start dogfooding System.Text.Json 8.0.0 builds in the Microsoft Store, to gather some numbers so we can more easily track the binary size difference with this new release. This is where I've noticed an problem: my understanding is that the new 8.0.0 release is expected to include several binary size improvements, which should make the update result in a smaller binary size footprint (see #79122), but I'm actually seeing a regression, and not even a trivial one πŸ₯²

Configuration

System.Text.Json version Microsoft Store package size
7.0.2 66.410 KB
8.0.0-preview.4.23216.5 70.015 KB

Delta: +3.605 KB (+5.4%). The real delta for the x64 binary is actually ~8.4 MB (ie. ~17%), see details below.

Regression?

Yes. The binary size is actually, surprisingly, lower on 7.0.2.

cc. @eiriktsarpalis @layomia

I assume there's definitely something wrong going on here? Ie. my assumption would be that updating to 8.0.0 should definitely result in a net decrease of binary size, no? πŸ€”

Happy to help test any preview bits necessary and/or to provide more info (internally if needed, as the project is not public).

jkotas commented 1 year ago

update result in a smaller binary size footprint (see https://github.com/dotnet/runtime/issues/79122),

These AOT binary size footprint improvements assume that the reflection-based fallback is trimmed. Is the reflection-based fallback trimmed in your setup?

Sergio0694 commented 1 year ago

"Is the reflection-based fallback trimmed in your setup?"

We have trimming enabled for the whole app, with just these 4 directives for System.Text.Json:

<Type Name="System.Text.Json.Serialization.Converters.ObjectDefaultConverter`1" Activate="Required Public" />
<Type Name="System.Text.Json.Serialization.Converters.DictionaryOfTKeyTValueConverter`3" Activate="Required Public" />
<Type Name="System.Text.Json.Serialization.Converters.ListOfTConverter`2" Activate="Required Public" />
<Type Name="MicrosoftStore.SomeNamespace.SomeOtherCustomConverterType" Activate="Required Public" />

Which are needed to work around #78029. These have been in place since we started migrating to System.Text.Json though, so they're not new. I guess I could also try removing these just as a sanity check, but I'd really be surprised if these had anything to do with the issue here. What I find confusing is that the .NET Native linker is generally quite aggressive with trimming, so such a big increase is unexpected. It seems 8.0 is somehow causing a whole lot more (like, a lot more) stuff to get rooted πŸ€”

I've kicked off a couple builds with <DebugType>full</DebugType> to see if I can open them with SizeBench`, in case that helps. I could really use another pair of eyes here though, I'm not exactly sure what to look for πŸ˜…

EDIT: if we don't come up with any other ideas and that SizeBench tool doesn't help, one option could also be for me to try cherry-picking #84899 into the various release branches for the .NET 8 Previews, build them and then check the size with each of them? At least that would help us narrow down when exactly things went wrong?

jkotas commented 1 year ago

Is the code under https://github.com/dotnet/runtime/search?q=IsReflectionEnabledByDefault getting trimmed in your app?

Sergio0694 commented 1 year ago

Mmh pretty sure .NET Native doesn't handle AppContext switches at link time, so yeah I'd assume those paths aren't trimmed (@MichalStrehovsky can confirm, we also talked about this briefly in #83279). I'm confused as to why this would regress the size though, in 7.0.0 wouldn't the code be the equivalent of reflection always being enabled by default no matter what? In other words, I could understand .NET Native not trimming as much as it could in .NET 8, but why would the size regress compared to 7.0.0, shouldn't at worst just remain the same? πŸ€”

jkotas commented 1 year ago

Size of reflection engine is likely to grow over time as more features get added to it. I may explain at least some of the size growth that you are seeing.

SizeBench tool

Yes, it would be useful to collect the data on where the growth is coming from.

Sergio0694 commented 1 year ago

I'm struggling to get SizeBench working, do you know if it actually supports .NET Native binaries? πŸ€” I tried building with <DebugType>full</DebugType> full and then I grabbed the .dll and .pdb from \Release\x64\WinStore.App\ilc\, but SizeBench refuses to open them:

Would you by any chance know if there's a way to specify this mode from an MSBuild property or something? If not, Anton suggested building locally with /KeepIntermediates, then changing the link.exe call in the .rsp file to add that link option and then rebuilding with ilc.exe directly, but I was hoping to just have a way to configure this at the project level so I could just run a couple of pipelines to produce the artifacts as usual? πŸ₯²

jkotas commented 1 year ago

IIRC, x64 binaries are not produced by link.exe in UWP toolchain. They are produced by mdil binder that is apparently missing some of the details that SizeBench expects. You can try building and comparing arm64 binaries. Arm64 is produced by link.exe and so you may have better luck with them.

Sergio0694 commented 1 year ago

Ah, right, I forgot the Arm64 linker was different than the x64 one πŸ˜„ I tried it and I did get past that initial error, but then unfortunately SizeBench still failed, this time with an exception

Stack trace (click to expand):
``` Exception information: System.InvalidOperationException: New xdata handler type found - no parser available yet. Exception Handler: __Managed_Main (RVA:0x0) Function with that handler: .DllGetActivationFactory (RVA:0xE8004) HRESULT: 0x80131509 at SizeBench.AnalysisEngine.PE.EHSymbolParser.ParseOneExceptionHandler(Symbol targetSymbol, UInt32 rfStartRva, UInt32 rfUnwindInfoRva, Byte* pExceptionHandlerRva, UInt32 exceptionHandlerRva, Byte* unwindInfoStart) at SizeBench.AnalysisEngine.PE.EHSymbolParser.ParseOneExceptionHandler(Symbol targetSymbol, UInt32 rfStartRva, UInt32 rfUnwindInfoRva, Byte* pExceptionHandlerRva, UInt32 exceptionHandlerRva, Byte* unwindInfoStart) at SizeBench.AnalysisEngine.PE.ARM_EHParser.ParseOneXData(Symbol targetSymbol, UInt32 functionStartRva, UInt32 ehMetadata) at SizeBench.AnalysisEngine.PE.ARM_EHParser.ParseXDataForArchitecture(UInt32 sectionAlignment, RVARange XDataRVARange, SessionDataCache cache) at SizeBench.AnalysisEngine.PE.EHSymbolParser.Parse(UInt32 sectionAlignment, RVARange XDataRVARange, SessionDataCache cache, ILogger logger) at SizeBench.AnalysisEngine.PE.EHSymbolTable.Parse(Byte* libraryBaseAddress, UInt32 sectionAlignment, SessionDataCache dataCache, IDIAAdapter diaAdapter, MachineType machine, RVARange XDataRVARange, ILogger logger) at SizeBench.AnalysisEngine.PE.PEFile.ParseEHSymbols(Session session, IDIAAdapter diaAdapter, RVARange XDataRVARange, ILogger logger) at SizeBench.AnalysisEngine.DIAInterop.DIAAdapter.Initialize(PEFile peFile, ILogger logger) at SizeBench.AnalysisEngine.Session.InitializeDIAThread() at System.Threading.Tasks.Task.InnerInvoke() at System.Threading.Tasks.Task.<>c.<.cctor>b__272_0(Object obj) at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) --- End of stack trace from previous location --- at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread) --- End of stack trace from previous location --- at SizeBench.AnalysisEngine.Session.Create(String binaryPath, String pdbPath, ILogger sessionLogger) at SizeBench.AnalysisEngine.Session.Create(String binaryPath, String pdbPath, ILogger sessionLogger) at SizeBench.AnalysisEngine.DiffSession.Create(String beforeBinaryPath, String beforePdbPath, String afterBinaryPath, String afterPdbPath, ILogger sessionLogger) at SizeBench.GUI.SessionFactory.CreateDiffSession(String beforeBinaryPath, String beforePdbPath, String afterBinaryPath, String afterPdbPath, ILogger logger) at SizeBench.GUI.ViewModels.MainWindowViewModel.<>c__DisplayClass53_0.<b__0>d.MoveNext() ```

I take it SizeBench doesn't really support .NET Native binaries after all.

Anton mentioned there might be some ilc.exe options to dump the dependenty graph, I'll see if I can find someone that remembers how to use that. Maybe @MichalStrehovsky? Otherwise I can also ask Tom once he's back πŸ™‚

MichalStrehovsky commented 1 year ago

I responded on Teams but I looked at the exception you're hitting now and have a separate idea.

SizeBench works fine with NativeAOT and it's not that different. SizeBench is open source. If you're determined enough, you could probably build it from source and avoid this exception:

https://github.com/microsoft/SizeBench/blob/91c1cfe66f2318a842363a4a2e65aa12f3036caa/src/SizeBench.AnalysisEngine/PE/EHSymbolParser.cs#L326-L328

By replacing it with return. The method does a return for MASM or clang a couple lines above, so it should be fine to just return from here and still get usable results.

Sergio0694 commented 1 year ago

I've managed to build and run SizeBench locally with that fix, thank you! πŸ˜„ I'm seeing this:

image

Also loading the symbols diff for that, but it's been loading for over 3 hours now and still 80% through πŸ˜† Will look into that too as soon as it's finished.

EDIT: it finished! And I managed to export all diff table to excel: here's a direct link (need a MSFT account to open this). @jkotas @MichalStrehovsky would you be able to take a look and see if anything in particular jumps to your eyes here, whenever you have a spare minute? You're definitely way more familiar with this kind of data than I am πŸ˜…

Looking at this I'm not really sure I understand this. The top diff is like 11KB and then there's a bunch of 2KB diffs. I don't see how these would add up to 3.3 MB at all (which is the size diff for the uncompressed Arm63 binary, see table below).

image

EDIT2: thanks to @tannergooding for pointing this out, it seems the deltas in the diff are just "inverted", ie. with STJ7 being the baseline, so the size increase is actually a negative value in the spreadsheet. If you sort the diffs by lower to higher, and sum all the negative diffs, you get ~4.5MB of size regression. If you then add that ~1.2MB improvement, you end up with that ~3.4 MB regression for the Arm64 binary, which seems to add up.

Note: the diff for that SizeBench report was calculated from a build I did yesterday between 7.0.2 and 8.0.0-preview.4.23216.5, it's not using the same 8.0.0-preview1 custom build as the other data below in this message.


In the meantime, I've also started to gather some data points by cherry-picking Jan's fix in #84899 into the various 8.0.0 Preview 1, 2, 3 branches, and kicking off new pipelines to try to narrow down the range of changes that introduced this regression. So far the first two pipelines have finished, and it seems the regression is already there since Preview 1. Specifically:

System.Text.Json version Microsoft Store package size
7.0.2 66.409 KB
8.0.0-preview1 70.369 KB

This is actually ~350 KB worse than with 8.0.0-preview.4.23216.5, I guess some of the new changes there (such as #84411) did improve things a little bit at least. It would seem the culprit then is somewhere in https://github.com/dotnet/runtime/compare/v8.0.0-preview.1.23110.8...v7.0.2. It seems there's a total of 461 commits, though most are not about System.Text.Json changes, so with some luck it should be possible to narrow this down to some specific change that look suspicious? I can then build from there and calculate new diffs to validate.

EDIT: actually I'm not 100% sure that commits delta is right, some things don't really add up πŸ€”

For additional context, here's the size diff for the uncompressed binaries extracted from the package:

System.Text.Json version CPU arch Microsoft Store package size
7.0.2 x64 48.633 KB
8.0.0-preview1 x64 57.080 KB
7.0.2 arm64 53.137 KB
8.0.0-preview1 arm64 56.852 KB

As expected, the difference is even more noticeable here. The x64 binary in particular is 17.4% bigger. To clarify this is exactly the same version of the app, the only difference is bumping the System.Text.Json dependency.

Does anyone have any ideas, maybe @eiriktsarpalis?

Also could we add the partner-impact tag to the issue? A 3.4 MB regression for no apparent reason and with no gain is kinda hard to justify (ie. it's not like when we moved to System.Text.Json, where we could say the increase in binary size at least gave us less memory use, faster performance and made the code trim-friendly), and the Store package is directly part of the Windows ISO, so trying to minimize size is important for us (which is why I'm dedicating all this time to tracking things) πŸ™‚

Sergio0694 commented 1 year ago

Sharing more details as I keep investigating this. Will update this comment with more data as I go πŸ˜„

Looking at the SizeBench diffs (link here), there seem to be these main offenders showing up in the top results:

I'm still not entirely sure how to check diffs, as the commits range seems to be broken in the runtime (@tannergooding mentioned it's possibly due to the repo having more than one tree). Additionally, some of the types I'm seeing in the diff (such as ReflectionJsonTypeInfo<T> don't appear to exist at all anywhere (can't find them on source.dot.com and I can't find them by opening the repo from main with VS Code either, which is a bit odd). Regardless, these PRs seem to be potentially related:

Which are not showing up in that commits range though, so not entirely sure what's up with the git history here (as in, those PRs are from July/August 2023, so they should be there..?). It just seems some change is now causing a whole lot of additional (and useless) reflection stuff to be rooted even though you're only ever using the source generators, such in our case.

cc. @eiriktsarpalis @krwq as you've both worked on those various PRs πŸ™‚

EDIT: actually these might be fine, as the issue was introduced after November 23rd, 2023, see below.

EDIT 2: this one from November 25th is suspect, looking into it.

EDIT 3: well it seems that's not it, I built 22575 right from that commit and the binary size is fine (see table below).


I'm still investigating and running pipelines Γ  la git bisect to try to narrow down what PR exactly introduced the issue. I'm hoping once we figure that out, we might come up with a way to shuffle the code a bit to stop rooting stuff downlevel.

Here's what I have so far:

System.Text.Json version Package size x64 binary size Commit
7.0.2 66.409 KB 48.633 KB Public feed
8.0.0-alpha.1.22451.2 66.438 KB 48.603 KB Nightly feed
8.0.0-alpha.1.22570.1 66.366 KB 48.561 KB Nightly feed
8.0.0-alpha.22575 66.389 KB 48.580 KB https://github.com/Sergio0694/runtime/commit/58f59db3f3be886d6c98c9ca6b4ab5fd8b8edda7
8.0.0-alpha.22605.1 66.382 KB 48.579 KB https://github.com/Sergio0694/runtime/commit/d47a24c1c241cfe54247b452bb65aaa26fbc906d
8.0.0-alpha.22605.2 66.386 KB 48.579 KB https://github.com/Sergio0694/runtime/commit/a59eae58dad85ca1f2dd130c5fc97a43b2b0459d
8.0.0-alpha.22605.3 67.846 KB 50.756 KB https://github.com/Sergio0694/runtime/commit/3559d339d4989039be8c3cb37e115bdad2f76c93
8.0.0-alpha.22610 67.841 KB 50.756 KB https://github.com/Sergio0694/runtime/commit/cc3f5686f6b849ef8cb0aaca771f5933b5cc9e70
8.0.0-alpha.22612.1 67.843 KB 50.756 KB https://github.com/Sergio0694/runtime/commit/57a48c6b92d6f858661c53aed84c2a53778f58ed
8.0.0-alpha.22612.2 72.863 KB 60.861 KB https://github.com/Sergio0694/runtime/commit/853e7e3662a3176a04069ae09d15b65dade5d706
8.0.0-alpha.22615 72.866 KB 60.862 KB https://github.com/Sergio0694/runtime/commit/a46dd510e263b62fa26a553fed05569a8372037a
8.0.0-alpha.23053 72.867 KB 60.862 KB https://github.com/Sergio0694/runtime/commit/5f8fc6748c7cfaf6a4a7fa4461914d4831def812
8.0.0-preview1 70.369 KB 57.080 KB https://github.com/dotnet/runtime/commit/7deac7d6da3ce4e96d51d5f61a27ee85b7c42f60
8.0.0-preview2 70.027 KB 56.719 KB https://github.com/dotnet/runtime/commit/30b879924a47d8660d5f4b14ff581ce40ae076f2
8.0.0-preview3 70.055 KB 56.730 KB https://github.com/dotnet/runtime/commit/47bad717bd69883ec8f590ffbbfad63c868cbe55
8.0.0-preview.4.23218.1 70.010 KB 56.679 KB Nightly feed

Note: credits to @tannergooding for finding the formula for the nightly builds, you can reconstruct the exact date a commit is from by doing: yy * 1000 + 50 * mm + dd. So eg. 8.0.0-alpha.1.22570.1 is November 20th, 2022.

It seems the issue is somewhere between 8.0.0-alpha.1.22570.1 and 8.0.0-preview1. Unfortunately that range is after #78741, so I'll have to keep cherry-picking #84899 manually and build from source, so it'll take a bit longer to gather more results. But, working on it. This does seem very promising so far, we're narrowing this down a lot πŸŽ‰

EDIT: 8.0.0-alpha.23053 (built from https://github.com/Sergio0694/runtime/commit/5f8fc6748c7cfaf6a4a7fa4461914d4831def812) is very interesting. The binary size is actually even worse here than it is in Preview 1. It seems there's been some change that severely regressed the size between 22575 (built from https://github.com/Sergio0694/runtime/commit/58f59db3f3be886d6c98c9ca6b4ab5fd8b8edda7) and 23053, which then improved again by the time 8.0.0 Preview 1 came out, but never really managed to get back to the previous baseline. Definitely seems like we're narrowing down on what is causing this though.

EDIT 2: it seems the break is between 8.0.0-alpha.22605 (built from https://github.com/Sergio0694/runtime/commit/d47a24c1c241cfe54247b452bb65aaa26fbc906d) and 8.0.0-alpha.22615 (built from https://github.com/Sergio0694/runtime/commit/a46dd510e263b62fa26a553fed05569a8372037a). These two seem suspect and I'm inclined to say one of the two is likely the culprit here:

Sergio0694 commented 1 year ago

Well, this was not painful at all, but we did narrow down exactly which PRs regressed the size! πŸŽ‰πŸŽ‰πŸŽ‰

See table in the previous message for full breakdown and deltas between a whole lot of commits. Specifically:

System.Text.Json version Package size x64 binary size Commit
7.0.2 66.409 KB 48.633 KB Public feed
8.0.0-alpha.22605.3 67.846 KB 50.756 KB https://github.com/Sergio0694/runtime/commit/3559d339d4989039be8c3cb37e115bdad2f76c93
8.0.0-alpha.22612.2 72.863 KB 60.861 KB https://github.com/Sergio0694/runtime/commit/853e7e3662a3176a04069ae09d15b65dade5d706

That is, these two PRs seem to be causing the regressions:

@jkotas @MichalStrehovsky do you see anything in particular that jumps to your eyes and might explain the very noticeable worse binary size in the changes for those two PRs? Especially in the second one. I will also try to spend some time tomorrow going through the changes in those two PRs in particular (haven't had time to do that just yet).

Still, this seems like a very good first step trying to figure out what's going on and hopefully how to fix it πŸ˜„

ghost commented 1 year ago

Tagging subscribers to 'linkable-framework': @eerhardt, @vitek-karas, @LakshanF, @sbomer, @joperezr, @marek-safar See info in area-owners.md if you want to be subscribed.

Issue Details
### Description Now that #84895 has been fixed, I was able to start dogfooding System.Text.Json 8.0.0 builds in the Microsoft Store, to gather some numbers so we can more easily track the binary size difference with this new release. This is where I've noticed an problem: my understanding is that the new 8.0.0 release is expected to include several binary size improvements, which should make the update result in a _smaller_ binary size footprint (see #79122), but I'm actually seeing a regression, and not even a trivial one πŸ₯² ### Configuration | System.Text.Json version | Microsoft Store package size | | --- | --- | | 7.0.2 | 66.410 KB | | 8.0.0-preview.4.23216.5 | **70.015 KB** | Delta: **+3.605 KB** (**+5.4%**). The real delta for the x64 binary is actually ~8.4 MB (ie. ~17%), see details below. ### Regression? Yes. The binary size is actually, surprisingly, lower on 7.0.2. cc. @eiriktsarpalis @layomia I assume there's definitely something wrong going on here? Ie. my assumption would be that updating to 8.0.0 should definitely result in a net _decrease_ of binary size, no? πŸ€” Happy to help test any preview bits necessary and/or to provide more info (internally if needed, as the project is not public).
Author: Sergio0694
Assignees: -
Labels: `area-System.Text.Json`, `tenet-performance`, `linkable-framework`, `partner-impact`
Milestone: 8.0.0
jkotas commented 1 year ago

@jkotas @MichalStrehovsky do you see anything in particular that jumps to your eyes

I do not see anything obvious.

eerhardt commented 1 year ago

Can https://github.com/MichalStrehovsky/sizoscope be used to investigate here? There is a diff'ing capability in it.

Sergio0694 commented 1 year ago

Unfortunately that's only for NativeAOT and doesn't support .NET Native, which is what the Microsoft Store uses πŸ₯² FWIW, I did include a full diff using SizeBench though, in case it helps. You can find it here.

As suggested by @eiriktsarpalis though, I can try seeing if I can create a standalone repro for this. If I can keep all code for the repro in just a .NET Standard 2.0 lib, we could then compile that both with .NET Native and NativeAOT. Assuming the regression also repros on NativeAOT, we'd then be able to use sizoscope there. If not, well, worth a try, and it might still provide us with more useful info to further investigate this. If anything else, it'd make testing things much faster than building the Store πŸ˜„

Sergio0694 commented 1 year ago

I have some new results with the latest nightly, as @eiriktsarpalis asked to re-run SizeBench with it as well.

System.Text.Json version Package size x64 binary size
7.0.2 66.407 KB 48.619 KB
8.0.0-preview.4.23218.16 69.973 KB 56.664 KB

Deltas:

Full SizeBench diff: click here (MSFT internal only).


Also trying to get some trimming/reflection logs from ILC in the meantime.

eiriktsarpalis commented 1 year ago

I'm seeing the JsonTypeInfo<T>.SerializeAsync async method popping up a lot in the spreadsheet which was in fact introduced in #78646. This is an instance method replacing the equivalent static JsonSerializer.WriteStreamAsync<T> method. Would the fact that a static async method has been converted to an instance method have any impact on trimmability?

Sergio0694 commented 1 year ago

Alright I have lots more (hopefully) useful info πŸ˜„ I managed to create a minimal, self-contained repro. I can basically replicate with either:

Same if I use any of the Serialize[Async] methods. Basically as soon as you do any JSON serialization, this reproduces. In my minimal repro, the delta between 7.0.2 and 8..0.0 is 45% (3.92 MB ---> 5.69 MB). Adding types brings it up even more.


This is all that's needed to reproduce this, in a blank UWP app with a System.Text.Json dependency:

Repro code (click to expand):
```csharp public sealed partial class MainPage : Page { public MainPage() { this.InitializeComponent(); ProcessData(JsonSerializer.Serialize(new MyModel(), MyJsonSerializerContext.Default.Options)); } [MethodImpl(MethodImplOptions.NoInlining)] private static void ProcessData(object jsonModel) { Console.WriteLine(jsonModel.ToString()); Trace.WriteLine(jsonModel.ToString()); } } public sealed class MyModel { public string Foo { get; set; } public int Bar { get; set; } } [JsonSerializable(typeof(MyModel))] [JsonSourceGenerationOptions(GenerationMode = JsonSourceGenerationMode.Metadata)] public sealed partial class MyJsonSerializerContext : JsonSerializerContext { } ```

Here's the full source code for the repro project: JsonBinarySizeRepro.zip.

Note: for MSFT only, here's a link to a .zip with all intermediate ILC files both with 7.0.2 and 8.0.0. For everyone else, you can get that same info on your machine by just opening that project, building in Release x64, saving the intermediate output, then closing, changing the System.Text.Json dependency to 7.0.2, and rebuilding.

A few notable things I've noticed looking at the ILTransformed files, check out these diffs:

I'm hoping this minimal repro and also having all the ILC logs might allow us to figure out what's exactly rooting stuff. I might be wrong but it seems like there's something in 8.0.0 that's causing the trimmer to assume that a whole lot of System.Text.Json stuff will be accessed through reflection, which then causes it to preserve a ton more stuff compared to 7.0.2, along with all metadata and also compiled code for all those bits? But like... Why? πŸ₯²

@jkotas @MichalStrehovsky whenever you have some time, would you be able to take a look at those intermediate files from the ILC dumps to see if you notice anything that makes sense to you to explain what's going on? It's all in that .zip link, and I've also linked the CreateMetadataInput.log and reflectionlog.csv files from both versions directly in the list above for convenience.

Thank you!

jkotas commented 1 year ago

Why would a MCG interop file be any different across System.Text.Json versions?

MCG scans for everything that may need WinRT marshalling and generates the marshaller for everything that it can find. For example, implementing IEnumerable<Something> somewhere can make MCG think that IEnumerable<Something> may be passed to WinRT API and generate a ton more marshalling code - even when IEnumerable<Something> is never actually passed to any WinRT API, but it is typically impossible for MCG prove that.

jkotas commented 1 year ago

I am not able to tell from these logs where things are rooted. However, there seems to be infinite generic recursion that involves collection types. The logs have types like:

System.Text.Json.Serialization.Metadata.JsonPropertyInfo<System.Collections.Generic.Queue<System.Collections.Generic.Queue<System.Collections.Generic.Queue<System.Collections.Generic.Queue<System.Collections.Generic.Queue<System.Collections.Generic.Queue<System.Collections.Generic.Queue<System.Collections.Generic.Queue<System.Collections.IList>>>>>>>>>

eiriktsarpalis commented 1 year ago

there seems to be infinite generic recursion

Most likely related to this method:

https://github.com/dotnet/runtime/blob/614d68307c5b41d1d77a081d747c385b03ac8663/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Metadata/JsonTypeInfoOfT.ReadHelper.cs#L82-L141

However this would only be rooted by a call to one of the JsonSerializer.DeserializeAsyncEnumerable methods.

jkotas commented 1 year ago

Here is a small repro with regular native AOT with .NET 8 P3:

using System.Text.Json;
using System.Text.Json.Serialization.Metadata;

JsonSerializer.DeserializeAsyncEnumerable<Queue<string>>(new MemoryStream(), (JsonTypeInfo<Queue<string>>)null!);

.csproj:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <PublishAot>true</PublishAot>
    <TrimmerSingleWarn>false</TrimmerSingleWarn>
  </PropertyGroup>

</Project>

Publishing with dotnet publish -c Release will produce warnings like: ILC : AOT analysis warning IL3054: [System.Text.Json]System.Text.Json.Serialization.Metadata.JsonTypeInfo<System.Collections.Generic.Queue<System.Collections.Generic.Queue<System.Collections.Generic.Queue<System.Collections.Generic.Queue<string>>>>>: Generic expansion to 'System.Text.Json.Serialization.Metadata.JsonTypeInfo<Queue<Queue<Queue<Queue<Queue<String>>>>>>' was aborted due to generic recursion. An exception will be thrown at runtime if this codepath is ever reached. Generic recursion also negatively affects compilation speed and the size of the compilation output. It is advisable to remove the source of the generic recursion by restructuring the program around the source of recursion. The source of generic recursion might include: 'System.Text.Json.Serialization.Metadata.JsonTypeInfo', 'System.Text.Json.Serialization.JsonConverter', 'System.Text.Json.Serialization.JsonCollectionConverter', 'System.Text.Json.Serialization.JsonResumableConverter', 'System.Text.Json.Serialization.JsonDictionaryConverter', 'System.Text.Json.Serialization.JsonConverter.CreateCastingConverter<TTarget>()', 'System.Text.Json.Serialization.JsonConverterFactory.CreateCastingConverter<TTarget>()', 'System.Text.Json.Serialization.JsonConverter.CreateCastingConverter<TTarget>()', 'System.Text.Json.Serialization.Converters.CastingConverter', 'System.Text.Json.Serialization.Converters.JsonMetadataServicesConverter', 'System.Text.Json.Serialization.Metadata.DefaultJsonTypeInfoResolver.DeterminePropertyAccessors<T>(JsonPropertyInfo<T>,MemberInfo)', 'System.Text.Json.Serialization.Metadata.JsonPropertyInfo', 'System.Text.Json.Serialization.Metadata.JsonTypeInfo.<DeserializeAsync>d__1', 'System.Text.Json.Serialization.Converters.QueueOfTConverter', 'System.Text.Json.Serialization.Metadata.JsonTypeInfo.<DeserializeAsObjectAsync>d__6', 'System.Text.Json.Serialization.Metadata.JsonTypeInfo.<SerializeAsync>d__11', 'System.Text.Json.Serialization.Metadata.JsonTypeInfo.<<DeserializeAsyncEnumerable>g__CreateAsyncEnumerableDeserializer|4_0>d', 'System.Text.Json.Serialization.Metadata.JsonParameterInfo', 'System.Text.Json.Serialization.Converters.IEnumerableDefaultConverter'

Sergio0694 commented 1 year ago

@eiriktsarpalis I'm wondering whether that instance field JsonTypeInfo<Queue<T>> in all type infos might be triggering the problem. Especially because eg. MCG runs in a different part of the toolchain, before trimming, so it's possible that part is generating all that garbage (the extra WinRT interop code) that then becomes implicitly rooted, if that makes sense πŸ˜…

It's interesting though that NativeAOT also has some kind of issues in that area, though it seems in order to trigger them you need to explicitly use that Queue<T> type yourself. Whereas on .NET Native just doing any (de)serialization is enough.

Would it be possible to eg. change that instance field to just be a non-generic version, and then just directly cast from within that method when needed? Ie. just have the field be a JsonTypeInfo, and do a direct cast only at the use site πŸ€”

If it helps, I guess I can also try wiggling that code, then build from source and try out that package with the repro project.

jkotas commented 1 year ago

It's interesting though that NativeAOT also has some kind of issues in that area, though it seems in order to trigger them you need to explicitly use that Queue

You do not have to use Queue explicitly. It repros with any reference type. For example, this will produce the warning with native AOT as well:

using System.Text.Json;
using System.Text.Json.Serialization.Metadata;

JsonSerializer.DeserializeAsyncEnumerable<List<string>>(new MemoryStream(), (JsonTypeInfo<List<string>>)null!);
Sergio0694 commented 1 year ago

This definitely seems like it could explain the massive binary size increase on .NET Native. Looking at the MCG-generated files (Data.g.cs, ImplTypes.g.cs), there's like... Hundreds of pre-generated CCWs (for projected enumerators) and WinRT marshalling stubs created for all those combinations of nested enumerable generics, woah 😳

image

jkotas commented 1 year ago

@eiriktsarpalis Could you please look into this? It is complicated generic cycle that is caused by special-casing of Queue<T>. I am not sure about all reasons that lead us to build this special-casing. The fixes for these sort of cycles typically involve introducing a bit of code duplication to break the cycle (e.g. introducing a small InternalQueue type that is used internally by System.Text.Json).

Sergio0694 commented 1 year ago

Curious - now that we likely have identified the cause of the regression, do the changes in https://github.com/dotnet/runtime/pull/78646 and https://github.com/dotnet/runtime/pull/79397 make more sense as to how they managed to trigger the issue while looking just fine on their own? Or is it just more of a coincidence that they just happened to exacerbate the underlying issue of that generic recursion that was already there from previous PRs? πŸ€”

jkotas commented 1 year ago

do the changes in https://github.com/dotnet/runtime/pull/78646 and https://github.com/dotnet/runtime/pull/79397 make more sense as to how they managed to trigger the issue while looking just fine on their own?

Yes, #78646 introduced the generic cycle. #79397 added more code that gets multiplied by the generic cycle before the compiler decided to cut it off and print warning.

eiriktsarpalis commented 1 year ago

FWIW the dependency on Queue<T> for IAsyncEnumerable deserialization dates from .NET 6. What I think has happened here is that https://github.com/dotnet/runtime/pull/78646 moved the helper from a JsonSerializer static method to a JsonTypeInfo instance method and https://github.com/dotnet/runtime/pull/79397 added weakly typed JsonSerializer.DeserializeAsyncEnumerable overloads which presumably results in the instance method being rooted for every JsonTypeInfo<T> specialization, triggering the recursive condition.

Sergio0694 commented 1 year ago

Results in the Store with a new build from https://github.com/dotnet/runtime/pull/85176 (https://github.com/dotnet/runtime/pull/85176/commits/f10bc02066a29fb225b51c4bf98cf24b535b3762):

System.Text.Json version Package size x64 binary size
7.0.2 66.416 KB 48.633 KB
8.0.0-alpha.23221 64.767 KB 45.788 KB

Deltas:

We made it!! πŸŽ‰πŸŽ‰πŸŽ‰ Thank you so much for all the help @jkotas @eiriktsarpalis @MichalStrehovsky! πŸ˜„