Consider splitting up `System.Text.Json` tests to allow running them completely with wasm/AOT

radical commented 1 year ago

Running System.Text.Json.Tests with wasm/AOT crashes V8, and chrome. Issue: https://github.com/dotnet/runtime/issues/86164

[22:06:45] info: [STRT] System.Text.Json.Serialization.Tests.StreamTests_DeserializeAsyncEnumerable.DeserializeAsyncEnumerable_ReadSimpleObjectAsync(count: 1000, bufferSize: 1000)
[22:06:56] info: 
[22:06:56] info: <--- Last few GCs --->
[22:06:56] info: 
[22:06:56] info: [70:0x560a9068c280]    63266 ms: Mark-Compact (reduce) 2048.2 (2071.2) -> 2048.2 (2071.2) MB, 667.36 / 0.00 ms  (average mu = 0.728, current mu = 0.001) allocation failure; scavenge might not succeed
[22:06:56] info: 
[22:06:56] info: 
[22:06:56] info: <--- JS stacktrace --->
[22:06:56] info: 
[22:06:56] info: 
[22:06:56] info: #
[22:06:56] info: # Fatal JavaScript out of memory: Reached heap limit
[22:06:56] info: #
[22:06:56] info: 
[22:07:23] info: /home/helixbot/.jsvu/bin/v8: line 2:    70 Trace/breakpoint trap   (core dumped) "/home/helixbot/.jsvu/engines/v8/v8" --snapshot_blob="/home/helixbot/.jsvu/engines/v8/snapshot_blob.bin" "$@"
[22:07:23] info: Process v8 exited with 133

We run into limits while trying to run this test suite, and try to work around that in different ways. It would be extremely useful to possibly run the tests in parts. But I'm not sure how we could split this up. We could either split this into multiple projects if there is a good logical boundary for that. Or we could pick handful of big classes, and explicitly run them in a separate helix job.

cc @kg @pavelsavara for more details on the issues we are running into. cc @eiriktsarpalis

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis See info in area-owners.md if you want to be subscribed.

Issue Details

Running `System.Text.Json.Tests` with wasm/AOT crashes V8, and chrome. Issue: https://github.com/dotnet/runtime/issues/86164 ``` [22:06:45] info: [STRT] System.Text.Json.Serialization.Tests.StreamTests_DeserializeAsyncEnumerable.DeserializeAsyncEnumerable_ReadSimpleObjectAsync(count: 1000, bufferSize: 1000) [22:06:56] info: [22:06:56] info: <--- Last few GCs ---> [22:06:56] info: [22:06:56] info: [70:0x560a9068c280] 63266 ms: Mark-Compact (reduce) 2048.2 (2071.2) -> 2048.2 (2071.2) MB, 667.36 / 0.00 ms (average mu = 0.728, current mu = 0.001) allocation failure; scavenge might not succeed [22:06:56] info: [22:06:56] info: [22:06:56] info: <--- JS stacktrace ---> [22:06:56] info: [22:06:56] info: [22:06:56] info: # [22:06:56] info: # Fatal JavaScript out of memory: Reached heap limit [22:06:56] info: # [22:06:56] info: [22:07:23] info: /home/helixbot/.jsvu/bin/v8: line 2: 70 Trace/breakpoint trap (core dumped) "/home/helixbot/.jsvu/engines/v8/v8" --snapshot_blob="/home/helixbot/.jsvu/engines/v8/snapshot_blob.bin" "$@" [22:07:23] info: Process v8 exited with 133 ``` We run into limits while trying to run this test suite, and try to work around that in different ways. It would be extremely useful to possibly run the tests in parts. But I'm not sure how we could split this up. We could either split this into multiple projects if there is a good logical boundary for that. Or we could pick handful of big classes, and explicitly run them in a separate helix job. cc @kg @pavelsavara for more details on the issues we are running into. cc @eiriktsarpalis

Author:	radical
Assignees:	-
Labels:	`area-System.Text.Json`, `test-enhancement`
Milestone:	-

kg commented 1 year ago

S.T.J is also very close to the limit in the wasm interpreter configuration. If you raise the Jiterpreter limit by a few more megabytes the test suite will also hit V8's heap size limit and crash. So I think we've been just barely staying below this 2GB threshold for a while.

eiriktsarpalis commented 1 year ago

In order to understand the problem correctly, is this an issue of assembly size, number of tests, or specific tests having a very deep stack?

MichalStrehovsky commented 1 year ago

@eiriktsarpalis this is also a problem for Native AOT. The Json tests have been failing in the runtime-extra-platforms runs ever since they got enabled:

artifacts/bin/coreclr/linux.arm64.Release/build/Microsoft.NETCore.Native.targets(271,5): error MSB3073: (NETCORE_ENGINEERING_TELEMETRY=Build) The command ""/__w/1/s/artifacts/bin/coreclr/linux.arm64.Release/x64/ilc/ilc" @"/__w/1/s/artifacts/obj/System.Text.Json.SourceGeneration.Roslyn4.4.Tests/Release/net8.0/native/System.Text.Json.SourceGeneration.Roslyn4.4.Tests.ilc.rsp"" exited with code 137

It's an OOM. The test does compile for me locally but takes forever, consumes gigabytes of memory, and compiles into a 150 MB executable with a 1.5 GB PDB. That's rather large.

There are three issues:

Generic recursion. The compiler does detect this and cuts it off, but it still causes quite a bit of damage because the cutoff only kicks in for deep nesting to prevent compiling forever/until memory runs out. It also produces warnings about this but unfortunately we disable them for test builds so it's not surfaced: https://github.com/dotnet/runtime/blob/10222f94e5d89b19959117dd9e5b8576ef63f878/eng/testing/tests.singlefile.targets#L35
There are tens of megabytes of JsonConverter<T> instances. Instances over value types are not shared and there's a lot of valuetypes being serialized. Some of them might not be intentional - for example ClassWithReadOnlyPropertyQueue says it's a class but it's in fact a struct - there are dozens of these ClassWith that are structs. Each unique JsonConverter<T> instance over a struct costs more than 100 kB in size in the test. The 90% of this 100+ kB is caused by:
CreateCastingConverter generic method. This is a generic virtual method on a generic type. This patterns causes an N * M generic expansion in AOT. The test ends up calling the method with about 250 unique T's (we only count valuetypes). This means that each of the JsonConverter<T> instances in the test get extra 250 specializations. There's hundreds of JsonConverter instances in the app so this multiplies into unreasonable numbers (tens of thousands of CreateCastingConverter method bodies) very quickly.

Getting rid of the generic recursion would be probably the best first step. If we can turn some of the structs in the test into classes, that would help too. The generic virtual method is a huge problem because of the recursion but it will probably scale pretty poorly for large JSON models in general.

I have reasons to believe the recursion is still around handling of Queue<T> because the warnings look like this:

ILC: AOT analysis warning IL3054: System.Text.Json.Serialization.Metadata.JsonTypeInfo`1.<DeserializeAsyncEnumerable>d__4<Queue`1<Queue`1<Queue`1<Queue`1<SByte[]>>>>>: Generic expansion to 'System.Text.Json.Serialization.Metadata.JsonTypeInfo`1<Queue`1<Queue`1<Queue`1<Queue`1<Queue`1<SByte[]>>>>>>' was aborted due to generic recursion. An exception will be thrown at runtime if this codepath is ever reached. Generic recursion also negatively affects compilation speed and the size of the compilation output. It is advisable to remove the source of the generic recursion by restructuring the program around the source of recursion. The source of generic recursion might include: 'System.Text.Json.Serialization.Metadata.JsonTypeInfo`1', 'System.Text.Json.Serialization.JsonConverter`1', 'System.Text.Json.Serialization.JsonResumableConverter`1', 'System.Text.Json.Serialization.JsonDictionaryConverter`1', 'System.Text.Json.Serialization.JsonConverter.CreateCastingConverter<TTarget>()', 'System.Text.Json.Serialization.JsonConverterFactory.CreateCastingConverter<TTarget>()', 'System.Text.Json.Serialization.JsonConverter`1.CreateCastingConverter<TTarget>()', 'System.Text.Json.Serialization.Converters.CastingConverter`1', 'System.Text.Json.Serialization.Converters.JsonMetadataServicesConverter`1', 'System.Text.Json.Serialization.Metadata.DefaultJsonTypeInfoResolver.DeterminePropertyAccessors<T>(JsonPropertyInfo`1<T>,MemberInfo)', 'System.Text.Json.Serialization.Metadata.JsonPropertyInfo`1', 'System.Text.Json.Serialization.Metadata.JsonTypeInfo`1.<DeserializeAsync>d__1', 'System.Text.Json.Serialization.Metadata.JsonTypeInfo`1.<DeserializeAsyncEnumerable>d__4', 'System.Text.Json.Serialization.Metadata.JsonTypeInfo`1.<DeserializeAsObjectAsync>d__6', 'System.Text.Json.Serialization.Metadata.JsonTypeInfo`1.<SerializeAsync>d__10', 'System.Text.Json.Serialization.Metadata.JsonParameterInfo`1'

MichalStrehovsky commented 1 year ago

If you want to see the extent of it, dotnet tool install sizoscope --global --framework net7.0-windows, run sizoscope and open this file in the GUI:

System.Text.Json.SourceGeneration.Roslyn4.4.Tests.zip

You'll see something like this:

kg commented 1 year ago

In order to understand the problem correctly, is this an issue of assembly size, number of tests, or specific tests having a very deep stack?

AFAIK stack depth is not the problem here, for the wasm interp scenario the native heap as a whole is growing until it hits a limit. If I had to guess it's probably a mix of type info, generated interpreter code, jitted wasm code (for the interp+jiterp scenario), etc. I don't know offhand how to get measurements out of the mono runtime on this stuff but if you need measurements I can probably figure it out. In the V8 console scenario, the native heap has to approach 2GB of allocated space before this happens. I'm not sure why the wasm AOT version is running out of memory, to be honest - I would expect it to use less memory, but it's possible that loaded AOT binaries count against V8's heap size limit.

eiriktsarpalis commented 1 year ago

There are tens of megabytes of JsonConverter instances. Instances over value types are not shared and there's a lot of valuetypes being serialized. Some of them might not be intentional - for example ClassWithReadOnlyPropertyQueue says it's a class but it's in fact a struct - there are dozens of these ClassWith that are structs.

I think that's unavoidable given the nature of testing serializers, effectively we tend to define one new type per test and many of these tests need to include coverage for structs (to make sure that each facet of the serializer correctly accounts for struct semantics). I think we could try cutting a few struct definitions here and there when not strictly necessary from a test coverage perspective but I wouldn't expect huge gains from that exercise.

CreateCastingConverter generic method. This is a generic virtual method on a generic type. This patterns causes an N * M generic expansion in AOT.

I recently merged https://github.com/dotnet/runtime/pull/80755 that made changes to CreateCastingConverter out of similar size concerns, and it appears that this method no longer needs to defined on JsonConverter<T>. I'll follow-up with a PR.

eiriktsarpalis commented 1 year ago

@MichalStrehovsky new sizes after #87211 got merged.

eiriktsarpalis commented 1 year ago

With https://github.com/dotnet/runtime/pull/87276 it looks like the total size has dropped to 60 MB. Should we try re-enabling the tests once merged?

radical commented 1 year ago

With #87276 it looks like the total size has dropped to 60 MB. Should we try re-enabling the tests once merged?

I have opened https://github.com/dotnet/runtime/pull/87256 which we can try once #87276 is merged.

eiriktsarpalis commented 1 year ago

PR is merged, feel free to rebase your PR branch.

radical commented 1 year ago

With #87276 it looks like the total size has dropped to 60 MB. Should we try re-enabling the tests once merged?

It still fails:

[19:22:37] info: [STRT] System.Text.Json.Serialization.Tests.EnumConverterTests+UInt16EnumDictionary.SerilizeDictionaryWhenCacheIsFull
[19:22:43] info: 
[19:22:43] info: <--- Last few GCs --->
[19:22:43] info: 
[19:22:43] info: [92:0x563b327ef2f0]    89765 ms: Mark-Compact 2034.6 (2069.8) -> 2034.6 (2069.8) MB, 672.20 / 0.00 ms  (average mu = 0.913, current mu = 0.006) allocation failure; scavenge might not succeed
[19:22:43] info: [92:0x563b327ef2f0]    90441 ms: Mark-Compact 2037.5 (2072.8) -> 2037.5 (2072.8) MB, 674.53 / 0.00 ms  (average mu = 0.827, current mu = 0.002) allocation failure; scavenge might not succeed
[19:22:43] info: 
[19:22:43] info: 
[19:22:43] info: <--- JS stacktrace --->
[19:22:43] info: 
[19:22:43] info: 
[19:22:43] info: #
[19:22:43] info: # Fatal JavaScript out of memory: Reached heap limit
[19:22:43] info: #
[19:22:43] info: 
[19:23:11] info: /home/helixbot/.jsvu/bin/v8: line 2:    92 Trace/breakpoint trap   (core dumped) "/home/helixbot/.jsvu/engines/v8/v8" --snapshot_blob="/home/helixbot/.jsvu/engines/v8/snapshot_blob.bin" "$@"

pavelsavara commented 1 year ago

Adding pre-test yield to few test classes will allow JS engine to run timer loops. It may give our GC chance to run.

xunit syntax is

        public async Task InitializeAsync()
        {
            await Task.Yield();
        }

But v8 timer loop is strange, still worth trying, I think.

lewing commented 1 year ago

cc @BrzVlad @vargaz

vargaz commented 1 year ago

Did some experiments: Running with -s MAXIMUM_MEMORY=1073741824 (1GB) succeeds. Running with -s -s MAXIMUM_MEMORY=536870912 (512mb) fails in a controller manner:

  [40m[32minfo[39m[22m[49m: [STRT] System.Text.Json.Serialization.Tests.StreamTests_AsyncWithSmallBuffer.VeryLargeJsonFileTest(payloadSize: 1000, ignoreNull: False, writeIndented: False)
  [40m[32minfo[39m[22m[49m: console.debug: [MONO] GC_MINOR: (Nursery full) time 3.62ms, stw 3.62ms promoted 3302K major size: 41984K in use: 33033K los size: 62016K in use: 51968K
  [40m[32minfo[39m[22m[49m: console.debug: [MONO] GC_MINOR: (Nursery full) time 3.92ms, stw 3.92ms promoted 3710K major size: 44608K in use: 36825K los size: 62016K in use: 51968K
  [40m[32minfo[39m[22m[49m: console.debug: [MONO] GC_MINOR: (Nursery full) time 3.67ms, stw 3.67ms promoted 3709K major size: 48128K in use: 40615K los size: 62016K in use: 51968K
  [41m[30mfail[39m[22m[49m: [FAIL] System.Text.Json.Serialization.Tests.StreamTests_AsyncWithSmallBuffer.VeryLargeJsonFileTest(payloadSize: 1000, ignoreNull: False, writeIndented: False)
  [40m[32minfo[39m[22m[49m: System.OutOfMemoryException : Out of memory
  [40m[32minfo[39m[22m[49m:    at System.Text.Json.JsonReaderHelper.TranscodeHelper(ReadOnlySpan`1 )
  [40m[32minfo[39m[22m[49m:    at System.Text.Json.JsonSerializer.WriteString[List`1](List`1& , JsonTypeInfo`1 )
  [40m[32minfo[39m[22m[49m:    at System.Text.Json.Serialization.Tests.StreamTests.VeryLargeJsonFileTest(Int32 payloadSize, Boolean ignoreNull, Boolean writeIndented)
  [40m[32minfo[39m[22m[49m:    at System.Text.Json.JsonReaderHelper.TranscodeHelper(ReadOnlySpan`1 )
  [40m[32minfo[39m[22m[49m:    at System.Text.Json.JsonSerializer.WriteString[List`1](List`1& , JsonTypeInfo`1 )

So it looks like the linear memory usage is somewhere between 512mb and 1gb. Not sure what causes v8 to consume 2gb of memory.

vargaz commented 1 year ago

I can reproduce this when running with v8 --max-old-space-size=2024.

lewing commented 1 year ago

@pavelsavara @kg are there js objects we could be leaking/retaining?

kg commented 1 year ago

Could this be heap growth? i.e. growing from 512mb to 1gb requires being able to allocate contiguously 512mb + 1gb in the v8 perm gen, and it's not big enough

pavelsavara commented 1 year ago

@pavelsavara @kg are there js objects we could be leaking/retaining?

That unit test doesn't do any JS interop. The only one is that we hold Task/Promise pair for the main method.

I'm thinking about emscripten's posix emulation leaking. If there are VFS files created as part of the test, they would consume JS heap. What else ?

And I guess I should better understand how we allocate/free memory. I don't know what #ifdef in major_free_swept_blocks means yet.

https://github.com/dotnet/runtime/blob/b9c99648783d0c28afdee394a93516c0bd228165/src/mono/mono/sgen/sgen-marksweep.c#L2129-L2138

BrzVlad commented 1 year ago

A major GC block has 16kb and when we allocate blocks we allocate them in chunks (of 32 blocks at a time). Each block will be in a free list and will be used by the GC when needed. After a collection, in major_free_swept_blocks we attempt to free blocks also in various chunk sizes, that is not identical to the original allocation. So we attempt to unmap in the middle of mapped memory, which I know doesn't work on Windows. Not sure how much benefit we get for this complication of avoiding to call into OS map/unmap.

pavelsavara commented 1 year ago

To do more testing see https://github.com/dotnet/runtime/pull/88825#issuecomment-1634029173

eiriktsarpalis commented 1 year ago

It's unlikely we'll be able to address this in .NET 8, moving to future.

pavelsavara commented 1 year ago

From https://github.com/dotnet/runtime/pull/88825#issuecomment-1634029173

It would be useful to get some GC log for those offending test suites. Env: MONO_LOG_LEVEL=debug and MONO_LOG_MASK=gc

lewing commented 9 months ago

cc @vitek-karas @artl93

closing as resolved but the large trim unfriendly test assemblies are still very much a thing

dotnet / runtime

Consider splitting up `System.Text.Json` tests to allow running them completely with wasm/AOT #87078