dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

JsonSerializer.Deserialize is intolerably slow in Blazor WebAssembly, but very fast in .NET Core integration test #40386

Closed szalapski closed 1 year ago

szalapski commented 4 years ago

In my Blazor app, I have a component that has a method like this. (I've replaced a call to GetFromJsonAsync with code from inside it, to narrow down the slow part.)

  private async Task GetData()
  {
      IsLoading = true;
      string url = $".../api/v1/Foo";  // will return a 1.5 MB JSON array
      var client = clientFactory.CreateClient("MyNamedClient");

      Console.WriteLine($"starting");

      List<Foo> results;

      Task<HttpResponseMessage> taskResponse = client.GetAsync(url, HttpCompletionOption.ResponseContentRead, default);

      var sw = Stopwatch.StartNew();
      using (HttpResponseMessage response = await taskResponse)
      {

        response.EnsureSuccessStatusCode();
        var content = response.Content!;

        if (content == null)
        {
          throw new ArgumentNullException(nameof(content));
        }

        string contentString = await content.ReadAsStringAsync();

        sw.Stop();
        Console.WriteLine($"Read string: {sw.Elapsed}");
        sw.Restart();

        results = System.Text.Json.JsonSerializer.Deserialize<List<Foo>>(contentString)!;
        //results = Newtonsoft.Json.JsonConvert.DeserializeObject<List<Foo>>(contentString); // comparable

      }

      sw.Stop();
      Console.WriteLine($"Deserialize: {sw.Elapsed}");

      StateHasChanged();
      IsLoading = false;
  }
My download of 2-6 MB takes 1-6 seconds, but the rest of the operation (during which the UI is blocked) takes 10-30 seconds. Is this just slow deserialization in ReadFromJsonAsync (which calls System.Text.Json.JsonSerializer.Deserialize internally), or is there something else going on here? How can I improve the efficiency of getting this large set of data (though it isn't all that big, I think)?

I have commented out anything bound to Results to simplify, and instead I just have an indicator bound to IsLoading. This tells me there's no slowness in updating the DOM or rendering.

When I attempt the same set of code in an automated integration test, it only takes 3 seconds or so (the download time). Is WebAssembly really that slow at deserializing? If so, is the only solution to retrieve very small data sets everywhere on my site? This doesn't seem right to me. Can this slowness be fixed?

Here's the resulting browser console log from running the above code:

VM1131:1 Fetch finished loading: GET "https://localhost:5001/api/v1/Foo".
Read string: 00:00:05.5464300
Deserialize: 00:00:15.4109950
L: GC_MAJOR_SWEEP: major size: 3232K in use: 28547K
L: GC_MAJOR: (LOS overflow) time 18.49ms, stw 18.50ms los size: 2048K in use: 187K
L: GC_MINOR: (LOS overflow) time 0.33ms, stw 0.37ms promoted 0K major size: 3232K in use: 2014K los size: 2048K in use: 187K

Using Newtonsoft.Json (as in the commented-out line) instead of System.Text.Json gives very similar results.

For what it's worth, here's the Chrome performance graph. The green is the download and the orange is "perform microtasks", which I assume means WebAssembly work.

(Screenshot: Chrome performance graph)

szalapski commented 4 years ago

See also comments at https://stackoverflow.com/questions/63254162

Dotnet-GitSync-Bot commented 4 years ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

mkArtakMSFT commented 4 years ago

Thanks for contacting us. There has been a big push to optimize this further. You can learn more about the improvements at https://github.com/dotnet/runtime/discussions/40318

mkArtakMSFT commented 4 years ago

@steveharter FYI

szalapski commented 4 years ago

Thanks. The performance is so poor that I am still skeptical that this is just a slow area--I still suspect that something is wrong with the way I am doing it. Would a deserialization of a few megabytes take 10-30 s?

szalapski commented 4 years ago

needs tag area-System.Text.Json

lewing commented 4 years ago

Which version of Blazor are you using? You'll want to avoid creating a string from the content and use a Stream instead. If you are using net5.0, you should look at the System.Net.Http.Json extensions.

szalapski commented 4 years ago

I'm using Blazor 3.2.0 with System.Text.Json 5.0.0-preview.7.

Yes, I used the extensions, but when I saw they were slow, I refactored to the code above so I could narrow the issue down to serialization. Here's the code before my performance refactoring.

private async Task GetData()
{
      IsLoading = true;
      string url = $".../api/v1/Foo";
      Results = await clientFactory.CreateClient("MyNamedClient").GetFromJsonAsync<List<Foo>>(url);
      IsLoading = false;
}
steveharter commented 4 years ago

You'll want to avoid creating a string from the content and use a Stream instead.

Yes this will allow the deserializer to start before all of the data is read from the Stream and prevent the string alloc.
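A stream-based version of the original snippet might look like this (a sketch only; client, url, and Foo are the names from the original post):

```csharp
using System.IO;
using System.Net.Http;
using System.Text.Json;

// Sketch: deserialize straight from the response stream so parsing can
// overlap the download and the large intermediate string is never allocated.
using HttpResponseMessage response = await client.GetAsync(
    url, HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();

await using Stream stream = await response.Content.ReadAsStreamAsync();
List<Foo>? results = await JsonSerializer.DeserializeAsync<List<Foo>>(stream);
```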

System.Text.Json should be ~2x faster for deserialization than Newtonsoft so it would be good to see your object model to see if you hit an area that is slow on STJ.

In either case, since both Newtonsoft and STJ are slow there is likely something else going on.

The Large object graph benchmark section in https://github.com/dotnet/runtime/discussions/40318 has deserialization perf of 372ms for a string of length 322K. This also includes a "polymorphic" mode due to using System.Object that causes deserialization to be much slower (almost 2x) than without it. Anyway, extrapolating 322K to your 1MB is a 3x factor, so I assume it would take about 372ms * 3 = ~1.1 seconds to deserialize (on my fast desktop in isolation).

Some thoughts:

HenkHolterman commented 4 years ago

I posted an MCVE (minimal complete verifiable example) as an answer on StackOverflow, based on the WeatherForecast page. My results are much better: < 100ms to download and < 4 seconds to deserialize.

See https://stackoverflow.com/q/63254162/

szalapski commented 4 years ago

@steveharter ,

Yes, Intel Core i5 8350-U with 16 GB RAM. The test is running on my laptop. Foo is actually as follows, with minimal name changes only to protect proprietary names. Nothing else significant is running on the CPU; this is my only focus while I am doing this.

I did start with a stream, per the code just above -- that was how I found this issue. I refactored to the code in the issue post just to narrow it down to slowness in deserialization.

    public class Foo
    {
        public int? FooId { get; set; }
        public int? FiscalYear { get; set; }
        public string? FundTypeCode { get; set; }
        public string? CategoryDescription { get; set; }
        public string? CustomerLevelCode { get; set; }
        public string? CustomerCode { get; set; }
        public string? CustomerDescription { get; set; }
        public string? ChangedFieldName { get; set; }
        public decimal OriginalAmount { get; set; }
        public decimal NewAmount { get; set; }
        public string? Comment { get; set; }
        public string? CreateUser { get; set; }
        public DateTime CreateDate { get; set; }
        public bool AdjustedBySystem { get; set; }
        public Guid? ChangeBatchNumber { get; set; }
        public string? FundDescription { get; set; }
        public string? ParentCustomerCode { get; set; }
        public string? ParentCustomerLevel { get; set; }
        public string? ParentCustomerDescription { get; set; }
        public decimal ChangedAmount { get { return NewAmount - OriginalAmount; } }
        public bool? CreatedFromNFile { get; set; }
        public string? Reason => FundTypeCode == "N" ? (CreatedFromNFile == true ? "From N File" : "N Manual") : "";
    }
steveharter commented 4 years ago

That model is simple and should be fast (no System.Object, non-generic collections or custom converters that could slow it down).

I suggest running the perf test that @HenkHolterman suggested in stackoverflow to compare against the baseline.

szalapski commented 4 years ago

@steveharter , I tried it just as suggested. It indeed takes 7-12 seconds to return 17,000 items (about 1.6 MB) of WeatherForecast using the default code, await Http.GetFromJsonAsync<WeatherForecast[]>("WeatherForecast"); (download time on localhost is about 20 ms). So this seems consistent with the timings in my slightly more complex case in the original question. (FYI, this is on Blazor 3.2.0; I also updated System.Text.Json via NuGet to v 5.0.0-preview.7, but it didn't help much.)

(also, I tried to increase the payload to 5 MB and that took 23-27 seconds.)

lewing commented 4 years ago

Blazor in net5 should be considerably faster. Are you running this test from inside VS or from a published build?

ghost commented 4 years ago

Tagging subscribers to this area: @CoffeeFlux See info in area-owners.md if you want to be subscribed.

szalapski commented 4 years ago

Running it from Visual Studio, "run without debugging" in Release configuration.

@lewing Do you mean just System.Text.Json should be faster? If so, I already have the latest preview.

Or are you suggesting I move the app to Blazor 5.0.0 latest preview?

lewing commented 4 years ago

@szalapski could you please try your timings with a published app outside of VS? It looks like there is an issue where the runtime is always initialized in debug mode when run from inside VS.

Additionally if you update the app from 3.2 to 5.0 there are several interpreter optimizations and library improvements. Some are in preview7 some are in later builds.

szalapski commented 4 years ago

Just tried outside of VS -- using dotnet run at the command line in Windows. No problems, very similar timings.

Also tried running the .exe after running dotnet publish --configuration Release, just to be sure. (I think this should be virtually the same as running dotnet run, right?) The timings were again similar.

marek-safar commented 4 years ago

@lewing what are the next steps here?

steveharter commented 4 years ago

what are the next steps here?

I assume no attempt to run on Blazor 5.0 yet? If so I think that should be next.

Both Newtonsoft and STJ are slow. This indicates a likely environmental or systemic issue, and not likely a (de)serialization issue.

The StackOverflow test runs <4 seconds for @HenkHolterman and 7-12 seconds for @szalapski. Different hardware and/or different Blazor versions could account for that 2x-3x slowness; we would need a standard CPU benchmark and the same Blazor version to actually compare apples-to-apples.

Also @szalapski on download perf you originally said:

My download of 2-6 MB takes 1-6 seconds

but with your latest test from StackOverflow you said:

It indeed takes 7-12 seconds to return 17000 items (about 1.6 MB) of WeatherForecast. (Download time on localhost is about 20 ms.)

So download time went from 1-6 seconds for 2-6MB to 20ms for 1.6MB -- any thought on why that's the case?

szalapski commented 4 years ago

The 1-6 seconds was over the internet, whereas the 20ms was running against a local web service. I just did that comparison to ensure that the download speed is not relevant--regardless of whether the download is 20 ms or 20,000 ms, the deserialization is quite slow.

I will try it on Blazor 5 preview 8 soon.

"The StackOverflow test runs <4 seconds for @HenkHolterman and 7-12 seconds for @szalapski. Different hardware and\or different Blazor versions could account for that 2x-3x slowness; would need a standard CPU benchmark and same Blazor version to actually compare apples-to-apples."

Why shouldn't it be on the order of tens of milliseconds? Are the optimizations we see in .NET Core just not possible in WebAssembly?

"Both Newtonsoft and STJ are slow. This indicates a likely environmental or systemic issue, and not likely a (de)serialization issue."

Wait, I thought we all agreed that the slowness is in the deserialization code, not a problem with my system or environment. You are saying that I have a problem that is not inherent to deserializing in WebAssembly? How can I diagnose that?

tareqimbasher commented 4 years ago

@szalapski I can confirm without a doubt that the slowness is with the deserialization and not a system or environment issue. We are developing a Blazor WASM application, and deserializing a ~1.8 MB JSON payload takes about 5-6 seconds (time to complete the network request is not part of that time). We have a 20-member team, and everyone from developers to business experiences this slowness, so I can state from experience that it is not a system or environment issue.

Interested to see how your times would change when you try it on Blazor 5 preview 8!

steveharter commented 4 years ago

The 1-6 seconds was over the internet, whereas the 20ms was running against a local web service. I just did that comparison to ensure that the download speed is not relevant--regardless of whether the download is 20 ms or 20,000 ms, the deserialization is quite slow. I will try it on Blazor 5 preview 8 soon.

@szalapski OK thanks for clarifying on the download speed. Hopefully you will see a large improvement on Blazor 5.

@tareqimbasher are you running on Blazor 5?

As discussed in https://github.com/dotnet/runtime/discussions/40318, due to current Blazor architecture, there is an expected perf hit that has a wide range depending on the exact scenarios, but ~35x slower than Core is a rough number that is in line with expectations.

However, there are a couple areas known to be slow that could be made faster in the serializer. These include large strings (say > 1K) and using System.Object or a non-generic collection such as IList (where elements are System.Object) as a property type.
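As an aside (not suggested in this thread): on .NET 6 and later, the serializer's runtime reflection cost can be avoided entirely with a source-generated context, which is especially relevant under the wasm interpreter. A sketch using the poster's Foo model:

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Sketch (requires .NET 6+): the source generator emits serialization
// metadata at compile time, avoiding runtime reflection.
[JsonSerializable(typeof(List<Foo>))]
internal partial class FooJsonContext : JsonSerializerContext { }

// Usage: pass the generated type info instead of relying on reflection.
List<Foo>? results = JsonSerializer.Deserialize(
    jsonString, FooJsonContext.Default.ListFoo);
```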

szalapski commented 4 years ago

I see total time including serialization to get thousands of weather forecast lines cut in half when using .NET 5.0.0-rc1 in release configuration. It took around 13 seconds to get 53,000 weather forecasts in v 3.1 but 7 seconds in 5.0.0-rc1. Good improvement.

My controller in this example is returning an IEnumerable<WeatherForecast>, which is just a WeatherForecast[] (ordinary array) underneath. The client deserializes that using HttpClient.GetFromJsonAsync<WeatherForecast[]>(string). Are there any certain types or techniques that could speed this up? I'm already avoiding non-generic lists and properties of type System.Object. Any other tips? I don't imagine there's any difference between using an array and a List, etc., but thought I'd ask -- anything else we could do to "help" the deserializer along for collections with thousands of items?

I still hope that it can start approaching the performance of .NET in a console app.
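One later option for collections this large (DeserializeAsyncEnumerable shipped in .NET 6, so it was not yet available at the time of this comment) is to stream array elements as they are parsed instead of materializing the whole list before rendering; a sketch:

```csharp
using System.IO;
using System.Text.Json;

// Sketch (requires .NET 6+): consume elements of a large JSON array as they
// arrive, so the UI can show partial results instead of blocking on the
// full list. "Forecasts" is a hypothetical bound collection.
await using Stream stream = await client.GetStreamAsync(url);
await foreach (WeatherForecast? forecast in
    JsonSerializer.DeserializeAsyncEnumerable<WeatherForecast>(stream))
{
    if (forecast is not null)
        Forecasts.Add(forecast);
}
```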

akoeplinger commented 4 years ago

Doesn't solve the issue but from https://twitter.com/JamesNK/status/1310875638585204738 it looks like gRPC is a lot faster to deserialize:

I wrote a Blazor WebAssembly app that shows the performance benefits of gRPC-Web compared to JSON.

Both are fast with small payloads. With large data sets:
• gRPC network usage is 70% smaller
• gRPC deserialization is 10 times faster

Check it out here: https://grpcblazorperf.azurewebsites.net
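For context, a gRPC-Web client in Blazor WebAssembly is wired up roughly like this (a sketch based on the Grpc.Net.Client.Web package; Greeter is a placeholder generated client and the address is illustrative):

```csharp
using System.Net.Http;
using Grpc.Net.Client;
using Grpc.Net.Client.Web;

// Sketch: gRPC-Web wraps gRPC calls so they work from the browser sandbox.
var channel = GrpcChannel.ForAddress("https://localhost:5001",
    new GrpcChannelOptions
    {
        HttpHandler = new GrpcWebHandler(new HttpClientHandler())
    });
var greeter = new Greeter.GreeterClient(channel); // generated from a .proto
```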

szalapski commented 3 years ago

I will consider gRPC but it's not my preferred way to fix this. Thanks, though.

szalapski commented 3 years ago

We're finding ways to manage things, but it does seem like there ought to be a way to get 50,000 small objects deserialized in a second or two. I recommend setting a reasonable goal for the next release. :)

saulyt commented 3 years ago

I've been having similar issues. I've found Utf8Json to be much faster than both Newtonsoft and System.Text.Json. I am using a PipeReader, and Utf8Json is still faster even though I have to copy the bytes to an array to deserialize, while with STJ the pipe can be read directly.

I am having issues with other things being slow as well, and I suspect this issue is not strictly related to deserialization. Perhaps it is an issue with the way memory is accessed. For example, when I try to create an Excel file using EPPlus, ClosedXML, or similar APIs (I tried a bunch), it takes well over a minute for a 2MB file. Running the same exact code on Blazor Server produces the file in about a second.

rajeshaz09 commented 3 years ago

@tareqimbasher @szalapski A 2MB JSON file is taking about 7 seconds to deserialize, which is not acceptable. We moved to MessagePack; it takes about 2 seconds.
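For context, switching a model to MessagePack-CSharp looks roughly like this (a sketch; FooDto and the key numbers are illustrative, not from this thread):

```csharp
using MessagePack;

// Sketch: MessagePack-CSharp serializes attributed models to a compact
// binary format, which both shrinks the payload and speeds up parsing.
[MessagePackObject]
public class FooDto
{
    [Key(0)] public int? FooId { get; set; }
    [Key(1)] public decimal NewAmount { get; set; }
}

// Round-trip:
byte[] bytes = MessagePackSerializer.Serialize(new FooDto { FooId = 1 });
FooDto restored = MessagePackSerializer.Deserialize<FooDto>(bytes);
```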

steveharter commented 3 years ago

@rajeshaz09 I assume you've measured against 5.0 .NET since there have been gains.

I see that MessagePack claims to be ~13x faster deserializing than Json.NET (no benchmark for STJ) for the case of "large array of simple objects". So if STJ is 2x as fast as Json.NET here, the 7 seconds for STJ vs. 2 seconds for MessagePack seems consistent, although note that the benchmark is for standard .NET Core not under Blazor.

rajeshaz09 commented 3 years ago

@steveharter Thanks for the reply. Yes, I am using .NET 5. The difference between STJ and MessagePack is more visible on a low-spec machine.

I have a powerful dev machine. I didn't see much difference (hardly 1 second) between STJ and MessagePack with the High Performance power setting, but I can see a significant gap if I use the Balanced/Power Saver setting.

MessagePack is a temporary solution; once we're satisfied with .NET 6, we will move back to JSON.

lambrech commented 3 years ago

I am not 100% sure, but it seems very likely that this is related: I just tried to deserialize a 2.6MB json containing 10,000 simple POCOs. In Chrome the deserialization took ~4 seconds (that's actually "good enough" for me, at least right now). But in Firefox the same deserialization took ~35 seconds! That is a serious problem for me ...

FYI: I am using .NET 6 Preview 3 and System.Text.Json

sxotney commented 3 years ago

I have had a similar journey recently, moving through different serialisers and finally arriving at MessagePack, which has been good enough in interpreted WASM for current users. Performance vs. System.Text.Json is impressive.

However, the scope of our WASM app is definitely expanding, and we have users looking to handle hundreds of thousands of objects to perform data manipulation/analysis in the browser, like Excel would chomp through on a normal desktop. Wait times for data loads of this size (they really aren't massive payloads delivered from the API) are at the point where it is difficult to satisfy users, and Server Side Blazor is becoming the only option.

The Blazor/WASM community has generally expressed that code runs at native speeds (until you learn that everything outside of the .NET libraries is interpreted), and I had hoped AOT would make an enormous difference here, allowing the MessagePack serialiser to run at native speed. Our initial benchmarks of rc1 are showing it to be slower in this area than interpreted mode.

Maybe it's my misunderstanding of how serialisation works -- is it object construction in .NET itself being slow here, meaning I shouldn't see any difference between AOT and interpreted builds? Either way, serialisation is painfully slow for what is really not that much data.

ikeough commented 3 years ago

First, and most importantly, thanks to the team working on Blazor and web assembly. We think this technology has a really bright future!

I'll add my support for @szalapski here. We have an .NET open source library that is used heavily in back end services run on AWS Lambda. We were excited with the possibility of running some of our code in our web application. Our initial attempts to compile and run web assembly from our library in .NET 6 preview 7 have been met with massive performance degradation.

I established a small benchmark that creates 1000 cubes using the library (the library is for creating 3d stuff with lots of Vector3 structs and Polygon), serializes them to JSON, then writes the resulting 3D model to glTF. I duplicated that code in a small Blazor app.

Running the Blazor code compiled using dotnet run -c release (non AOT) and viewing the console in Chrome shows:

00:00:07.2027000 for writing to gltf.

We found that AOT compilation (which takes nearly 15 minutes), increases the performance by 2x.

The benchmark containing the same code run on the desktop shows the following for writing to glTF:

Method                      Mean      Error    StdDev   Gen 0       Gen 1      Gen 2      Allocated
'Write all cubes to glb.'   105.8 ms  4.42 ms  2.31 ms  21000.0000  3000.0000  1000.0000  85.05 MB

It takes nearly 67x as long to run in web assembly. We have a similar performance degradation for serializing and deserializing JSON.

Some considerations as to what might be slow:

You can find our example Blazor project that has no UI but runs the wasm and reports to the console here: https://github.com/hypar-io/Elements/tree/wasm-perf/Elements.Wasm. You can find the corresponding benchmark WasmComparison here: https://github.com/hypar-io/Elements/tree/wasm-perf/Elements.Benchmarks

We're really excited for the effort to bring C# to web assembly and are happy to provide any further information necessary. It would be fantastic for these development efforts if there was a way to run a dotnet benchmark across the core CLR and web assembly to make an apples->apples comparison. For now we've had to build our own.

One more thing... This performance degradation is not everywhere. We can call methods in our library that do some pretty complicated geometry stuff and they run at near native speed. We have a couple of demos of interactive 3d geometry editing and display using Blazor wasm. It's just serialization and reading/writing bytes that seem to be a big issue. Also looping in @gytaco who is doing some amazing work using c#->web assembly for geometry stuff.

marek-safar commented 3 years ago

You can find the corresponding benchmark WasmComparison here: https://github.com/hypar-io/Elements/tree/wasm-perf/Elements.Benchmarks

@SamMonoRT could we add it to interpreter benchmarks?

90Kaboom commented 2 years ago

If you face issues with JSON serialization performance, before trying to solve it by refactoring your code, please check performance in another browser. Blazor works really fast on Edge, Opera, and Chrome, but performance in Firefox is really weak -- it slows down serialization more than 10 times.

sxotney commented 2 years ago

@90Kaboom

Serialisation is slow across all browsers for Mono .NET. If the performance of Blazor is slow in a particular browser, that's more likely a wasm implementation issue for the team that maintains that browser, as opposed to a Blazor/Mono .NET issue.

Kevin-Lewis commented 2 years ago

I see this is being targeted for .NET 7. Blazor WASM has been great for the most part, but this performance issue is making it really difficult to view Blazor as a viable option for some of the more data-intensive projects I have coming up. I'll give MessagePack a try since it seems people have had some success with that.

Chief4Master commented 2 years ago

Any news or suggestions (@szalapski)? We have the exact same problem, so we cannot realize our application with Blazor.

Cpt-Falcon commented 1 year ago

I just tried it with a 10MB JSON file and it's unusably slow. 10MB isn't that much; it's tiny. It's taking over 2 minutes to load the initial page, which doesn't make sense IMO. I'm using the best performance tricks too:

    Assembly powerAssembly = typeof(PowerService).Assembly;
    await using Stream? stream = powerAssembly.GetManifestResourceStream("PowerShared.PowerDocuments.json");

    ValueTask<XMLJsonWrapper?> powerXmlDocumentsTask = JsonSerializer.DeserializeAsync<XMLJsonWrapper>(stream, new JsonSerializerOptions()
    {
        DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
    });

It's so slow and takes so long that I can't even run a performance profiler. The profiler just bombs out and gets stuck.

I'm having other problems too, where external .NET 7 DLLs take forever to load.

There needs to be a way to quickly and efficiently load datasets into blazor WASM.

This is on .NET 7 by the way

Cpt-Falcon commented 1 year ago

Meanwhile react with an embedded file of the same size takes like 10 ms.

marek-safar commented 1 year ago

/cc @lewing

jirisykora83 commented 1 year ago

Meanwhile react with an embedded file of the same size takes like 10 ms.

I have the exact same problem: with an even smaller (about 1.5MB uncompressed) JSON it was ~absolutely unusable~ kind of slow. I even tried to write custom binary serialization & deserialization, and it was better but still too slow to justify maintaining that code. I ended up using dotnet gRPC. It was still slower than pure JavaScript (JSON parse), but it is the best current option in terms of performance / maintenance cost.

I also tried https://github.com/salarcode/Bois. The custom solution was based on MemoryStream & BinaryWriter.

Edit: Some number from my usecase (i7 12700K):

System.Text.Json: 237ms (size: 1278kb)
Bois: 42ms (size: 501kb)
Custom: 41ms (size: 441kb)

For context, if I recreate the tested model from the in-memory original object (a custom clone), it takes about 5ms.

These are numbers from the published version of the app, and they measure just deserialization. (I do not have the number for gRPC from the deployed app, but in debug mode on localhost it was about the same as Bois/custom (70ms), so I expect it will be about 40ms on my PC in the deployed app.)

I don't have an exact number from "normal" dotnet, but it would be a few ms.
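The MemoryStream & BinaryWriter approach mentioned above might look like this minimal sketch (the two-field record is hypothetical, not the commenter's actual model):

```csharp
using System.IO;

// Sketch: hand-rolled binary round-trip for a hypothetical two-field record.
static byte[] Write(int id, string name)
{
    using var ms = new MemoryStream();
    using (var w = new BinaryWriter(ms))
    {
        w.Write(id);     // 4 bytes, little-endian
        w.Write(name);   // length-prefixed UTF-8
    }
    return ms.ToArray();
}

static (int Id, string Name) Read(byte[] data)
{
    using var r = new BinaryReader(new MemoryStream(data));
    return (r.ReadInt32(), r.ReadString());
}
```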

pragmaeuge commented 1 year ago

Same issue with a 10MB GeoJSON file in .NET 7. This issue makes it difficult to develop effective, WFS-oriented GIS solutions with .NET Wasm.

geometrikal commented 1 year ago

Any progress on this? Running into same issue.

Webreaper commented 1 year ago

Apparently there are lots of performance improvements (as an indirect result of the new WASM JIT) in .NET 8 preview 2, so it might be worth a look. I'd be trying it, but there is still no VS for Mac support....

geometrikal commented 1 year ago

@Webreaper Thanks, looking forward to .NET 8. Also I just compiled with AOT turned on in .NET 7 and the slowness disappeared so suggest trying that @pragmaeuge
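For reference, AOT compilation is enabled with a single project-file property and only applies on publish (builds take far longer, as noted earlier in this thread):

```xml
<!-- In the Blazor WASM .csproj: AOT-compile on `dotnet publish -c Release`. -->
<PropertyGroup>
  <RunAOTCompilation>true</RunAOTCompilation>
</PropertyGroup>
```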

Webreaper commented 1 year ago

The only thing you need to be careful of with AOT is this issue: https://github.com/dotnet/runtime/issues/62149

pragmaeuge commented 1 year ago

@Webreaper Thanks, looking forward to .NET 8. Also I just compiled with AOT turned on in .NET 7 and the slowness disappeared so suggest trying that @pragmaeuge

Hi! I will take a look at it soon, thanks a lot.

sofiageo commented 1 year ago

I deployed with .NET 8 preview 2 today and deserialization is much faster. The Jiterpreter helps a lot: from about 4-5 seconds with .NET 7 to 1-2 seconds. The rest of the application works too, which is nice :)