dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

System.Text.Json is excruciatingly slow #31493

Closed zachsaw closed 4 years ago

zachsaw commented 4 years ago

Tested it on macOS High Sierra with .net core 3.0.100.

Even running it with dotnet run -c Release (or without -c Release, for Debug mode) from the terminal shows that it runs 6x slower than Newtonsoft.Json!

I've created a repo to demonstrate the problem here - https://github.com/zachsawcs/SerializerBattle

khellang commented 4 years ago

Uuuh, in the other benchmarks, you're using a pre-allocated MemoryStream, while in the System.Text.Json case, you're always allocating a new buffer. Of course it's going to be slow 🤦‍♂

zachsaw commented 4 years ago

How do you use a pre-allocated MemoryStream for System.Text.Json?

khellang commented 4 years ago

Something like this?

diff --git a/Program.cs b/Program.cs
index 409fe19..1548f81 100644
--- a/Program.cs
+++ b/Program.cs
@@ -3,33 +3,33 @@ using System.Collections.Generic;
 using System.Diagnostics;
 using System.IO;
 using System.Text.Json;
-using System.Threading;
 using Binaron.Serializer;
 using Newtonsoft.Json;
 using JsonSerializer = Newtonsoft.Json.JsonSerializer;
 using CoreJsonSerializer = System.Text.Json.JsonSerializer;
+using System.Threading.Tasks;

 namespace SerializerBattle
 {
     public class Program
     {
         private static readonly Book Source = Book.Create();
-        private static readonly Action[] Tests = {BinaronTest, NewtonsoftJsonTest, NetCore3JsonTest};
+        private static readonly Func<Task>[] Tests = {BinaronTest, NewtonsoftJsonTest, NetCore3JsonTest};

-        public static void Main()
+        public static async Task Main()
         {
             const int loop = 15;

             // warm-up
             foreach (var test in Tests)
-                test();
+                await test();

-            var tester = new Thread(() => Tester(loop)) {Priority = ThreadPriority.Highest};
-            tester.Start();
-            tester.Join();
+            await Task.Yield();
+
+            await Tester(loop);
         }

-        private static void Tester(int loop)
+        private static async Task Tester(int loop)
         {
             foreach (var test in Tests)
             {
@@ -38,14 +38,14 @@ namespace SerializerBattle
                 sw.Start();

                 for (var j = 0; j < loop; j++)
-                    test();
+                    await test();

                 sw.Stop();
                 Console.WriteLine($"{test.Method.Name} {sw.ElapsedMilliseconds / loop} ms/op");
             }
         }

-        private static void BinaronTest()
+        private static Task BinaronTest()
         {
             using var stream = new MemoryStream();
             for (var i = 0; i < 50; i++)
@@ -56,9 +56,10 @@ namespace SerializerBattle
                 stream.Position = 0;
                 Trace.Assert(book.Title != null);
             }
+            return Task.CompletedTask;
         }

-        private static void NewtonsoftJsonTest()
+        private static Task NewtonsoftJsonTest()
         {
             var ser = new JsonSerializer {NullValueHandling = NullValueHandling.Ignore};
             using var stream = new MemoryStream();
@@ -78,14 +79,19 @@ namespace SerializerBattle
                 }
                 stream.Position = 0;
             }
+            return Task.CompletedTask;
         }

-        private static void NetCore3JsonTest()
+        private static async Task NetCore3JsonTest()
         {
+            var options = new JsonSerializerOptions { IgnoreNullValues = true };
+            using var stream = new MemoryStream();
             for (var i = 0; i < 50; i++)
             {
-                var bytes = CoreJsonSerializer.SerializeToUtf8Bytes(Source, new JsonSerializerOptions {IgnoreNullValues = true});
-                var book = CoreJsonSerializer.Deserialize<Book>(bytes);
+                await CoreJsonSerializer.SerializeAsync(stream, Source, options);
+                stream.Position = 0;
+                var book = await CoreJsonSerializer.DeserializeAsync<Book>(stream, options);
+                stream.Position = 0;
                 Trace.Assert(book.Title != null);
             }
         }

I just tried running it quickly in Debug and got the following results:

BinaronTest 98 ms/op
NewtonsoftJsonTest 297 ms/op
NetCore3JsonTest 298 ms/op
zachsaw commented 4 years ago

OK, updated the code to use a MemoryStream - no difference in performance.

zachsaw commented 4 years ago

Did you run it on a macOS High Sierra?

khellang commented 4 years ago

Did you run it on a macOS High Sierra?

Nope, Windows 10 1803. Do you think this is down to the OS?

zachsaw commented 4 years ago

Updated the codebase again in the repo with your changes.

Here's what I now get with dotnet run -c Release

It is indeed a bit faster:

Running BinaronTest
BinaronTest 83 ms/op
Running NewtonsoftJsonTest
NewtonsoftJsonTest 242 ms/op
Running NetCore3JsonTest
NetCore3JsonTest 1276 ms/op
zachsaw commented 4 years ago

Let me fire up a Windows VM to see if it's any faster.

khellang commented 4 years ago

Running the old code on my Windows machine gave the following results:

BinaronTest 79 ms/op
NewtonsoftJsonTest 280 ms/op
NetCore3JsonTest 563 ms/op
zachsaw commented 4 years ago

Turns out it is indeed due to the OS difference.

On a Windows VM, I get the following results:

Running BinaronTest
BinaronTest 83 ms/op
Running NewtonsoftJsonTest
NewtonsoftJsonTest 240 ms/op
Running NetCore3JsonTest
NetCore3JsonTest 199 ms/op
zachsaw commented 4 years ago

On Ubuntu 64-bit 18.04.3, this is what I got:

Running BinaronTest
BinaronTest 86 ms/op
Running NewtonsoftJsonTest
NewtonsoftJsonTest 260 ms/op
Running NetCore3JsonTest
NetCore3JsonTest 931 ms/op

Ubuntu is the slowest OS of all when it comes to this benchmark, but System.Text.Json is also painfully slow there.

Looks like it's as fast as Newtonsoft on Windows only?

khellang commented 4 years ago

Looks like it's as fast as Newtonsoft on Windows only?

Hmm. I would've expected the serializer benchmarks to highlight this... @adamsitnik?

vcsjones commented 4 years ago

With Ubuntu 19.04 on an AMD CPU I get, in release builds:

Running BinaronTest
BinaronTest 73 ms/op
Running NewtonsoftJsonTest
NewtonsoftJsonTest 255 ms/op
Running NetCore3JsonTest
NetCore3JsonTest 222 ms/op

In general I'm somewhat skeptical of "roll your own" benchmarks; I would be interested to see this in terms of BenchmarkDotNet.

ahsonkhan commented 4 years ago

It is awesome to see some of the benchmarking and results being shared around JSON. I also appreciate folks running them in different environments and sharing reports/bugs so we can fix any perf issues (especially if it's something we overlooked - e.g. see https://github.com/dotnet/corefx/issues/41638). We also have quite a few ideas to continuously improve perf and features: https://github.com/dotnet/corefx/issues?q=is%3Aissue+label%3Aarea-System.Text.Json+label%3Atenet-performance+is%3Aopen (in fact, 3.1 is faster than 3.0). Generally speaking, there isn't much within the JSON library that is platform specific, so I wouldn't expect much perf difference between OSes.

However, I would advise against such strong claims until the benchmarks are accurate and the results have been reviewed/verified (and I would love to see a collaborative approach to improving them). Imo, the stronger the claim, the higher the burden of proof and due diligence. I am referring to the following in the README:

Surprisingly the new .net core 3.0 JSON serializer ends up being the slowest by miles!

Not sure what is wrong with the new .net core JSON serializer. No matter how you run it (even with just dotnet run), it's always excruciatingly slow.

As such, I would strongly recommend people to avoid System.Text.Json.JsonSerializer until it is mature enough.

We already have quite a few benchmarks which show the performance benefits, including larger/end-to-end benchmarking involving ASP.NET. See System.Text.Json in ASP.NET Core MVC.

I cloned your repo and ran it on macOS (High Sierra - 10.13.3 - 2.5 GHz Intel Core i7), but couldn't repro the results you shared (in fact, S.T.Json shows perf improvements):

Running BinaronTest
BinaronTest 73 ms/op
Running NewtonsoftJsonTest
NewtonsoftJsonTest 207 ms/op
Running NetCore3JsonTest
NetCore3JsonTest 188 ms/op

If you do see a scenario or test case that is verifiably slower (or too slow), please continue to file issues with the specific repro. Thanks!

zachsaw commented 4 years ago

I'll test this on a newer hardware architecture later today and see if I can get the same behaviour. Ubuntu 18.04 and macOS both showed the same slowness, to different degrees. And no matter how many times I run it, I get the same results within a reasonable margin of error.

What do you suggest we do to investigate this further?

vcsjones commented 4 years ago

What do you suggest we do to investigate this further?

I would be curious to know what specifically in your benchmark is slow, by timing smaller components. Right now both serialization and deserialization are covered in the benchmark, as is allocating a MemoryStream and its underlying memory.

In your very slow test cases, is it just serialization that is slow? Just deserialization? Both? Creating the memory stream?

zachsaw commented 4 years ago

The MemoryStream is created just once per set of 50 round-trip tests. I'll add a separate stopwatch to time the smaller components.
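
Roughly, the split timing will look something like this (a sketch only, not the exact code I'll push, and assuming the same Source, Book, options and CoreJsonSerializer alias as in the diff above):

var options = new JsonSerializerOptions {IgnoreNullValues = true};
var serialize = new Stopwatch();
var deserialize = new Stopwatch();
using var stream = new MemoryStream();

for (var i = 0; i < 50; i++)
{
    // Time serialization into the reused MemoryStream.
    serialize.Start();
    await CoreJsonSerializer.SerializeAsync(stream, Source, options);
    serialize.Stop();
    stream.Position = 0;

    // Time deserialization back out of the same stream.
    deserialize.Start();
    var book = await CoreJsonSerializer.DeserializeAsync<Book>(stream, options);
    deserialize.Stop();
    stream.Position = 0;

    Trace.Assert(book.Title != null);
}

Console.WriteLine($"Serialize: {serialize.ElapsedMilliseconds} ms, Deserialize: {deserialize.ElapsedMilliseconds} ms");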

zachsaw commented 4 years ago

Interestingly, running it on the work laptop gives me the following results, which are in line with what you and others observed.

Running BinaronTest
BinaronTest 66 ms/op
Running NewtonsoftJsonTest
NewtonsoftJsonTest 204 ms/op
Running NetCore3JsonTest
NetCore3JsonTest 182 ms/op
zachsaw commented 4 years ago

Updated the repo to break down Serialize/Deserialize time per loop; I ran it over and over again and the results are consistently as follows.

Serialize: 147 ms, Deserialize: 1267 ms
Running NetCore3JsonTest
Serialize: 91 ms, Deserialize: 1203 ms
Serialize: 90 ms, Deserialize: 1206 ms
Serialize: 88 ms, Deserialize: 1206 ms
Serialize: 88 ms, Deserialize: 1192 ms
Serialize: 87 ms, Deserialize: 1201 ms
Serialize: 87 ms, Deserialize: 1202 ms
Serialize: 87 ms, Deserialize: 1201 ms
Serialize: 86 ms, Deserialize: 1195 ms
Serialize: 88 ms, Deserialize: 1192 ms
Serialize: 86 ms, Deserialize: 1194 ms
Serialize: 88 ms, Deserialize: 1187 ms
Serialize: 87 ms, Deserialize: 1193 ms
Serialize: 87 ms, Deserialize: 1190 ms
Serialize: 87 ms, Deserialize: 1198 ms
Serialize: 86 ms, Deserialize: 1197 ms
NetCore3JsonTest 1285 ms/op
Running BinaronTest
BinaronTest 72 ms/op
Running NewtonsoftJsonTest
NewtonsoftJsonTest 237 ms/op

Serialize: 148 ms, Deserialize: 1274 ms
Running NetCore3JsonTest
Serialize: 88 ms, Deserialize: 1220 ms
Serialize: 90 ms, Deserialize: 1232 ms
Serialize: 86 ms, Deserialize: 1213 ms
Serialize: 87 ms, Deserialize: 1211 ms
Serialize: 85 ms, Deserialize: 1210 ms
Serialize: 86 ms, Deserialize: 1212 ms
Serialize: 84 ms, Deserialize: 1212 ms
Serialize: 85 ms, Deserialize: 1209 ms
Serialize: 84 ms, Deserialize: 1216 ms
Serialize: 87 ms, Deserialize: 1217 ms
Serialize: 87 ms, Deserialize: 1209 ms
Serialize: 87 ms, Deserialize: 1208 ms
Serialize: 86 ms, Deserialize: 1222 ms
Serialize: 86 ms, Deserialize: 1214 ms
Serialize: 84 ms, Deserialize: 1230 ms
NetCore3JsonTest 1302 ms/op
Running BinaronTest
BinaronTest 72 ms/op
Running NewtonsoftJsonTest
NewtonsoftJsonTest 241 ms/op

At this point I'm wondering if it's because the CPU lacks the AVX2 instruction set...? My work laptop, which has an 8th gen Core i7, works fine. Serialization works well - deserialization is the problem on the 3rd gen i5 with macOS/Ubuntu.

zachsaw commented 4 years ago

Please ignore BinaronTest showing 72 ms/op now instead of ~83 ms/op. In case anyone's wondering, it's just due to an updated version of the serializer giving a 15% perf increase.

khellang commented 4 years ago

At this point I'm wondering if it's because of the CPU lacking AVX2 instruction sets...? My work laptop which has the 8th gen core i7 works fine.

You could try running it on your work laptop with COMPlus_EnableAVX2=0 set?

zachsaw commented 4 years ago

It's Friday night where I am and I left the work laptop at work. Won't be able to test until Monday - any chance you can try it on Ubuntu on your side with that env var?

zachsaw commented 4 years ago

OK this is getting really strange. I downloaded Rider and ran it within the IDE - guess what?

Serialize: 174 ms, Deserialize: 212 ms
Running NetCore3JsonTest
Serialize: 86 ms, Deserialize: 121 ms
Serialize: 89 ms, Deserialize: 117 ms
Serialize: 85 ms, Deserialize: 116 ms
Serialize: 85 ms, Deserialize: 119 ms
Serialize: 85 ms, Deserialize: 116 ms
Serialize: 86 ms, Deserialize: 119 ms
Serialize: 86 ms, Deserialize: 115 ms
Serialize: 86 ms, Deserialize: 119 ms
Serialize: 85 ms, Deserialize: 116 ms
Serialize: 87 ms, Deserialize: 116 ms
Serialize: 85 ms, Deserialize: 117 ms
Serialize: 85 ms, Deserialize: 116 ms
Serialize: 85 ms, Deserialize: 120 ms
Serialize: 85 ms, Deserialize: 120 ms
Serialize: 84 ms, Deserialize: 118 ms
NetCore3JsonTest 204 ms/op
Running BinaronTest
BinaronTest 72 ms/op
Running NewtonsoftJsonTest
NewtonsoftJsonTest 248 ms/op

Works fine!

What gives? Copying and pasting Rider's dotnet command line into the terminal (/usr/local/share/dotnet/dotnet /Users/----------/SerializerBattle/bin/Release/netcoreapp3.0/SerializerBattle.dll) gives me the bad result again...!

steveharter commented 4 years ago

Some thoughts:

* Be sure running in release mode
* Can you share non-async results
* Try setting the default buffer size on JsonSerializerOptions to a size large enough to hold your entire Stream

zachsaw commented 4 years ago

Some thoughts:

* Be sure running in release mode

They're all in release mode.

* Can you share non-async results

How do you do non-async stream serialize/deserialize?

* Try setting the default buffer size on JsonSerializerOptions to a size large enough to hold your entire Stream

Just trying to understand your hypothesis here - do you think running under Rider changes JsonSerializerOptions?

steveharter commented 4 years ago

How do you do non-async stream serialize/deserialize?

I was suggesting to just try the non-async, non-Stream APIs to see if they are also slow in your environment (in order to help troubleshoot).

Just trying to understand your hypothesis here - do you think running under Rider changes JsonSerializerOptions?

No. The default is 16K. The deserializer attempts to re-use the same buffer (i.e. does not "drain" the whole Stream ahead of time, unlike other deserializers). If the deserializer encounters a very large property (or an object or collection that uses a custom converter) then it will create a new buffer to contain the value (say 32K) and the data from the previous buffer is copied to the newer large buffer. If that is still not big enough, then another one is created (say 64K) etc. This can be inefficient if you know ahead of time what the largest size is for a given property.
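
As an illustration only (not code from the repro, and assuming its Source, Book and CoreJsonSerializer alias), setting a larger default buffer on the options could look like this; the 1 MB value is an arbitrary example sized to fit the whole payload, not a recommendation:

var options = new JsonSerializerOptions
{
    IgnoreNullValues = true,
    // Default is 16K; make it large enough to hold the entire serialized Book.
    DefaultBufferSize = 1024 * 1024
};

using var stream = new MemoryStream();
await CoreJsonSerializer.SerializeAsync(stream, Source, options);
stream.Position = 0;
var book = await CoreJsonSerializer.DeserializeAsync<Book>(stream, options);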

ahsonkhan commented 4 years ago

Also, given that we can't really repro the results you are seeing reliably (including you, in some environments), trying to figure out what could be causing this one-off slow perf is difficult, so narrowing things down might help (for instance, whether involving I/O in some way is causing this issue).

So, in the environment where you saw slower perf, run just the S.T.Json benchmark (in isolation, to see if that changes anything), and run serialization/deserialization to a UTF-8 byte array instead of a stream (or serialize into a Utf8JsonWriter).
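
For instance, a rough sketch of those two variants (assuming the repro's Source, Book and CoreJsonSerializer alias; illustrative only):

var options = new JsonSerializerOptions {IgnoreNullValues = true};

// Serialize/deserialize via a UTF-8 byte array - no Stream or async machinery involved.
var utf8 = CoreJsonSerializer.SerializeToUtf8Bytes(Source, options);
var book = CoreJsonSerializer.Deserialize<Book>(utf8, options);

// Or serialize straight into a Utf8JsonWriter over an in-memory buffer writer.
var buffer = new System.Buffers.ArrayBufferWriter<byte>();
using (var writer = new Utf8JsonWriter(buffer))
    CoreJsonSerializer.Serialize(writer, Source, options);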

The last thing would be to try and profile the app to see what could be causing the regression and share the traces. @adamsitnik can help with that (depending on the OS).

@zachsaw - can you summarize and clarify what steps you are taking to repro the slow behavior and what environment (specifically which OS, physical or virtual machine, etc.)? A bunch of trials and OSes are mentioned in the thread, so can you pinpoint exactly what your environment is where you saw the slower behavior? That would also help to get others to try and see if what you are seeing can be reproduced somehow.

adamsitnik commented 4 years ago

The last thing would be to try and profile the app to see what could be causing the regression and share the traces. @adamsitnik can help with that (depending on the OS).

On macOS you can use dotnet trace. On Linux: dotnet trace, PerfCollect, or VTune. Here are the docs that describe how to profile: https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-corefx.md

And here you can read more about micro benchmark design guidelines: https://github.com/dotnet/performance/blob/master/docs/microbenchmark-design-guidelines.md

zachsaw commented 4 years ago

I've managed to do more testing with .net core 3.0.101 (as opposed to 3.0.100).

Running a /bin/bash terminal window inside the Rider IDE gives me the following.

Running NetCore3JsonTest
NetCore3JsonTest 270 ms/op
Running BinaronTest
BinaronTest 73 ms/op
Running NewtonsoftJsonTest
NewtonsoftJsonTest 256 ms/op

It would appear that .net core 3.0.101 has a performance regression for S.T.Json.

Running under the OS terminal is still just as slow as .100 for deserialization. The OS terminal must be doing something special - the same /bin/bash under Rider doesn't have the problem. Any ideas what could cause it to run that much slower under the OS terminal window? I even tried running it in the background (&), but it had no effect on performance.

zachsaw commented 4 years ago

MacOS High Sierra Apple iMac "Core i5" 3.2 27-Inch (Late 2012) with 32GB RAM https://everymac.com/systems/apple/imac/specs/imac-core-i5-3.2-27-inch-aluminum-late-2012-specs.html

All benchmarks were run in release mode, with the macOS ones on a physical machine. The Windows and Ubuntu benchmarks were run under VMware Fusion, also in release mode.

I can rule out an active swap file or background processes, since I had both Rider and the OS terminal window open and tested one after the other multiple times.

ahsonkhan commented 4 years ago

I've managed to do more testing with .net core 3.0.101 (as opposed to 3.0.100).

Can you just run the deserializer component of the test for .NET Core within the OS terminal and share the performance results of that (compared to running it within the IDE)?

It would appear that .net core 3.0.101 has a performance regression for S.T.Json.

Which component did you observe regress? Serialize or Deserialize? Or was it both? Are you seeing the regression using BenchmarkDotNet (BDN)?

Regarding the initial OS/environment-specific (6x) regression you saw, which I can't reproduce: outside of capturing a perf trace, I am not sure how else to make forward progress on figuring out where that's coming from.

Can you modify and run your benchmark using BDN? I would be curious to see if you can reproduce the results using that.

zachsawcs commented 4 years ago

The performance regression in 3.0.101 is definitely reproducible - I've tested it on multiple VMs and physical machines and they all exhibit the same problem.

Serialization is now slower by a whopping 40%!

It's interesting to note that the performance of the other two serializers in the benchmark is not affected by the change in .net core version.

zachsaw commented 4 years ago

BDN results confirmed my benchmark results to be accurate (OS terminal):

// * Detailed results *
Benchmark.NetCore3Json: DefaultJob
Runtime = .NET Core 3.0.1 (CoreCLR 4.700.19.51502, CoreFX 4.700.19.51609), X64 RyuJIT; GC = Concurrent Workstation
Mean = 1.2830 s, StdErr = 0.0012 s (0.09%); N = 14, StdDev = 0.0044 s
Min = 1.2720 s, Q1 = 1.2821 s, Median = 1.2829 s, Q3 = 1.2841 s, Max = 1.2904 s
IQR = 0.0021 s, LowerFence = 1.2789 s, UpperFence = 1.2872 s
ConfidenceInterval = [1.2780 s; 1.2879 s] (CI 99.9%), Margin = 0.0050 s (0.39% of Mean)
Skewness = -0.54, Kurtosis = 3.66, MValue = 2
-------------------- Histogram --------------------
[1.270 s ; 1.292 s) | @@@@@@@@@@@@@@
---------------------------------------------------

Benchmark.Binaron: DefaultJob
Runtime = .NET Core 3.0.1 (CoreCLR 4.700.19.51502, CoreFX 4.700.19.51609), X64 RyuJIT; GC = Concurrent Workstation
Mean = 71.2277 ms, StdErr = 0.1062 ms (0.15%); N = 13, StdDev = 0.3828 ms
Min = 70.7838 ms, Q1 = 70.9601 ms, Median = 71.1056 ms, Q3 = 71.4236 ms, Max = 72.2687 ms
IQR = 0.4635 ms, LowerFence = 70.2648 ms, UpperFence = 72.1189 ms
ConfidenceInterval = [70.7692 ms; 71.6861 ms] (CI 99.9%), Margin = 0.4584 ms (0.64% of Mean)
Skewness = 1.38, Kurtosis = 4.46, MValue = 2
-------------------- Histogram --------------------
[70.641 ms ; 72.411 ms) | @@@@@@@@@@@@@
---------------------------------------------------

Benchmark.NewtonsoftJson: DefaultJob
Runtime = .NET Core 3.0.1 (CoreCLR 4.700.19.51502, CoreFX 4.700.19.51609), X64 RyuJIT; GC = Concurrent Workstation
Mean = 235.2109 ms, StdErr = 0.1315 ms (0.06%); N = 13, StdDev = 0.4742 ms
Min = 234.5811 ms, Q1 = 234.8075 ms, Median = 235.0847 ms, Q3 = 235.7098 ms, Max = 235.9564 ms
IQR = 0.9023 ms, LowerFence = 233.4541 ms, UpperFence = 237.0632 ms
ConfidenceInterval = [234.6430 ms; 235.7787 ms] (CI 99.9%), Margin = 0.5679 ms (0.24% of Mean)
Skewness = 0.26, Kurtosis = 1.4, MValue = 2
-------------------- Histogram --------------------
[234.405 ms ; 236.133 ms) | @@@@@@@@@@@@@
---------------------------------------------------

// * Summary *

BenchmarkDotNet=v0.12.0, OS=macOS High Sierra 10.13.6 (17G9016) [Darwin 17.7.0]
Intel Core i5-3470 CPU 3.20GHz (Max: 3.21GHz) (Ivy Bridge), 1 CPU, 4 logical and 4 physical cores
.NET Core SDK=3.0.101
  [Host]     : .NET Core 3.0.1 (CoreCLR 4.700.19.51502, CoreFX 4.700.19.51609), X64 RyuJIT
  DefaultJob : .NET Core 3.0.1 (CoreCLR 4.700.19.51502, CoreFX 4.700.19.51609), X64 RyuJIT

|         Method |        Mean |    Error |   StdDev |
|--------------- |------------:|---------:|---------:|
|   NetCore3Json | 1,282.96 ms | 4.990 ms | 4.423 ms |
|        Binaron |    71.23 ms | 0.458 ms | 0.383 ms |
| NewtonsoftJson |   235.21 ms | 0.568 ms | 0.474 ms |

// * Hints *
Outliers
  Benchmark.NetCore3Json: Default   -> 1 outlier  was  removed, 2 outliers were detected (1.27 s, 1.29 s)
  Benchmark.Binaron: Default        -> 2 outliers were removed (72.45 ms, 73.15 ms)
  Benchmark.NewtonsoftJson: Default -> 2 outliers were removed (237.51 ms, 238.33 ms)

// * Legends *
  Mean   : Arithmetic mean of all measurements
  Error  : Half of 99.9% confidence interval
  StdDev : Standard deviation of all measurements
  1 ms   : 1 Millisecond (0.001 sec)

// ***** BenchmarkRunner: End *****
// ** Remained 0 benchmark(s) to run **
Run time: 00:00:51 (51.26 sec), executed benchmarks: 3

Global total time: 00:00:56 (56.27 sec), executed benchmarks: 3
zachsaw commented 4 years ago

This is getting weirder: Running S.T.Json under BDN inside Rider shows no performance regression. Take away BDN and the problem occurs.

The modified code now looks like this:

#define BDN

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Binaron.Serializer;
using Newtonsoft.Json;
using JsonSerializer = Newtonsoft.Json.JsonSerializer;
using CoreJsonSerializer = System.Text.Json.JsonSerializer;

namespace SerializerBattle
{
    public class Program
    {
        private static readonly Book Source = Book.Create();
        private static readonly Func<Task>[] Tests = {NetCore3JsonTest, BinaronTest, NewtonsoftJsonTest};

        public static async Task Main()
        {
#if BDN
            var summary = BenchmarkRunner.Run<Benchmark>();
#else
            const int loop = 15;

            // warm-up
            foreach (var test in Tests)
                await test();

            await Task.Yield();
            await Tester(loop);
#endif
        }

        public class Benchmark
        {
            [Benchmark]
            public async Task NetCore3Json() => await NetCore3JsonTest();

            [Benchmark]
            public async Task Binaron() => await BinaronTest();

            [Benchmark]
            public async Task NewtonsoftJson() => await NewtonsoftJsonTest();
        }

        private static async Task Tester(int loop)
        {
            foreach (var test in Tests)
            {
                Console.WriteLine($"Running {test.Method.Name}");
                var sw = new Stopwatch();
                sw.Start();

                for (var j = 0; j < loop; j++)
                    await test();

                sw.Stop();
                Console.WriteLine($"{test.Method.Name} {sw.ElapsedMilliseconds / loop} ms/op");
            }
        }

        private static Task BinaronTest()
        {
            var serializerOptions = new SerializerOptions {SkipNullValues = true};
            using var stream = new MemoryStream();
            for (var i = 0; i < 50; i++)
            {
                BinaronConvert.Serialize(Source, stream, serializerOptions);
                stream.Position = 0;
                var book = BinaronConvert.Deserialize<Book>(stream);
                stream.Position = 0;
                Trace.Assert(book.Title != null);
            }

            return Task.CompletedTask;
        }

        private static Task NewtonsoftJsonTest()
        {
            var ser = new JsonSerializer {NullValueHandling = NullValueHandling.Ignore};
            using var stream = new MemoryStream();
            for (var i = 0; i < 50; i++)
            {
                {
                    using var writer = new StreamWriter(stream, leaveOpen: true);
                    using var jsonWriter = new JsonTextWriter(writer);
                    ser.Serialize(jsonWriter, Source);
                }
                stream.Position = 0;
                {
                    using var reader = new StreamReader(stream, leaveOpen: true);
                    using var jsonReader = new JsonTextReader(reader);
                    var book = ser.Deserialize<Book>(jsonReader);
                    Trace.Assert(book.Title != null);
                }
                stream.Position = 0;
            }

            return Task.CompletedTask;
        }

        private static async Task NetCore3JsonTest()
        {
            var jsonSerializerOptions = new JsonSerializerOptions {IgnoreNullValues = true};
            await using var stream = new MemoryStream();
            for (var i = 0; i < 50; i++)
            {
                await CoreJsonSerializer.SerializeAsync(stream, Source, jsonSerializerOptions);
                stream.Position = 0;
                var book = await CoreJsonSerializer.DeserializeAsync<Book>(stream);
                stream.Position = 0;
                Trace.Assert(book.Title != null);
            }
        }

        public class Book
        {
            public int Id { get; set; }

            public string Title { get; set; }
            public int AuthorId { get; set; }
            public List<Page> Pages { get; set; }
            public DateTime? Published { get; set; }
            public byte[] Cover { get; set; }
            public HashSet<DateTime> Changes { get; set; }
            public Dictionary<string, object> Metadata { get; set; }
            public Genre[] Genres { get; set; }
            public double[] MeanRankings { get; set; }

            public static Book Create()
            {
                var book = new Book
                {
                    Id = 10,
                    AuthorId = 234,
                    Changes = new HashSet<DateTime> {DateTime.Now, DateTime.Now - TimeSpan.FromDays(1), DateTime.Now - TimeSpan.FromDays(100)},
                    Title = "Blah blah",
                    Genres = new[] {Genre.Action, Genre.Comedy},
                    Pages = new List<Page>(),
                    Cover = new[] {(byte) 123, (byte) 1, (byte) 2, (byte) 3, (byte) 66},
                    Metadata = new Dictionary<string, object>
                    {
                        {"Matrix", "Neo"},
                        {"SEO", "/questions/8157636/can-json-net-serialize-deserialize-to-from-a-stream"},
                        {"QuestionTag", "C#"},
                        {"Answers", "C# Type system is not very well designed"},
                        {"Number of answers", 3},
                        {
                            "Comment",
                            "Binaron.Serializer is a serializer for modern programming languages. First class languages will be able to support the Binary Object Notation afforded by this library."
                        }
                    },
                    MeanRankings = new double[50]
                };

                var rnd = new Random(5);
                for (var i = 0; i < 150; i++)
                    book.Pages.Add(CreateNewPage(rnd));
                for (var i = 0; i < book.MeanRankings.Length; i++)
                    book.MeanRankings[i] = rnd.NextDouble();
                return book;
            }

            private static Page CreateNewPage(Random rnd)
            {
                return new Page
                {
                    Identity = Guid.NewGuid(),
                    Text =
                        "This is pretty amazing, thank you. It allowed me to create a generic interface for calling into a bunch of pre-existing methods each written for specific types, that could not (or with great difficulty at least) be re-written generically. It was starting to look like I would have to do some horrible if (type == typeof(int)) and then cast back to the generic type w/ extra boxing / unboxing return (T)(object)result;(because the type is only logically known, not statically known)",
                    Notes = new List<Notes>
                    {
                        new Notes
                        {
                            Footnote = new Footnote {Note = $"{rnd.Next(int.MinValue, int.MaxValue)} This is a footer of a book on a page"},
                            Headnote = new Headnote {Note = $"{rnd.Next(int.MinValue, int.MaxValue)} This is a header of a book on a page"}
                        },
                        new Notes
                        {
                            Footnote = new Footnote {Note = $"{rnd.Next(int.MinValue, int.MaxValue)} This is a footer of a book on a page"},
                            Headnote = new Headnote {Note = $"{rnd.Next(int.MinValue, int.MaxValue)} This is a header of a book on a page"}
                        },
                        new Notes
                        {
                            Footnote = new Footnote {Note = $"{rnd.Next(int.MinValue, int.MaxValue)} This is a footer of a book on a page"},
                            Headnote = new Headnote {Note = $"{rnd.Next(int.MinValue, int.MaxValue)} This is a header of a book on a page"}
                        },
                        new Notes
                        {
                            Footnote = new Footnote {Note = $"{rnd.Next(int.MinValue, int.MaxValue)} This is a footer of a book on a page"},
                            Headnote = new Headnote {Note = $"{rnd.Next(int.MinValue, int.MaxValue)} This is a header of a book on a page"}
                        },
                        new Notes
                        {
                            Footnote = new Footnote {Note = $"{rnd.Next(int.MinValue, int.MaxValue)} This is a footer of a book on a page"},
                            Headnote = new Headnote {Note = $"{rnd.Next(int.MinValue, int.MaxValue)} This is a header of a book on a page"}
                        }
                    }
                };
            }
        }

        public enum Genre
        {
            Action,
            Romance,
            Comedy,
            SciFi
        }

        public class Page
        {
            public string Text { get; set; }
            public List<Notes> Notes { get; set; }
            public Guid Identity { get; set; }
        }

        public class Footnote
        {
            public string Note { get; set; }
            public string WrittenBy { get; set; }
            public DateTime CreateadAt { get; set; }
            public long Index { get; set; }
        }

        public class Headnote
        {
            public string Note { get; set; }
            public string WrittenBy { get; set; }
            public DateTime? ModifiedAt { get; set; }
        }

        public class Notes
        {
            public Headnote Headnote { get; set; }
            public Footnote Footnote { get; set; }
        }
    }
}
zachsawcs commented 4 years ago

To help others repro this issue, I installed the latest macOS Mojave on a VM (VMware, via a workaround to support macOS guests). This was just out of curiosity, to see if the issue occurs on a modern CPU (8th gen i7) + macOS inside a VM (Win10 host). Again, this was tested with dotnet run -c Release in the macOS terminal window.

Running BinaronTest
BinaronTest 62 ms/op
Running NewtonsoftJsonTest
NewtonsoftJsonTest 206 ms/op
Running NetCore3JsonTest
NetCore3JsonTest 1214 ms/op
ahsonkhan commented 4 years ago

Can you share your published executable? And can you split up the serialization and deserialization calls and measure them separately?

Also, did you get a chance to capture a perf trace to see where all that time was being spent?

zachsawcs commented 4 years ago

I did a dotnet run -c Release - I didn't publish it. I'll do another run with serialization and deserialization split, but I don't expect that to be different from the results already captured on a physical macOS machine with a 3rd gen i5.

I haven't done a perf trace yet. It's too big an undertaking for the time I have to take this issue further.

ahsonkhan commented 4 years ago

I didn't publish it.

Can you please publish a standalone runnable app and share it so I can try again to run the exact bits on my Mac?

I have tried building your perf test myself on Windows/Mac physical machines and can't repro the numbers you are seeing.

zachsawcs commented 4 years ago

Can you try testing with it running on Mojave under VMware? I'll try to compile a standalone later when I get some time.

ahsonkhan commented 4 years ago

I'll try to compile a standalone later when I get some time.

@zachsawcs / @zachsaw did you get a chance to create and share the standalone app that shows the performance issue?

zachsaw commented 4 years ago

https://www.dropbox.com/s/n7fsxns58a2zqz2/SerializerBattle.zip?dl=0

layomia commented 4 years ago

We've made various performance improvements in 5.0 - https://github.com/dotnet/runtime/pulls?q=is%3Apr+label%3Atenet-performance+label%3Aarea-System.Text.Json+is%3Aclosed. Please feel free to reopen if you encounter a reproducible performance issue.

zachsaw commented 4 years ago

Just to confirm, this is now fixed in 5.0. Tested with preview 8. System.Text.Json is still slower than Newtonsoft, but at least it is no longer > 5x slower.

Just out of curiosity - what was the problem with .net 3.0 and .net 3.1?

NetCore.Json Test
=================
Warm-up
Running
Result: 29.373 ms/op

Newtonsoft.Json Test
====================
Warm-up
Running
Result: 28.373 ms/op

Binaron.Serializer Test
=======================
Warm-up
Running
Result: 6.827 ms/op
zachsaw commented 4 years ago

Interestingly, 5.0 seems to have improved the performance of Binaron.Serializer by 15% in this particular test whereas Newtonsoft.Json remained the same. Loving .net 5.0 already!

steveharter commented 4 years ago

Just to confirm, this is now fixed in 5.0. Tested with preview 8. System.Text.Json is still slower than Newtonsoft, but at least it is no longer > 5x slower.

@zachsaw I decompiled your tests, and the likely reason for the slowness is the use of polymorphism via Dictionary<string, object> for the "Book.Metadata" property. Unlike Newtonsoft, STJ doesn't "guess" at what the object type maps to and instead creates a JsonElement to represent it. This is slow because there is an additional allocation per object instance, as well as other JsonElement overhead. This was made faster in 3.1, and future optimizations are being tracked in https://github.com/dotnet/runtime/issues/38637.

The "work-around" if you want to call it that is to use a strongly-typed dictionary such as Dictionary<string, string> or create a custom converter that emulates Newtonsoft's behavior. In a simple local test, adding a single polymorphic dictionary similar to you have it makes performance almost ~2x as slow in STJ.