go-json-experiment / json

Experimental implementation of a proposed v2 encoding/json package
BSD 3-Clause "New" or "Revised" License

Some questions about benchmarks #6

Open AsterDY opened 2 years ago

AsterDY commented 2 years ago

I want to check the benchmark code behind https://github.com/go-json-experiment/json#performance, but the repo you mentioned (https://github.com/dsnet/jsonbench) is missing. Where can I find the code? BTW, you mentioned SonicJSON doesn't support sorting the keys for a map[string]any. In fact, there is an option, encoder.SortKeys or sonic.Config.SortKeys, to support this, so maybe you can use it in your benchmarks.

AsterDY commented 2 years ago

Actually, I used your benchmark's data and structures to test these libraries, and the results show that sonic performs better than your charts suggest:

goos: linux
goarch: amd64
pkg: github.com/bytedance/sonic/generic_test
cpu: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
BenchmarkUnmarshalConcrete
BenchmarkUnmarshalConcrete/CanadaGeometry_Std
BenchmarkUnmarshalConcrete/CanadaGeometry_Std-32                      219           5375877 ns/op          50.30 MB/s          778362 B/op           16080 allocs/op
BenchmarkUnmarshalConcrete/CanadaGeometry_StdV2
BenchmarkUnmarshalConcrete/CanadaGeometry_StdV2-32                    274           4274500 ns/op          63.26 MB/s          748171 B/op           17562 allocs/op
BenchmarkUnmarshalConcrete/CanadaGeometry_Sonic
BenchmarkUnmarshalConcrete/CanadaGeometry_Sonic-32                   1245            858658 ns/op         314.91 MB/s          675172 B/op             567 allocs/op
BenchmarkUnmarshalConcrete/CanadaGeometry_SonicStd
BenchmarkUnmarshalConcrete/CanadaGeometry_SonicStd-32                1288            857684 ns/op         315.27 MB/s          681274 B/op             571 allocs/op
BenchmarkUnmarshalConcrete/CanadaGeometry_GoJson
BenchmarkUnmarshalConcrete/CanadaGeometry_GoJson-32                   568           2002579 ns/op         135.03 MB/s          536506 B/op            8119 allocs/op
BenchmarkUnmarshalConcrete/CanadaGeometry_JsonIter
BenchmarkUnmarshalConcrete/CanadaGeometry_JsonIter-32                 304           3870887 ns/op          69.86 MB/s          979172 B/op           26751 allocs/op
BenchmarkUnmarshalConcrete/CanadaGeometry_JsonIterStd
BenchmarkUnmarshalConcrete/CanadaGeometry_JsonIterStd-32              304           3798962 ns/op          71.18 MB/s          979173 B/op           26751 allocs/op
BenchmarkUnmarshalConcrete/CitmCatalog_Std
BenchmarkUnmarshalConcrete/CitmCatalog_Std-32                          61          20887931 ns/op          82.69 MB/s         1600162 B/op           33323 allocs/op
BenchmarkUnmarshalConcrete/CitmCatalog_StdV2
BenchmarkUnmarshalConcrete/CitmCatalog_StdV2-32                       160           7873757 ns/op         219.36 MB/s         1158364 B/op           13926 allocs/op
BenchmarkUnmarshalConcrete/CitmCatalog_Sonic
BenchmarkUnmarshalConcrete/CitmCatalog_Sonic-32                       217           5227790 ns/op         330.39 MB/s         4855946 B/op           10790 allocs/op
BenchmarkUnmarshalConcrete/CitmCatalog_SonicStd
BenchmarkUnmarshalConcrete/CitmCatalog_SonicStd-32                    226           5451267 ns/op         316.84 MB/s         4921651 B/op           11525 allocs/op
BenchmarkUnmarshalConcrete/CitmCatalog_GoJson
BenchmarkUnmarshalConcrete/CitmCatalog_GoJson-32                      247           4268788 ns/op         404.61 MB/s         2569721 B/op           14589 allocs/op
BenchmarkUnmarshalConcrete/CitmCatalog_JsonIter
BenchmarkUnmarshalConcrete/CitmCatalog_JsonIter-32                    248           4797916 ns/op         359.99 MB/s         1276322 B/op           33896 allocs/op
BenchmarkUnmarshalConcrete/CitmCatalog_JsonIterStd
BenchmarkUnmarshalConcrete/CitmCatalog_JsonIterStd-32                 246           4731282 ns/op         365.06 MB/s         1276307 B/op           33896 allocs/op
BenchmarkUnmarshalConcrete/GolangSource_Std
BenchmarkUnmarshalConcrete/GolangSource_Std-32                         36          33554352 ns/op          57.83 MB/s         5367024 B/op           79865 allocs/op
BenchmarkUnmarshalConcrete/GolangSource_StdV2
BenchmarkUnmarshalConcrete/GolangSource_StdV2-32                       70          17359381 ns/op         111.78 MB/s         3622182 B/op           28939 allocs/op
BenchmarkUnmarshalConcrete/GolangSource_Sonic
BenchmarkUnmarshalConcrete/GolangSource_Sonic-32                       92          13088101 ns/op         148.26 MB/s        20684804 B/op           13060 allocs/op
BenchmarkUnmarshalConcrete/GolangSource_SonicStd
BenchmarkUnmarshalConcrete/GolangSource_SonicStd-32                    84          14285152 ns/op         135.84 MB/s        21156347 B/op           25867 allocs/op
BenchmarkUnmarshalConcrete/GolangSource_GoJson
BenchmarkUnmarshalConcrete/GolangSource_GoJson-32                     152           7498697 ns/op         258.77 MB/s         4062506 B/op           13509 allocs/op
BenchmarkUnmarshalConcrete/GolangSource_JsonIter
BenchmarkMarshalInterface/StringEscaped_Std
BenchmarkMarshalInterface/StringEscaped_Std-32                               18004             65761 ns/op         639.62 MB/s           24714 B/op             126 allocs/op
BenchmarkMarshalInterface/StringEscaped_StdV2
BenchmarkMarshalInterface/StringEscaped_StdV2-32                             23498             49735 ns/op         845.72 MB/s           18441 B/op               2 allocs/op
BenchmarkMarshalInterface/StringEscaped_Sonic
BenchmarkMarshalInterface/StringEscaped_Sonic-32                            102657             10178 ns/op        4132.78 MB/s           19588 B/op               4 allocs/op
BenchmarkMarshalInterface/StringEscaped_SonicStd
BenchmarkMarshalInterface/StringEscaped_SonicStd-32                          44410             25827 ns/op        1628.59 MB/s           23112 B/op               4 allocs/op
BenchmarkMarshalInterface/StringEscaped_GoJson
BenchmarkMarshalInterface/StringEscaped_GoJson-32                            19083             61609 ns/op         682.72 MB/s           18454 B/op               1 allocs/op
BenchmarkMarshalInterface/StringEscaped_JsonIter
BenchmarkMarshalInterface/StringEscaped_JsonIter-32                          23515             49366 ns/op         852.05 MB/s           18569 B/op               5 allocs/op
BenchmarkMarshalInterface/StringEscaped_JsonIterStd
BenchmarkMarshalInterface/StringEscaped_JsonIterStd-32                       17902             66072 ns/op         636.61 MB/s           24950 B/op              73 allocs/op
BenchmarkMarshalInterface/StringUnicode_Std
BenchmarkMarshalInterface/StringUnicode_Std-32                               17967             65988 ns/op         274.66 MB/s           24714 B/op             126 allocs/op
BenchmarkMarshalInterface/StringUnicode_StdV2
BenchmarkMarshalInterface/StringUnicode_StdV2-32                             23527             49762 ns/op         364.22 MB/s           18441 B/op               2 allocs/op
BenchmarkMarshalInterface/StringUnicode_Sonic
BenchmarkMarshalInterface/StringUnicode_Sonic-32                            104732             10122 ns/op        1790.55 MB/s           19627 B/op               4 allocs/op
BenchmarkMarshalInterface/StringUnicode_SonicStd
BenchmarkMarshalInterface/StringUnicode_SonicStd-32                          44007             26454 ns/op         685.13 MB/s           23387 B/op               4 allocs/op
BenchmarkMarshalInterface/StringUnicode_GoJson
BenchmarkMarshalInterface/StringUnicode_GoJson-32                            18782             61900 ns/op         292.79 MB/s           18454 B/op               1 allocs/op
BenchmarkMarshalInterface/StringUnicode_JsonIter
BenchmarkMarshalInterface/StringUnicode_JsonIter-32                          23299             50102 ns/op         361.74 MB/s           18569 B/op               5 allocs/op
BenchmarkMarshalInterface/StringUnicode_JsonIterStd
BenchmarkMarshalInterface/StringUnicode_JsonIterStd-32                       17664             67019 ns/op         270.43 MB/s           24963 B/op              73 allocs/op
BenchmarkMarshalInterface/SyntheaFhir_Std
BenchmarkMarshalInterface/SyntheaFhir_Std-32                                    50          20271312 ns/op          99.08 MB/s         8502030 B/op          147060 allocs/op
BenchmarkMarshalInterface/SyntheaFhir_StdV2
BenchmarkMarshalInterface/SyntheaFhir_StdV2-32                                 200           5832199 ns/op         344.38 MB/s         1147011 B/op               2 allocs/op
BenchmarkMarshalInterface/SyntheaFhir_Sonic
BenchmarkMarshalInterface/SyntheaFhir_Sonic-32                                 219           5350969 ns/op         375.35 MB/s         1204984 B/op               4 allocs/op
BenchmarkMarshalInterface/SyntheaFhir_SonicStd
BenchmarkMarshalInterface/SyntheaFhir_SonicStd-32                              193           6094238 ns/op         329.57 MB/s         1466766 B/op               4 allocs/op
BenchmarkMarshalInterface/SyntheaFhir_GoJson
BenchmarkMarshalInterface/SyntheaFhir_GoJson-32                                 99          11450354 ns/op         175.41 MB/s         2538602 B/op            6500 allocs/op
BenchmarkMarshalInterface/SyntheaFhir_JsonIter
BenchmarkMarshalInterface/SyntheaFhir_JsonIter-32                              121           9762047 ns/op         205.75 MB/s         3079609 B/op           45293 allocs/op
BenchmarkMarshalInterface/SyntheaFhir_JsonIterStd
BenchmarkMarshalInterface/SyntheaFhir_JsonIterStd-32                            49          20932270 ns/op          95.95 MB/s         7707134 B/op          134185 allocs/op
BenchmarkMarshalInterface/TwitterStatus_Std
BenchmarkMarshalInterface/TwitterStatus_Std-32                                 187           6245363 ns/op         101.12 MB/s         2081542 B/op           32319 allocs/op
BenchmarkMarshalInterface/TwitterStatus_StdV2
BenchmarkMarshalInterface/TwitterStatus_StdV2-32                               566           2002750 ns/op         315.32 MB/s          466995 B/op               2 allocs/op
BenchmarkMarshalInterface/TwitterStatus_Sonic
BenchmarkMarshalInterface/TwitterStatus_Sonic-32                               735           1477593 ns/op         427.39 MB/s          490061 B/op               4 allocs/op
BenchmarkMarshalInterface/TwitterStatus_SonicStd
BenchmarkMarshalInterface/TwitterStatus_SonicStd-32                            523           2109534 ns/op         299.36 MB/s          579570 B/op               4 allocs/op
BenchmarkMarshalInterface/TwitterStatus_GoJson
BenchmarkMarshalInterface/TwitterStatus_GoJson-32                              279           4115291 ns/op         153.46 MB/s         1047545 B/op             542 allocs/op
BenchmarkMarshalInterface/TwitterStatus_JsonIter
BenchmarkMarshalInterface/TwitterStatus_JsonIter-32                            427           2672293 ns/op         236.32 MB/s          636993 B/op            3794 allocs/op
BenchmarkMarshalInterface/TwitterStatus_JsonIterStd
BenchmarkMarshalInterface/TwitterStatus_JsonIterStd-32                         186           6262417 ns/op         100.84 MB/s         2471316 B/op           22557 allocs/op
PASS
ok          github.com/bytedance/sonic/generic_test        320.691s

I guess it is because I added some warm-up code before testing: sonic's JIT feature slows down the first serialization/deserialization, since it needs some time to compile the codec.

dsnet commented 2 years ago

Hi, I apologize for the delay. The benchmark code needed to be cleaned up first.

You can see it here: https://github.com/go-json-experiment/jsonbench

BTW, you mentioned SonicJSON doesn't support sorting the keys for a map[string]any. In fact, there is an option, encoder.SortKeys or sonic.Config.SortKeys, to support this, so maybe you can use it in your benchmarks.

Benchmark comparisons were performed based on the default behavior of Marshal and Unmarshal, since that is what the vast majority of users will be using and it better reflects what the end user will experience in terms of performance. No attempt was made to select a set of options for each implementation that would give them some semblance of similar behavior.

I guess it is because I added some warm-up code before testing: sonic's JIT feature slows down the first serialization/deserialization, since it needs some time to compile the codec.

That's somewhat unfair to the other implementations. Every package has some degree of "warm-up" logic. If we want to be more fair, we should perform a single Marshal or Unmarshal (for every implementation) before the main loop of benchmarks is run. However, each benchmark runs the loop multiple times, so the cost of initialization is usually amortized away.

dsnet commented 2 years ago

I'm using an AMD Ryzen 9 5900X, and you can see my benchmark results here: https://raw.githubusercontent.com/go-json-experiment/jsonbench/master/results/results.log

mvdan commented 2 years ago

That's somewhat unfair to the other implementations. Every package has some degree of "warm-up" logic. If we want to be more fair, we should perform a single Marshal or Unmarshal (for every implementation) before the main loop of benchmarks is run. However, each benchmark runs the loop multiple times, so the cost of initialization is usually amortized away.

I would argue that we don't want to hide or amortize the warm-up cost, because it will matter for many real applications. Take go list -json, for example - users call that as a new Go process every time and it usually only needs to encode a few JSON values, so a high warm-up cost to make each encode slightly faster might even make the whole program slower on average. Some Go programs are long-lived servers, but many others aren't.

AsterDY commented 2 years ago

Thanks for the reply. The warm-up is necessary for sonic, and for the sake of fairness I added it to every test, including those for the other libraries. In practice, for most JIT-based applications the JIT compilation is triggered only once, or only within the first few minutes; the rest of the running time is non-JIT work, so it has much less influence on long-running apps. Your benchmarks, however, are short-running (no more than 10 seconds each), which is not representative of a production environment.

ChrisHines commented 2 years ago

Would it be fair to publish benchmarks for both scenarios so that people can consider them in the context of their use cases?

dsnet commented 2 years ago

We can always split benchmark charts across different dimensions to make them more accurate, but at the cost of making them harder to comprehend. At present, we already have 6 charts. Splitting across "startup performance" versus "steady-state performance" will change this into 12 charts. Not to mention, we haven't thrown in other reasonable dimensions to split on (e.g., different architectures). So if we split upon arm64 vs amd64, then that becomes 24 charts.

Given that:

  1. Sonic uniquely suffers the most from high startup cost,
  2. our benchmarks already give Sonic a favorable rating performance-wise, and
  3. the intention of our benchmarks is to primarily show that JSONv2 is comparable to or much better than JSONv1,

I don't think we need to make this distinction between startup cost and steady-state cost. And if we're not going to distinguish between the two, including startup cost in the benchmark for all implementations is the fairest thing to do.

mvdan commented 2 years ago

I'm okay with that - I just don't want us to only show steady-state cost performance, because that is half of the equation :)