ruby / json

JSON implementation for Ruby
https://ruby.github.io/json

Use batch APIs to create Array and Hash objects #678

Closed by byroot 3 weeks ago

byroot commented 3 weeks ago

Naively appending elements to an RArray or RHash is inefficient because it may cause multiple reallocations and rehashing.

So it's preferable to accumulate all the elements on a stack, and then use batch APIs to directly create right-sized containers.
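
To illustrate the idea, here is a minimal Ruby sketch of the stack-then-batch approach. It is not the code from this PR: the `build` method and the event names are hypothetical, and the real change lives in the C extension, where the batch calls would presumably be Ruby C API functions such as `rb_ary_new_from_values` and `rb_hash_bulk_insert` (an assumption, since the description above does not name them). The sketch only shows the shape of the technique: values from every nesting level go onto one shared stack, and each container is created in a single step once its closing delimiter is seen.

```ruby
# Hypothetical sketch: build nested containers from a shared value stack.
# `events` mimics what a parser might emit; none of these names come from
# the json gem itself.
def build(events)
  stack = [] # shared value stack for every nesting level
  marks = [] # stack size at the point each open container started

  events.each do |type, value|
    case type
    when :open_array, :open_hash
      marks << stack.size
    when :value
      stack << value
    when :close_array
      count = stack.size - marks.pop
      # One batch operation: take all elements of this array at once,
      # so the resulting Array is created with its final size.
      stack << stack.pop(count)
    when :close_hash
      count = stack.size - marks.pop
      flat = stack.pop(count) # [key1, value1, key2, value2, ...]
      # Build the Hash from all pairs in one go rather than inserting
      # keys one by one while parsing.
      stack << flat.each_slice(2).to_h
    end
  end

  stack.pop
end

# Events for the document {"a": [1, 2], "b": []}
events = [
  [:open_hash],
  [:value, "a"], [:open_array], [:value, 1], [:value, 2], [:close_array],
  [:value, "b"], [:open_array], [:close_array],
  [:close_hash]
]
result = build(events)
# result is equal to { "a" => [1, 2], "b" => [] }
```

The trade-off is that elements are written twice (once to the stack, once into the container), but the stack is reused across the whole document, so the containers themselves never need to grow or rehash.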

TODO:

Before:

== Parsing activitypub.json (58160 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   779.000 i/100ms
                  oj   799.000 i/100ms
          Oj::Parser   953.000 i/100ms
           rapidjson   630.000 i/100ms
Calculating -------------------------------------
                json      7.989k (± 0.7%) i/s  (125.17 μs/i) -     40.508k in   5.070571s
                  oj      7.931k (± 1.8%) i/s  (126.09 μs/i) -     39.950k in   5.039171s
          Oj::Parser      9.624k (± 0.7%) i/s  (103.91 μs/i) -     48.603k in   5.050694s
           rapidjson      6.287k (± 0.3%) i/s  (159.05 μs/i) -     31.500k in   5.010181s

Comparison:
                json:     7989.2 i/s
          Oj::Parser:     9623.6 i/s - 1.20x  faster
                  oj:     7930.8 i/s - same-ish: difference falls within error
           rapidjson:     6287.3 i/s - 1.27x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    66.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    78.000 i/100ms
           rapidjson    55.000 i/100ms
Calculating -------------------------------------
                json    673.530 (± 0.7%) i/s    (1.48 ms/i) -      3.432k in   5.095837s
                  oj    620.473 (± 0.5%) i/s    (1.61 ms/i) -      3.162k in   5.096259s
          Oj::Parser    767.687 (± 0.9%) i/s    (1.30 ms/i) -      3.900k in   5.080601s
           rapidjson    553.048 (± 1.1%) i/s    (1.81 ms/i) -      2.805k in   5.072525s

Comparison:
                json:      673.5 i/s
          Oj::Parser:      767.7 i/s - 1.14x  faster
                  oj:      620.5 i/s - 1.09x  slower
           rapidjson:      553.0 i/s - 1.22x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    38.000 i/100ms
                  oj    34.000 i/100ms
          Oj::Parser    47.000 i/100ms
           rapidjson    38.000 i/100ms
Calculating -------------------------------------
                json    381.312 (± 0.5%) i/s    (2.62 ms/i) -      1.938k in   5.082614s
                  oj    328.735 (± 2.1%) i/s    (3.04 ms/i) -      1.666k in   5.070407s
          Oj::Parser    458.938 (± 0.9%) i/s    (2.18 ms/i) -      2.303k in   5.018529s
           rapidjson    376.744 (± 1.3%) i/s    (2.65 ms/i) -      1.900k in   5.044113s

Comparison:
                json:      381.3 i/s
          Oj::Parser:      458.9 i/s - 1.20x  faster
           rapidjson:      376.7 i/s - same-ish: difference falls within error
                  oj:      328.7 i/s - 1.16x  slower

After:

== Parsing activitypub.json (58160 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   960.000 i/100ms
                  oj   796.000 i/100ms
          Oj::Parser   969.000 i/100ms
           rapidjson   636.000 i/100ms
Calculating -------------------------------------
                json      8.957k (± 0.5%) i/s  (111.65 μs/i) -     45.120k in   5.037777s
                  oj      7.966k (± 0.5%) i/s  (125.53 μs/i) -     40.596k in   5.096207s
          Oj::Parser      9.579k (± 0.3%) i/s  (104.39 μs/i) -     48.450k in   5.057822s
           rapidjson      6.261k (± 8.9%) i/s  (159.73 μs/i) -     31.800k in   5.182342s

Comparison:
                json:     8956.5 i/s
          Oj::Parser:     9579.3 i/s - 1.07x  faster
                  oj:     7966.2 i/s - 1.12x  slower
           rapidjson:     6260.6 i/s - 1.43x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    82.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    77.000 i/100ms
           rapidjson    55.000 i/100ms
Calculating -------------------------------------
                json    803.998 (± 0.6%) i/s    (1.24 ms/i) -      4.100k in   5.099692s
                  oj    608.292 (± 0.8%) i/s    (1.64 ms/i) -      3.100k in   5.096566s
          Oj::Parser    760.206 (± 0.5%) i/s    (1.32 ms/i) -      3.850k in   5.064529s
           rapidjson    549.562 (± 0.5%) i/s    (1.82 ms/i) -      2.750k in   5.004166s

Comparison:
                json:      804.0 i/s
          Oj::Parser:      760.2 i/s - 1.06x  slower
                  oj:      608.3 i/s - 1.32x  slower
           rapidjson:      549.6 i/s - 1.46x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    43.000 i/100ms
                  oj    34.000 i/100ms
          Oj::Parser    47.000 i/100ms
           rapidjson    36.000 i/100ms
Calculating -------------------------------------
                json    447.336 (± 0.9%) i/s    (2.24 ms/i) -      2.279k in   5.094945s
                  oj    336.266 (± 2.4%) i/s    (2.97 ms/i) -      1.700k in   5.058625s
          Oj::Parser    466.559 (± 1.3%) i/s    (2.14 ms/i) -      2.350k in   5.037637s
           rapidjson    392.039 (± 0.8%) i/s    (2.55 ms/i) -      1.980k in   5.050826s

Comparison:
                json:      447.3 i/s
          Oj::Parser:      466.6 i/s - 1.04x  faster
           rapidjson:      392.0 i/s - 1.14x  slower
                  oj:      336.3 i/s - 1.33x  slower

byroot commented 3 weeks ago

After fixing the various bugs, I was able to properly run all 3 macro-benchmarks:

== Parsing activitypub.json (58160 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   795.000 i/100ms
                  oj   797.000 i/100ms
          Oj::Parser   966.000 i/100ms
           rapidjson   622.000 i/100ms
Calculating -------------------------------------
                json      7.993k (± 0.5%) i/s  (125.10 μs/i) -     40.545k in   5.072378s
                  oj      7.973k (± 0.4%) i/s  (125.42 μs/i) -     40.647k in   5.098065s
          Oj::Parser      9.677k (± 0.4%) i/s  (103.34 μs/i) -     49.266k in   5.091343s
           rapidjson      6.348k (± 1.0%) i/s  (157.53 μs/i) -     32.344k in   5.095766s

Comparison:
                json:     7993.5 i/s
          Oj::Parser:     9676.6 i/s - 1.21x  faster
                  oj:     7973.1 i/s - same-ish: difference falls within error
           rapidjson:     6347.8 i/s - 1.26x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    67.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    78.000 i/100ms
           rapidjson    55.000 i/100ms
Calculating -------------------------------------
                json    677.122 (± 0.3%) i/s    (1.48 ms/i) -      3.417k in   5.046397s
                  oj    616.088 (± 1.3%) i/s    (1.62 ms/i) -      3.100k in   5.032716s
          Oj::Parser    769.220 (± 0.3%) i/s    (1.30 ms/i) -      3.900k in   5.070107s
           rapidjson    556.188 (± 0.5%) i/s    (1.80 ms/i) -      2.805k in   5.043436s

Comparison:
                json:      677.1 i/s
          Oj::Parser:      769.2 i/s - 1.14x  faster
                  oj:      616.1 i/s - 1.10x  slower
           rapidjson:      556.2 i/s - 1.22x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    38.000 i/100ms
                  oj    34.000 i/100ms
          Oj::Parser    47.000 i/100ms
           rapidjson    38.000 i/100ms
Calculating -------------------------------------
                json    379.337 (± 1.1%) i/s    (2.64 ms/i) -      1.900k in   5.009193s
                  oj    330.563 (± 2.1%) i/s    (3.03 ms/i) -      1.666k in   5.042279s
          Oj::Parser    455.745 (± 0.7%) i/s    (2.19 ms/i) -      2.303k in   5.053560s
           rapidjson    376.186 (± 0.3%) i/s    (2.66 ms/i) -      1.900k in   5.050766s

Comparison:
                json:      379.3 i/s
          Oj::Parser:      455.7 i/s - 1.20x  faster
           rapidjson:      376.2 i/s - same-ish: difference falls within error
                  oj:      330.6 i/s - 1.15x  slower

This branch:

== Parsing activitypub.json (58160 bytes)
ruby 3.4.0dev (2024-11-02T21:25:16Z master 3e2ee99057) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json   924.000 i/100ms
                  oj   784.000 i/100ms
          Oj::Parser   967.000 i/100ms
           rapidjson   625.000 i/100ms
Calculating -------------------------------------
                json      9.004k (± 0.2%) i/s  (111.06 μs/i) -     45.276k in   5.028445s
                  oj      7.879k (± 0.3%) i/s  (126.93 μs/i) -     39.984k in   5.075038s
          Oj::Parser      9.594k (± 0.7%) i/s  (104.24 μs/i) -     48.350k in   5.040009s
           rapidjson      6.268k (± 0.7%) i/s  (159.55 μs/i) -     31.875k in   5.085837s

Comparison:
                json:     9004.0 i/s
          Oj::Parser:     9593.7 i/s - 1.07x  faster
                  oj:     7878.6 i/s - 1.14x  slower
           rapidjson:     6267.8 i/s - 1.44x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.4.0dev (2024-11-02T21:25:16Z master 3e2ee99057) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json    82.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    78.000 i/100ms
           rapidjson    56.000 i/100ms
Calculating -------------------------------------
                json    815.532 (± 0.6%) i/s    (1.23 ms/i) -      4.100k in   5.027605s
                  oj    613.059 (± 0.3%) i/s    (1.63 ms/i) -      3.100k in   5.056666s
          Oj::Parser    782.689 (± 0.3%) i/s    (1.28 ms/i) -      3.978k in   5.082512s
           rapidjson    556.979 (± 1.6%) i/s    (1.80 ms/i) -      2.800k in   5.028521s

Comparison:
                json:      815.5 i/s
          Oj::Parser:      782.7 i/s - 1.04x  slower
                  oj:      613.1 i/s - 1.33x  slower
           rapidjson:      557.0 i/s - 1.46x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.0dev (2024-11-02T21:25:16Z master 3e2ee99057) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json    40.000 i/100ms
                  oj    32.000 i/100ms
          Oj::Parser    43.000 i/100ms
           rapidjson    37.000 i/100ms
Calculating -------------------------------------
                json    415.866 (± 0.2%) i/s    (2.40 ms/i) -      2.080k in   5.001622s
                  oj    336.376 (± 0.9%) i/s    (2.97 ms/i) -      1.696k in   5.042349s
          Oj::Parser    466.360 (± 0.6%) i/s    (2.14 ms/i) -      2.365k in   5.071387s
           rapidjson    392.459 (± 1.0%) i/s    (2.55 ms/i) -      1.998k in   5.091626s

Comparison:
                json:      415.9 i/s
          Oj::Parser:      466.4 i/s - 1.12x  faster
           rapidjson:      392.5 i/s - 1.06x  slower
                  oj:      336.4 i/s - 1.24x  slower