Closed by c-cube 2 years ago
benchmark results so far:
~/w/ocaml-protoc (wip-nested-write-by-writing-backward|✚1) $ ./benchs.sh -p nested.enc.10
*** Run benchmarks for path "nested.enc.10"
Throughputs for "nested-enc-basic-buffer", "nested-enc-nested-bufs", "nested-enc-write-backward" each running 4 times for at least 3 CPU seconds:
nested-enc-basic-buffer: 3.83 WALL ( 3.79 usr + 0.01 sys = 3.80 CPU) @ 33225.06/s (n=126152)
4.15 WALL ( 4.04 usr + 0.08 sys = 4.11 CPU) @ 30675.27/s (n=126152)
4.68 WALL ( 4.47 usr + 0.15 sys = 4.62 CPU) @ 27299.52/s (n=126152)
4.32 WALL ( 4.32 usr + 0.00 sys = 4.32 CPU) @ 29227.15/s (n=126152)
nested-enc-nested-bufs: 3.06 WALL ( 3.04 usr + 0.00 sys = 3.04 CPU) @ 31951.89/s (n=96990)
3.05 WALL ( 3.02 usr + 0.00 sys = 3.02 CPU) @ 33486.48/s (n=101086)
3.01 WALL ( 3.00 usr + 0.00 sys = 3.00 CPU) @ 35043.27/s (n=105177)
3.01 WALL ( 3.00 usr + 0.00 sys = 3.00 CPU) @ 35439.61/s (n=106462)
nested-enc-write-backward: 3.14 WALL ( 3.13 usr + 0.00 sys = 3.13 CPU) @ 28495.00/s (n=89302)
3.14 WALL ( 3.14 usr + 0.00 sys = 3.14 CPU) @ 27197.97/s (n=85344)
3.13 WALL ( 3.13 usr + 0.00 sys = 3.13 CPU) @ 27603.14/s (n=86272)
3.13 WALL ( 3.13 usr + 0.00 sys = 3.13 CPU) @ 27475.33/s (n=85938)
Rate nested-enc-write-backward nested-enc-basic-buffer nested-enc-nested-bufs
nested-enc-write-backward 27693+- 461/s -- [-8%] -19%
nested-enc-basic-buffer 30107+-2054/s [9%] -- -11%
nested-enc-nested-bufs 33980+-1311/s 23% 13% --
~/w/ocaml-protoc (wip-nested-write-by-writing-backward|✚1) $ ./benchs.sh -p nested.enc.5
*** Run benchmarks for path "nested.enc.5"
Throughputs for "nested-enc-basic-buffer", "nested-enc-nested-bufs", "nested-enc-write-backward" each running 4 times for at least 3 CPU seconds:
nested-enc-basic-buffer: 3.01 WALL ( 3.00 usr + 0.00 sys = 3.01 CPU) @ 60021.08/s (n=180387)
3.13 WALL ( 3.13 usr + 0.00 sys = 3.13 CPU) @ 59669.38/s (n=186588)
3.13 WALL ( 3.13 usr + 0.00 sys = 3.13 CPU) @ 53907.76/s (n=168603)
3.01 WALL ( 3.00 usr + 0.00 sys = 3.00 CPU) @ 56135.23/s (n=168603)
nested-enc-nested-bufs: 3.21 WALL ( 3.20 usr + 0.00 sys = 3.20 CPU) @ 71289.96/s (n=227909)
3.17 WALL ( 3.15 usr + 0.00 sys = 3.15 CPU) @ 72363.71/s (n=227909)
3.35 WALL ( 3.32 usr + 0.00 sys = 3.32 CPU) @ 68573.32/s (n=227909)
3.52 WALL ( 3.48 usr + 0.00 sys = 3.48 CPU) @ 65399.66/s (n=227909)
nested-enc-write-backward: 3.03 WALL ( 3.01 usr + 0.00 sys = 3.01 CPU) @ 51838.81/s (n=156198)
3.13 WALL ( 3.10 usr + 0.00 sys = 3.10 CPU) @ 50321.08/s (n=156198)
3.02 WALL ( 3.02 usr + 0.00 sys = 3.02 CPU) @ 55913.69/s (n=168815)
3.13 WALL ( 3.12 usr + 0.00 sys = 3.12 CPU) @ 57412.65/s (n=179201)
Rate nested-enc-write-backward nested-enc-basic-buffer nested-enc-nested-bufs
nested-enc-write-backward 53872+-2747/s -- [-6%] -22%
nested-enc-basic-buffer 57433+-2413/s [7%] -- -17%
nested-enc-nested-bufs 69407+-2560/s 29% 21% --
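there's definitely a nice little edge to the "current encoding, but re-using buffers for nested messages" variant (nested-enc-nested-bufs): in the runs above it is 13% to 21% faster than nested-enc-basic-buffer and 23% to 29% faster than nested-enc-write-backward.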
(Our earlier discussion: #161)
See also https://github.com/mransan/ocaml-protoc/pull/157 for the nested-enc-nested-bufs bit, which seems to be quite nice actually. If we just keep the encoder around, it seems possible to serialize a ton of stuff with few allocations.
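To make the "keep the encoder around" idea concrete, here is a minimal sketch, not the code from the PR. The `enc` type and the `take`, `give`, `nested` and `encode_outer` names are hypothetical; only the stdlib `Buffer` module is assumed. A nested message is written into a scratch buffer so that its length is known, then copied into the parent, and the scratch buffer goes back into a pool instead of being dropped.

```ocaml
(* Hypothetical encoder: one output buffer plus a pool of scratch buffers,
   all kept alive between messages. Not the actual ocaml-protoc API. *)
type enc = {
  out : Buffer.t;
  mutable pool : Buffer.t list; (* idle scratch buffers *)
}

let create () = { out = Buffer.create 64; pool = [] }

(* Protobuf varint: 7 bits per byte, least significant group first.
   Assumes n >= 0. *)
let rec varint b n =
  if n < 0x80 then Buffer.add_char b (Char.chr n)
  else begin
    Buffer.add_char b (Char.chr ((n land 0x7f) lor 0x80));
    varint b (n lsr 7)
  end

(* Take a scratch buffer from the pool, allocating only if it is empty. *)
let take e =
  match e.pool with
  | b :: tl -> e.pool <- tl; b
  | [] -> Buffer.create 64

(* Buffer.clear empties the buffer but keeps its storage for reuse. *)
let give e b = Buffer.clear b; e.pool <- b :: e.pool

(* Length-delimited field: the payload goes to a scratch buffer first so
   its size is known before the length prefix is written to the parent. *)
let nested e parent field f =
  let tmp = take e in
  f tmp;
  varint parent ((field lsl 3) lor 2);
  varint parent (Buffer.length tmp);
  Buffer.add_buffer parent tmp;
  give e tmp

(* Outer { inner = 1: Inner { a = 1: 150; s = 2: "hi" } } *)
let encode_outer e =
  Buffer.clear e.out;
  nested e e.out 1 (fun b ->
    varint b ((1 lsl 3) lor 0); varint b 150;
    varint b ((2 lsl 3) lor 2); varint b 2; Buffer.add_string b "hi");
  Buffer.contents e.out

let e = create ()
let first = encode_outer e
let second = encode_outer e (* reuses e.out and the pooled scratch buffer *)
```

After the first call the pool holds one scratch buffer, so the second call allocates only the final string returned by `Buffer.contents`. The cost is the extra copy in `Buffer.add_buffer` for each level of nesting.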
Writing backward seems to not pay off, and it makes codegen harder.
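this is experimental, and comes from discussions with @vphantom and others about how to efficiently deal with nested messages. Writing backwards is a clean way of doing it, since the payload of a nested message is already written (and its length known) by the time the length prefix is emitted, but so far it seems quite subtle and not worth it.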
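For readers unfamiliar with the technique, a minimal sketch of writing backward follows. It is an illustration under assumptions, not the code on this branch: the `bw` type and every function name are hypothetical. The buffer is filled from its end toward its start, fields are emitted in reverse order, and a nested message needs no scratch buffer because its length is simply the number of bytes written since it began.

```ocaml
(* Hypothetical backward writer: bytes are filled from the end of [buf]
   toward the start; [pos] is the index of the first used byte. *)
type bw = { mutable buf : Bytes.t; mutable pos : int }

let create n = { buf = Bytes.create n; pos = n }

let used w = Bytes.length w.buf - w.pos

(* Make sure [n] free bytes exist before [pos]; on growth the existing
   data is moved to the end of the new buffer. *)
let ensure w n =
  if w.pos < n then begin
    let u = used w in
    let cap = ref (max 16 (2 * Bytes.length w.buf)) in
    while !cap - u < n do cap := 2 * !cap done;
    let nb = Bytes.create !cap in
    Bytes.blit w.buf w.pos nb (!cap - u) u;
    w.buf <- nb;
    w.pos <- !cap - u
  end

let byte w c =
  ensure w 1;
  w.pos <- w.pos - 1;
  Bytes.set w.buf w.pos (Char.chr c)

(* Varint written backward: the most significant 7-bit group goes first
   (it is the last byte on the wire and has no continuation bit).
   Assumes n >= 0. *)
let varint w n =
  let rec groups n acc = if n < 0x80 then acc else groups (n lsr 7) (acc + 1) in
  let k = groups n 1 in
  for i = k - 1 downto 0 do
    let g = (n lsr (7 * i)) land 0x7f in
    byte w (if i = k - 1 then g else g lor 0x80)
  done

let string w s =
  let n = String.length s in
  ensure w n;
  w.pos <- w.pos - n;
  Bytes.blit_string s 0 w.buf w.pos n

(* Nested message: write the payload, then its length, then the key. *)
let nested w field f =
  let before = used w in
  f w;
  varint w (used w - before);
  varint w ((field lsl 3) lor 2)

let contents w = Bytes.sub_string w.buf w.pos (used w)

(* Outer { inner = 1: Inner { a = 1: 150; s = 2: "hi" } }, with fields
   and their parts emitted in reverse order. *)
let encoded =
  let w = create 4 in
  nested w 1 (fun w ->
    string w "hi"; varint w 2; varint w ((2 lsl 3) lor 2);
    varint w 150; varint w ((1 lsl 3) lor 0));
  contents w
```

The sketch also hints at why this is subtle for codegen: every generated encoder has to emit fields last to first, and each primitive (varint included) has to be written in reverse.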
edit: