issuu / ocaml-protoc-plugin

ocaml-protoc-plugin
https://issuu.github.io/ocaml-protoc-plugin/
Other
48 stars 19 forks source link

Speed up serialization and deserialization #55

Closed andersfugmann closed 7 months ago

andersfugmann commented 8 months ago

Benchmarking against ocaml-protoc showed a huge number of allocations for both serialization and deserialization, which prompted some work on speeding up serialization and deserialization:

For simple structures, ocaml-protoc-plugin is still slower than ocaml-protoc, but for more complex structures speed is comparable.

I'll note that ocaml-protoc has chosen to be non-compliant, and uses c stubs for faster encoding and decoding. Some observations on ocaml-protoc:

andersfugmann commented 8 months ago

I don't really know how the changes can be reviewed, as they are massive! I think the tests are pretty complete and tests a lot of things, as well as the benchmarks which also verifies results.

andersfugmann commented 8 months ago

@AndreasDahl Any suggestions on how we can get this PR merged? I understand that its huge, but I believe the test coverage quite high. A suggestion would be to focus on coding style and comment if the libraries needed for running the benchmark (make bench) should be added to the opam file.

andersfugmann commented 7 months ago

With the latest optimizations, ocaml-protoc-plugin is now comparable with protoc:

Numbers below shows relation to protoc. A value above one means that Ocaml-protoc-plugin is slower and below means faster. The benchmark is run with flambda settings: -O3 -unbox-closures -remove-unused-arguments -rounds 3 -inline 100.00 -inline-max-depth 3 -inline-max-unroll 3

name protoc plugin ratio
string_list.M/Decode 45673.3403 ns/run 53547.6444 ns/run 1.172
string_list.M/Encode 77058.1932 ns/run 46928.1161 ns/run .608
float_list.M/Decode 11195.2996 ns/run 14448.8059 ns/run 1.290
float_list.M/Encode 19415.1014 ns/run 8767.3420 ns/run .451
int64_list.M/Decode 16906.9672 ns/run 14329.7493 ns/run .847
int64_list.M/Encode 13556.7060 ns/run 11329.0867 ns/run .835
enum.M/Decode 31.9749 ns/run 72.6818 ns/run 2.273
enum.M/Encode 68.3581 ns/run 59.7913 ns/run .874
string.M/Decode 50.1519 ns/run 91.4369 ns/run 1.823
string.M/Encode 77.2197 ns/run 86.7556 ns/run 1.123
float.M/Decode 28.6674 ns/run 72.1196 ns/run 2.515
float.M/Encode 66.3969 ns/run 62.3008 ns/run .938
int64.M/Decode 30.8383 ns/run 72.8756 ns/run 2.363
int64.M/Encode 66.0895 ns/run 58.9611 ns/run .892
bench.M/Decode 157661.3070 ns/run 151011.1603 ns/run .957
bench.M/Encode 217168.8299 ns/run 123210.2917 ns/run .567

Benchmark still shows x2 slower decoding on simple messages (messages with only one field).

andersfugmann commented 7 months ago

@AndreasDahl Repinging on this again. I'll stop adding commits to this for now, and I'll look into some of the other issues. However, I'd like to get this merged soon, as I intend to base future work on top of this. Could I ask for a quick review? I don't know if you have some tests that you can run on top of this code (if its still being used actively), but I don't think an in-depth review is sensible at this point with all the changes made.

andersfugmann commented 7 months ago

@hongy20 (as representing the @issue/platypus team who are marked as CODEOWNERS) , @AndreasDahl do you have suggestions on how to proceed on this?

I propose that we either.

  1. We agree that as a general rule all code should be reviewed, and you will prioritizing reviewing.
  2. We don't really care about having code on master reviewed. Remove the CODEOWERS and I will just push to master at will
  3. Transfer the repository to me and you can create / maintain a local fork.

In case I don't get a timely response, I will start merging to main and prepare a new release, but its really not an ideal solution. I would much rather prefer 3. if Issuu has no interest in maintaining these repositories. I would propose the same for ocaml-zmq repository which also have a stale PR. ocaml-zmq repository should be transferred to the ocaml community though.

andersfugmann commented 7 months ago

Note that I really do not prefer solution 2 and think we should avoid if possible.