Open HofmannZ opened 3 months ago
Hi @HofmannZ ! Thanks for filing the issue!
That is an interesting but unfortunate result...
The biggest change in ts-proto v2 is that we migrated to a different binary serialization library, @bufbuild/protobuf, because the previous library we sat on top of, protobuf.js, has been stagnant (albeit technically maintained) for quite a while around issues like ESM, BigInt, and custom type support that, over the years, have been painful to deal with.
So, in theory this move was a big win in terms of getting a nice/needed refresh to our overall stack -- but I definitely hear you on the performance regression.
@timostamm do you have thoughts/comments on ^?
Fwiw @HofmannZ the benchmark repo you linked to might still be private? The link 404s for me...
@stephenh - My apologies! I had mistakenly linked to a private repo, but it's now public.
We've been waiting for resizable ArrayBuffers - there's potential for a significant boost in write performance. They aren't widely enough available yet, though.
I think that conditional imports like the ones protobuf.js uses should be avoided because of issues with bundlers and ESM, but it's possible that perf could be improved without them.
It would be interesting to take a look at flame graphs to see where the time is spent compared to the protobuf.js reader and writer. I'm happy to help take a look and make improvements. ts-proto, protobuf-ts, and protobuf-es would all benefit from that.
Selfishly that would be really great & appreciated @timostamm ! :-D
Fwiw @HofmannZ since this is important to you, it would also be great ofc if you wanted to analyze/dive deeper into the protobuf.js/bufbuild flamegraphs, and I assume you can stay on ts-proto 1.x until the performance is back on par.
Thanks!
Hi @stephenh and @timostamm,
Thanks for the insights—much appreciated!
I've done some preliminary research on resizable ArrayBuffers, and they seem particularly promising for improving performance in protocol buffer processing. Here's why they might be beneficial in this context:

1. Dynamic Memory Management for Protobuf Messages: Protocol buffers often deal with messages of varying sizes. With traditional fixed-size ArrayBuffers, we either need to over-allocate memory or handle complex resizing logic. Resizable ArrayBuffers would allow the buffer to grow dynamically as more data is encoded or decoded, reducing the need for costly reallocations and memory copying during serialization and deserialization.
2. Performance Gains in Write Operations: When encoding large or complex protocol buffer messages, resizable ArrayBuffers can optimize memory usage by expanding only when necessary. This can lead to faster write operations, as the overhead of managing buffer sizes manually is minimized, directly impacting performance where it matters most in encoding tasks.
3. Simplified Buffer Management in Protobuf Implementations: Managing memory efficiently is crucial in high-performance applications. Resizable ArrayBuffers can help simplify the implementation of protocol buffers by removing the need for pre-allocating large buffers or implementing custom resizing logic, making the codebase cleaner and less error-prone.
4. Alignment with Future Protocol Buffer Features: As protocol buffers evolve to support more complex data types and larger messages, leveraging resizable ArrayBuffers can ensure that the library remains capable of handling these future demands efficiently, keeping pace with modern JavaScript capabilities.

However, I noticed that resizable ArrayBuffers are only supported starting from Node.js version 20. Adopting them would mean dropping support for Node.js version 18, which might be a significant decision for the project. It's definitely something to weigh as we consider optimization strategies.
Regarding the flame graph analysis, I'm keen to contribute and would love your guidance on a few points:
1. Excluding Warm-up Rounds: My benchmark code includes warm-up rounds that shouldn't influence the analysis. I plan to exclude these from the flame graph profiling by starting data capture after a set number of iterations. If you have preferred methods or tools for this, I'd appreciate any recommendations.
2. Focus on Key Iterations: Given that the benchmarks run 10,000 times, I'm considering focusing the flame graph analysis on a representative subset of iterations to avoid excessive data and make it easier to identify performance bottlenecks. What would you suggest as the best approach here?
3. Comparative Flame Graphs: Creating flame graphs for both protobuf.js and @bufbuild/protobuf could provide valuable insights into where performance differences arise. If this sounds useful, I can generate these profiles and share them for further discussion.
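On the warm-up point, one way to exclude warm-up rounds is to start the CPU profiler only after they finish, using Node's built-in `node:inspector` module. A rough sketch under stated assumptions (`fn`, the iteration counts, and the output file name all stand in for the real benchmark):

```typescript
// Sketch: capture a CPU profile that excludes warm-up iterations, using
// Node's built-in inspector. `fn` stands in for the benchmarked call.
import { Session } from "node:inspector";
import { writeFileSync, existsSync } from "node:fs";

export async function profileAfterWarmup(
  fn: () => void,
  warmup = 1_000,
  measured = 10_000,
  outFile = "profile.cpuprofile",
): Promise<void> {
  for (let i = 0; i < warmup; i++) fn(); // warm-up: runs before profiling starts

  const session = new Session();
  session.connect();
  // Small promise wrapper around the callback-style session.post API.
  const post = (method: string) =>
    new Promise<any>((resolve, reject) =>
      session.post(method, {}, (err: Error | null, params: any) =>
        err ? reject(err) : resolve(params),
      ),
    );

  await post("Profiler.enable");
  await post("Profiler.start");
  for (let i = 0; i < measured; i++) fn(); // only these iterations are profiled
  const { profile } = await post("Profiler.stop");

  writeFileSync(outFile, JSON.stringify(profile));
  session.disconnect();
}
```

The resulting `.cpuprofile` file can be loaded into Chrome DevTools to view a flame chart covering just the measured iterations.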
I'd appreciate any suggestions or best practices you might have for generating and interpreting these flame graphs effectively. I'm committed to helping find a solution that brings the performance of ts-proto v2 back on par with v1.
Hey @HofmannZ ! Thanks for the questions/context on the flamegraphs...
Honestly when I've done flamegraphs in the past, the issue is either pretty obvious ... or it's not, so I'm not sure some of the fancier "focus on key iterations"/etc approaches are necessary.
Hopefully just seeing protobufjs and @bufbuild/protobuf flamegraphs side-by-side will make the issue easy to see; granted, sometimes it doesn't, if it's an issue that is "death by a thousand cuts" and not an obvious/single hotspot.
But, side-by-side flamegraphs do seem like the best place to start. Thanks!
@stephenh - Sorry for the delay in getting back to this! I was swamped with the US tax deadline (October 15), so it took me a while to follow up.
I’ve now added support for creating flame graphs in our benchmark repository: https://github.com/expatfile/protobuf-benchmarks/commit/5ee4b5b08da1cecbbb6e62f802651d334c4d08a1
To help narrow down the performance issues, I suggest you generate flame graphs using `./run_flame_ts_proto.sh` from the root of the benchmark project. Any additional insights would be great!
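For reference, flame-graph data can also be captured without the repo script, using Node's built-in V8 sampling profiler. The file names below are placeholders, not the actual benchmark entry points:

```shell
# Stand-in benchmark so these commands run as-is; replace with the real
# protobuf.js / @bufbuild/protobuf benchmark scripts from the repo.
cat > bench-demo.js <<'EOF'
let sum = 0;
for (let i = 0; i < 1e6; i++) sum += Math.sqrt(i);
console.log(sum);
EOF

# 1) Run under the V8 sampling profiler; writes an isolate-*.log tick file.
node --prof bench-demo.js

# 2) Post-process the ticks into a readable per-function summary,
#    which is what gets compared side by side between the two libraries.
node --prof-process isolate-*.log > bench-demo.profile.txt
```

Running the same two steps against each library's benchmark script yields the per-function tick counts to compare.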
@stephenh @timostamm - Can you indicate how we can move this item forward?
@HofmannZ realistically the best way for this to move forward is for you to dig in, find the hot spot(s), and open a PR, either to ts-proto or the underlying bufbuild/protobuf library.
This is just the way open source works--if maintainers don't have the time/resources/"itch" to fix issues personally, then it's up to contributors/potential contributors.
Thanks!
Hello `ts-proto` maintainers,

First, I want to thank you for your work on `ts-proto`—it's been an invaluable tool for our projects. We recently benchmarked three libraries (`ts-proto`, `google-protobuf`, and `protobuf-ts`) as part of our ongoing efforts to optimize our performance. Based on those benchmarks, we initially selected `ts-proto` (v1) as it outperformed the other libraries.

Recently, we decided to include `ts-proto` v2 in our benchmark tests to evaluate any improvements or changes in performance. However, we were surprised to observe a significant performance regression in `ts-proto` v2 compared to v1.

Here are the benchmark results:
- `google-protobuf` v3.21.4:
  - `readInputs`: 7,184 ops/sec (139,186 ns average time)
  - `updateInputs`: 6,109 ops/sec (163,671 ns average time)
- `ts-proto` v1.181.2:
  - `readInputs`: 7,505 ops/sec (133,227 ns average time)
  - `updateInputs`: 6,353 ops/sec (157,404 ns average time)
- `ts-proto` v2.0.3:
  - `readInputs`: 6,372 ops/sec (156,918 ns average time)
  - `updateInputs`: 4,624 ops/sec (216,218 ns average time)
- `protobuf-ts` v2.9.4:
  - `readInputs`: 6,252 ops/sec (159,943 ns average time)
  - `updateInputs`: 4,435 ops/sec (225,466 ns average time)

As you can see, the performance of `ts-proto` v2 is significantly lower than v1, particularly in the `updateInputs` task, where the operations per second dropped by almost 27% and the average time increased by about 37%.

Steps to Reproduce:
1. Run `ts-proto` v1 and v2 using our test suite.
2. Compare the results for `readInputs` and `updateInputs`.

Expected Behavior:
We expected `ts-proto` v2 to at least maintain the performance levels of v1, if not improve upon them.

Actual Behavior:
The performance of `ts-proto` v2 is significantly lower, particularly in the `updateInputs` task.

Environment:
`ts-proto` versions: v1.181.2 and v2.0.3

Additional Context:
We would greatly appreciate any insights into what might be causing this regression. If there's anything specific in the new version that we should be aware of, or if there are configuration changes we can make to mitigate this, please let us know.
Thank you for your time and for all the work you put into maintaining this library!
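As a footnote on methodology: the ops/sec and average-time figures above come from repeated timed runs. A minimal stand-in for that kind of measurement (not our actual benchmark harness; `benchOpsPerSec` and its parameters are illustrative) looks like this:

```typescript
// Sketch of an ops/sec + average-time measurement loop (Node.js).
// benchOpsPerSec and its defaults are illustrative, not the real harness.
function benchOpsPerSec(
  fn: () => void,
  iterations = 10_000,
): { opsPerSec: number; avgNs: number } {
  const start = process.hrtime.bigint(); // monotonic, nanosecond resolution
  for (let i = 0; i < iterations; i++) fn();
  const elapsedNs = Number(process.hrtime.bigint() - start);
  return {
    opsPerSec: Math.round(iterations / (elapsedNs / 1e9)),
    avgNs: Math.round(elapsedNs / iterations),
  };
}

// Example: time a trivial stand-in task in place of the real protobuf calls.
const result = benchOpsPerSec(() => JSON.stringify({ id: 1, name: "x" }));
console.log(result.opsPerSec, result.avgNs);
```

A real harness would add warm-up rounds before the timed loop so JIT compilation does not skew the averages.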