cardano-scaling / hydra

Implementation of the Hydra Head protocol
https://hydra.family/head-protocol/
Apache License 2.0

Binary Protocols for disk storage and websocket #1585

Open Quantumplation opened 3 weeks ago

Quantumplation commented 3 weeks ago

Why

Right now, the websocket protocol and file persistence both use JSON. This is fine for most use cases, but it quickly became a major bottleneck in the Hydra Doom project. In just a few days, the nodes had produced over 10 terabytes of on-disk state, and the JSON overhead inflated our ~480 byte transactions to several kilobytes each. At 200 transactions per second per node, that is a significant amount of overhead.
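To put rough numbers on this, here is a back-of-the-envelope sketch in Python. The 480-byte size and the 200 tx/s rate are the figures quoted above; the hex-in-JSON envelope and the `cborHex` field name are assumptions for illustration only, not the exact format the node writes.

```python
import json
import os

TX_SIZE_BYTES = 480      # approximate transaction size quoted above
TX_PER_SECOND = 200      # per-node rate quoted above
SECONDS_PER_DAY = 86_400


def daily_bytes(bytes_per_tx: int, tx_per_second: int = TX_PER_SECOND) -> int:
    """Bytes written per node per day if each transaction is stored once."""
    return bytes_per_tx * tx_per_second * SECONDS_PER_DAY


def json_wrapped_size(raw_tx: bytes) -> int:
    """Size of a transaction hex-encoded inside a JSON object.

    The envelope shape is hypothetical; real messages carry more fields.
    """
    return len(json.dumps({"cborHex": raw_tx.hex()}).encode("utf-8"))


raw_tx = os.urandom(TX_SIZE_BYTES)  # stand-in for a real transaction body
print("raw bytes/day:        ", daily_bytes(len(raw_tx)))
print("hex-in-JSON bytes/day:", daily_bytes(json_wrapped_size(raw_tx)))
```

Stored raw, 480 bytes at 200 tx/s is roughly 8.3 GB per node per day. Hex encoding alone doubles that; any further growth towards several kilobytes per transaction would have to come from the surrounding JSON structure.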

What

It would be useful to have binary protocols for both the websocket API and on-disk persistence, for scenarios that need to squeeze more performance out of the hydra node.

How

I'm not sure how this interplays with the plans to potentially use something like a postgres backend for persistence, but the source/sink APIs we recently contributed would be well suited for providing alternative implementations of these protocols.
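As a rough illustration of what "alternative implementations" behind a source/sink pair could look like, here is a minimal Python sketch. It is not the actual Hydra API (which is Haskell); `EventSink`, `EventSource`, `JsonLinesStore`, `LengthPrefixedStore`, and the 4-byte length framing are all hypothetical names and choices made for this example.

```python
import io
import json
import struct
from typing import BinaryIO, Iterator, Protocol


class EventSink(Protocol):
    def put_event(self, event: bytes) -> None: ...


class EventSource(Protocol):
    def events(self) -> Iterator[bytes]: ...


class JsonLinesStore:
    """One JSON object per line, payload hex-encoded (human-readable)."""

    def __init__(self, handle: BinaryIO) -> None:
        self.handle = handle

    def put_event(self, event: bytes) -> None:
        line = json.dumps({"event": event.hex()}) + "\n"
        self.handle.write(line.encode("utf-8"))

    def events(self) -> Iterator[bytes]:
        self.handle.seek(0)
        for line in self.handle:
            yield bytes.fromhex(json.loads(line)["event"])


class LengthPrefixedStore:
    """4-byte big-endian length, then the raw payload (compact)."""

    def __init__(self, handle: BinaryIO) -> None:
        self.handle = handle

    def put_event(self, event: bytes) -> None:
        self.handle.write(struct.pack(">I", len(event)) + event)

    def events(self) -> Iterator[bytes]:
        self.handle.seek(0)
        while header := self.handle.read(4):
            (length,) = struct.unpack(">I", header)
            yield self.handle.read(length)


def roundtrip(store, payloads: list[bytes]) -> list[bytes]:
    """Write all payloads through the sink side, then read them back."""
    for payload in payloads:
        store.put_event(payload)
    return list(store.events())
```

Both stores satisfy the same two-method interface, so the caller does not need to know which encoding is in use. That is the property that would let a binary format sit alongside the JSON one rather than replace it.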

Quantumplation commented 3 weeks ago

(Perhaps this should have been a discussion before an issue, whoops)

ch1bo commented 2 weeks ago

@Quantumplation That's fine. Thanks for contributing this idea. The purpose is clear: "Reduce JSON overhead".

What we should do about it is a bit less so. I currently see at least one drawback: switching from a human-readable format to a binary encoding makes the data impossible to debug without additional tooling.
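For a sense of scale, the "additional tooling" could be quite small. The Python sketch below dumps a binary file as one readable line per record. The framing (a 4-byte big-endian length followed by the payload) and the function names `read_frames` and `dump` are assumptions for this example, not an existing Hydra format or tool.

```python
import io
import struct
from typing import BinaryIO, Iterator


def read_frames(handle: BinaryIO) -> Iterator[bytes]:
    """Yield each length-prefixed payload, failing loudly on truncation."""
    while header := handle.read(4):
        if len(header) < 4:
            raise ValueError("truncated length header")
        (length,) = struct.unpack(">I", header)
        payload = handle.read(length)
        if len(payload) < length:
            raise ValueError("truncated payload")
        yield payload


def dump(handle: BinaryIO) -> list[str]:
    """Render every frame as 'index len=N hex' for inspection in a terminal."""
    return [
        f"{index:06d} len={len(payload)} {payload.hex()}"
        for index, payload in enumerate(read_frames(handle))
    ]
```

A real tool would decode the payload into its structured form instead of printing hex, but the point stands that readability can be restored at inspection time rather than paid for on every write.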

Maybe an alternative way to serve the same purpose (or at least take one step towards it) would be to only store the event stream (state file) and have the API outputs be an interpretation/subset of it. We should track this as a separate idea (with a similar purpose) though.
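A minimal Python sketch of that idea: persist only the events, and compute the API outputs as a projection over them on demand. The `Event` shape, the tag names, and the `CLIENT_VISIBLE` set are placeholders invented for this example, not Hydra's actual event or output types.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    seq: int
    kind: str        # placeholder tag, not a real Hydra event name
    payload: bytes


# Placeholder: which event kinds a client would be shown.
CLIENT_VISIBLE = {"tx-accepted", "snapshot-confirmed"}


def api_outputs(events: Iterable[Event]) -> Iterator[dict]:
    """Derive client-facing outputs from the stored events.

    Nothing is persisted here; the output history is recomputed
    from the single event stream whenever it is needed.
    """
    for event in events:
        if event.kind in CLIENT_VISIBLE:
            yield {
                "seq": event.seq,
                "tag": event.kind,
                "payload": event.payload.hex(),
            }


stored = [
    Event(0, "tx-accepted", b"\x01"),
    Event(1, "internal-tick", b""),
    Event(2, "snapshot-confirmed", b"\x02\x03"),
]
visible = list(api_outputs(stored))
```

Under this approach each transaction is written to disk once, and the cost of producing outputs moves from write time to read time.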