Closed detly closed 3 years ago
> Please confirm that my comment is correct, i.e. that packed serialisation will never require more memory than unpacked. Then you at least have the guarantee that a buffer of size capn_size() is suitable for both packed and unpacked calls.
This is not the case (and is in general impossible for any compression scheme to guarantee, because of the pigeonhole principle). Packing optimizes for the common case where there are many zero bytes, but may actually bloat the message a bit if there are very few zeros.
The docs (https://capnproto.org/encoding.html#packing) do put an upper bound on the overhead:
> the worst-case space overhead of packing is 2 bytes per 2 KiB of input
...though I would want to sanity check that the C implementation actually computes the optimal encoding before relying on that.
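To make the arithmetic behind that bound concrete, here is a minimal sketch. `packed_size_bound()` is a hypothetical helper, not part of c-capnproto, and it only restates the documented figure (2 extra bytes per started 2 KiB block). It is a safe buffer size only if the packer really achieves that bound, which is exactly the caveat above.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a c-capnproto function).
 * Returns the unpacked size plus 2 bytes for every started 2048-byte block,
 * i.e. the worst case quoted in the Cap'n Proto encoding docs.
 * ASSUMPTION: the packer emits the optimal encoding; a packer that breaks
 * literal runs early can exceed this bound.
 */
static size_t packed_size_bound(size_t unpacked_bytes)
{
    const size_t block = 2048; /* 256 words of 8 bytes */
    size_t blocks = unpacked_bytes / block + (unpacked_bytes % block != 0);
    return unpacked_bytes + 2 * blocks;
}
```

For a single all-nonzero word this gives 10 bytes (tag byte, 8 data bytes, run-count byte), and for a full 2 KiB block it gives 2050 bytes.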
I haven't looked closely at the code portion of this, and probably won't find time to (and I'm not familiar with the C implementation in particular), but I stumbled over here from the mailing list and figured I could at least point this out. Happy hacking.
> This is not the case (and is in general impossible for any compression scheme to guarantee, because of the pigeonhole principle).
Ah nuts, I actually knew this too but didn't make the connection.
For now I might just say that there is no function to compute the size for a packed buffer. At least this addresses part of the need. The packed case could be done in the same way that snprintf works, i.e. do a serialisation pass without writing anything to compute the size. Not super performant, but perhaps offset by the fact that packing is there for when your storage or bandwidth costs already outweigh your computational costs.
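A rough sketch of that dry-run idea, assuming the message is already available as a flat array of 8-byte words. `pack_or_measure()` and `emit()` are hypothetical names, not c-capnproto functions, and the rule for ending a literal run (stop at a word with two or more zero bytes) is an assumption rather than what the C implementation does. Passing `NULL` as the output makes the same pass count bytes without writing, much like calling `snprintf` with a null buffer and a size of zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes to out+pos unless this is a dry run; return the new offset. */
static size_t emit(uint8_t *out, size_t pos, const uint8_t *src, size_t n)
{
    if (out != NULL && n > 0)
        memcpy(out + pos, src, n);
    return pos + n;
}

/* Number of zero bytes in one 8-byte word. */
static int zero_bytes(const uint8_t *w)
{
    int z = 0;
    for (int b = 0; b < 8; b++)
        z += (w[b] == 0);
    return z;
}

/*
 * Hypothetical sketch: pack `words` 8-byte words from `in`.
 * If `out` is NULL nothing is written; the return value is the number of
 * bytes the packed form needs, so a caller can measure, allocate, then pack.
 */
size_t pack_or_measure(const uint8_t *in, size_t words, uint8_t *out)
{
    size_t pos = 0, i = 0;
    while (i < words) {
        const uint8_t *w = in + 8 * i++;
        uint8_t buf[10]; /* tag + up to 8 data bytes + optional count byte */
        size_t n = 1, run = 0;
        buf[0] = 0;
        for (int b = 0; b < 8; b++) {
            if (w[b] != 0) {
                buf[0] |= (uint8_t)(1u << b); /* bit b set: byte b present */
                buf[n++] = w[b];
            }
        }
        if (buf[0] == 0x00) {
            /* Tag 0x00: count byte = further all-zero words to skip. */
            while (i + run < words && run < 255 &&
                   zero_bytes(in + 8 * (i + run)) == 8)
                run++;
            buf[n++] = (uint8_t)run;
            pos = emit(out, pos, buf, n);
        } else if (buf[0] == 0xFF) {
            /* Tag 0xFF: count byte = following words copied verbatim.
             * ASSUMPTION: the run ends at a word with 2+ zero bytes. */
            while (i + run < words && run < 255 &&
                   zero_bytes(in + 8 * (i + run)) < 2)
                run++;
            buf[n++] = (uint8_t)run;
            pos = emit(out, pos, buf, n);
            pos = emit(out, pos, in + 8 * i, 8 * run);
        } else {
            pos = emit(out, pos, buf, n);
        }
        i += run;
    }
    return pos;
}
```

A caller would run `pack_or_measure(in, words, NULL)` first, allocate that many bytes, and then call it again with the real buffer. The cost is a second pass over the data, which is the trade-off described above.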
This is a first pass at implementing what's requested in #26. It adds a function capn_size() that calculates the size required for a buffer passed to capn_write_mem().

Notes:
I did not implement the calculation for a packed structure. I couldn't figure out how to do it without doing the serialisation first, which requires a buffer of the right size, which is a circular problem.
Please confirm that my comment is correct, i.e. that packed serialisation will never require more memory than unpacked. Then you at least have the guarantee that a buffer of size capn_size() is suitable for both packed and unpacked calls.

I did not change the existing tests for capn_write_mem(), since that seemed like testing two functions in one test. Using a pre-sized buffer for those tests seems like the right thing to do even with a size function.