near / borsh-rs

Rust implementation of Binary Object Representation Serializer for Hashing
https://borsh.io/
Apache License 2.0
324 stars 70 forks source link

Code size and dynamic dispatch #81

Open alessandrod opened 2 years ago

alessandrod commented 2 years ago

Not sure this is the best place to have this conversation so feel free to close the issue!

I've been working on reducing (compiled) code size for an API that uses borsh heavily. This API includes a bunch of functions that use BorshSerialize generics. Between monomorphized copypasta, bad inlining and the error handling code for all the Results returned by all the BorshSerialize::serialize() calls, these functions end up compiling down to a whole lot of code.

When I first noticed the problem I thought oh well it's a shame, but I guess I'll just make these functions use dynamic dispatch. Except BorshSerialize isn't object safe, duh. So, armed with sed and a healthy dose of frustration, I hacked up a BorshDynSerialize trait and derive that are exactly like BorshSerialize, but use dynamic dispatch for the writer so are object safe.

Switching my project from BorshSerialize generics to dynamic dispatch with BorshDynSerialize, cuts down .text size of about 50k (of 390k total). Ugh.

I realize this probably only matters for embedded and fringe programs that care about code size - but has this issue come up before? Has anyone thought about providing an API that is usable with dynamic dispatch?

matklad commented 2 years ago

This also matters quite a bit for the primary use-case of borsh -- smart contracts compiled for WASM. And yeah, what you are saying makes sense -- there's indeed a tradeoff here between code-bloat and runtime performance (see serde vs deser (https://github.com/mitsuhiko/deser)).

I don't think anyone fully explored these tradeoffs for borsh though.

An obvious short-term win would be to fork borsh and publish a borsh-dyn crate which implements the same data format as borsh, but using a different implementation strategy. Would gladly link that from this crate's readme/docs!

In an ideal world, this would be a feature of the main borsh crate, but I think exentining existing APIs to support that nicely wouldn't be easy, mostly because today's APIs don't seem to be in a particularly good shape: https://github.com/near/borsh-rs/issues/51.

alessandrod commented 2 years ago

An obvious short-term win would be to fork borsh and publish a borsh-dyn crate which implements the same data format as borsh, but using a different implementation strategy. Would gladly link that from this crate's readme/docs!

Sounds like a smart way to go about this. Ideally at least for the common case with derived impls, it could be made to be an almost drop in replacement. I'll give this a go and see how it feels.

In an ideal world, this would be a feature of the main borsh crate, but I think exentining existing APIs to support that nicely wouldn't be easy, mostly because today's APIs don't seem to be in a particularly good shape: https://github.com/near/borsh-rs/issues/51.

Yeah I was actually aiming for that initially, but quickly realized it would get pretty messy.