Closed cBournhonesque closed 1 year ago
Unlike bincode, bitcode doesn't support serializing into a mutable packet structure or stream because performance would suffer from lack of alignment/wide-integer instructions. bitcode only serializes into Vec<u8> (via allocation) or &[u8] (via &mut bitcode::Buffer).
As a result, the minimal-allocation method is to reuse a bitcode::Buffer (or pool of them) and copy from the resulting &[u8] into your packet, at which point you know the number of bytes from <&[u8]>::len().
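A minimal sketch of the copy step, using only the standard library. The Packet type, its try_push method, and the 1400-byte capacity are hypothetical names and values chosen for illustration; the `encoded` argument stands in for the &[u8] returned by a reused bitcode::Buffer.

```rust
/// Assumed packet capacity in bytes (illustrative value, not from bitcode).
const PACKET_CAPACITY: usize = 1400;

/// Hypothetical fixed-size packet with a write cursor.
struct Packet {
    bytes: [u8; PACKET_CAPACITY],
    len: usize,
}

impl Packet {
    fn new() -> Self {
        Packet { bytes: [0; PACKET_CAPACITY], len: 0 }
    }

    /// Bytes still free in this packet.
    fn remaining(&self) -> usize {
        PACKET_CAPACITY - self.len
    }

    /// Copies `encoded` into the packet if it fits and reports whether it did.
    /// The size check relies only on `encoded.len()`, which is known once
    /// the value has been encoded into a reusable buffer.
    fn try_push(&mut self, encoded: &[u8]) -> bool {
        if encoded.len() > self.remaining() {
            return false;
        }
        let end = self.len + encoded.len();
        self.bytes[self.len..end].copy_from_slice(encoded);
        self.len = end;
        true
    }
}
```

If try_push returns false, the caller can start a new packet and push the same slice there, since the encoded bytes are still available in the buffer.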
Feel free to give other/more specific reasons to implement this functionality, e.g. a code example, taking into account the above limitations.
I'm not sure I fully understood your comment; what I meant was a trait like this: https://github.com/naia-lib/naia/blob/main/shared/serde/src/serde.rs#L4
The trait could have an additional function that simply returns the number of bytes that the struct/enum would be serialized into, without doing the actual serialization. For example, via implementations like this one: https://github.com/naia-lib/naia/blob/main/shared/serde/src/impls/string.rs#L28
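To make the request concrete, here is a rough sketch of such a size-only function. The EncodedLen trait, the ChatMessage type, and the byte layout (fixed-width integers, a 4-byte length prefix for strings and vectors) are all hypothetical; this is neither naia's trait nor bitcode's wire format.

```rust
/// Hypothetical trait: reports the encoded size without producing any bytes.
/// The layout assumed here is illustrative and is NOT bitcode's format.
trait EncodedLen {
    fn encoded_len(&self) -> usize;
}

impl EncodedLen for u8 {
    fn encoded_len(&self) -> usize {
        1
    }
}

impl EncodedLen for u32 {
    fn encoded_len(&self) -> usize {
        4
    }
}

impl EncodedLen for String {
    /// Assumed layout: 4-byte length prefix followed by the UTF-8 bytes.
    fn encoded_len(&self) -> usize {
        4 + self.len()
    }
}

impl<T: EncodedLen> EncodedLen for Vec<T> {
    /// Assumed layout: 4-byte element count followed by each element.
    fn encoded_len(&self) -> usize {
        4 + self.iter().map(|item| item.encoded_len()).sum::<usize>()
    }
}

/// Example message type (hypothetical) composed from the impls above.
struct ChatMessage {
    sender_id: u32,
    text: String,
}

impl EncodedLen for ChatMessage {
    fn encoded_len(&self) -> usize {
        self.sender_id.encoded_len() + self.text.encoded_len()
    }
}
```

With this, a caller could compare encoded_len() against the space left in a packet before deciding whether to serialize the value at all.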
Thanks for providing a code example! It looks like you are using the bit length to decide whether to serialize the message at all, which could legitimately benefit from the functionality.
(Edit: FWIW, I tried implementing the desired functionality on the predict_len branch).
I avoided adding something similar to bincode::serialized_size since I've noticed lots of people misuse it to allocate buffers with capacity as an optimization. This usually results in half the performance and double the binary size for everything but the most trivial structures (see https://github.com/bincode-org/bincode/issues/401).
it would be useful to have a function to know how many bits/bytes a structure would take if it were encoded, but without doing the actual encoding (so that i know in which packet i can put the encoded data).
I would advise serializing each structure to a Vec<u8> with bitcode::encode and then appending as many as possible to another Vec<u8>, each with a length prefix such as a u16 or u32. The length prefix is required so you can pass a &[u8] of the original structure length to bitcode::decode.
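The framing described above could look roughly like this. It is a standard-library-only sketch: the pack and unpack functions are hypothetical helpers, the little-endian u16 prefix is one possible choice, and each input Vec<u8> is assumed to be the output of bitcode::encode for one structure.

```rust
/// Packs already-encoded messages into packets of at most `max_len` bytes.
/// Each message is preceded by a little-endian u16 length prefix.
/// Returns None if a message is too large for the prefix or for one packet.
fn pack(messages: &[Vec<u8>], max_len: usize) -> Option<Vec<Vec<u8>>> {
    let mut packets = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    for msg in messages {
        if msg.len() > u16::MAX as usize {
            return None;
        }
        let framed_len = 2 + msg.len();
        if framed_len > max_len {
            return None;
        }
        // Start a new packet when this message would overflow the current one.
        if current.len() + framed_len > max_len {
            packets.push(std::mem::take(&mut current));
        }
        current.extend_from_slice(&(msg.len() as u16).to_le_bytes());
        current.extend_from_slice(msg);
    }
    if !current.is_empty() {
        packets.push(current);
    }
    Some(packets)
}

/// Splits a packet back into the original message slices.
/// Returns None if the packet is truncated.
fn unpack(mut packet: &[u8]) -> Option<Vec<&[u8]>> {
    let mut messages = Vec::new();
    while !packet.is_empty() {
        if packet.len() < 2 {
            return None;
        }
        let (prefix, rest) = packet.split_at(2);
        let len = u16::from_le_bytes([prefix[0], prefix[1]]) as usize;
        if rest.len() < len {
            return None;
        }
        let (msg, tail) = rest.split_at(len);
        messages.push(msg);
        packet = tail;
    }
    Some(messages)
}
```

On the receiving side, each slice returned by unpack has exactly the original structure length, so it can be handed to bitcode::decode as is.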
While copying the bytes isn't ideal, it should be much faster than something like serialized_size.
@caibear brings up some good points against implementing this and a possible alternative for your code.
Here is one more possible alternative for you, in the form of code that you can drop in to your project:
use std::cell::RefCell;
use serde::Serialize;
use bitcode::{Encode, Buffer, Error};

// for serde::Serialize
fn serialize_len<T: Serialize + ?Sized>(t: &T) -> Result<usize, Error> {
    thread_local! {
        static BUFFER: RefCell<Option<Buffer>> = RefCell::new(None);
    }
    BUFFER.with(|buffer| {
        let mut buffer = buffer.borrow_mut();
        if buffer.is_none() {
            *buffer = Some(Default::default());
        }
        buffer.as_mut().unwrap().serialize(t).map(|bytes| bytes.len())
    })
}

// for bitcode::Encode
fn encode_len<T: Encode + ?Sized>(t: &T) -> Result<usize, Error> {
    thread_local! {
        static BUFFER: RefCell<Option<Buffer>> = RefCell::new(None);
    }
    BUFFER.with(|buffer| {
        let mut buffer = buffer.borrow_mut();
        if buffer.is_none() {
            *buffer = Some(Default::default());
        }
        buffer.as_mut().unwrap().encode(t).map(|bytes| bytes.len())
    })
}
Use these as a last resort if you can't refactor your code as suggested by @caibear. By reusing the Buffer, they avoid repeated memory allocations. They don't require additional codegen and won't be significantly slower than my predict_len changes mentioned above.
Thank you! In general I'll be encoding everything in a buffer of size UDP_PACKET_SIZE (around 1400 bytes), so I wouldn't be using this to optimize allocations. Both options that you provided make sense to me.
Hi,
I'd like to use bitcode for game networking, and it would be useful to have a function to know how many bits/bytes a structure would take if it were encoded, but without doing the actual encoding (so that I know in which packet I can put the encoded data).
Something similar to https://docs.rs/bincode/latest/bincode/fn.serialized_size.html