jamesmunns / postcard

A no_std + serde compatible message library for Rust
Apache License 2.0

Support both Big- and Little-Endianness #26

Closed ryan-summers closed 2 years ago

ryan-summers commented 4 years ago

It looks like postcard currently assumes that all data should be encoded/decoded in little-endian format - this makes it impossible to use postcard with other serializers/deserializers that may use different endian formats (e.g. network endian is big-endian).

It would be nice to either:

  1. Provide a flavor that would allow specified endian encoding
  2. Provide a different API to encode/decode to/from BE/LE

jamesmunns commented 3 years ago

It's important to note (for anyone else reading this) that postcard DOES work just fine on both big- and little-endian systems, but the data format is currently ALWAYS little-endian. For most cases this is preferred: most systems, especially the MCUs supported by Rust, are little-endian, so serialization is more performant on them.
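The distinction between host endianness and wire endianness can be illustrated with the standard library alone. This is a minimal sketch, not postcard's implementation; the function names `encode_u32_le` and `decode_u32_le` are made up for illustration.

```rust
/// Hypothetical helper (not postcard's API): encode a u32 for a wire
/// format that is always little-endian, regardless of the host CPU.
/// On a little-endian host this is effectively a plain copy; on a
/// big-endian host the bytes are swapped.
pub fn encode_u32_le(value: u32) -> [u8; 4] {
    value.to_le_bytes()
}

/// Hypothetical helper: decode a little-endian u32 from the wire.
/// The result is the same value on every host.
pub fn decode_u32_le(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}
```

Because the conversion is explicit, `encode_u32_le(0x1122_3344)` produces the bytes `44 33 22 11` on any target, which is why a fixed wire order does not limit which systems can use the format.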

In the future, it might be an option to configure this.

robamu commented 2 years ago

Hi,

I am very much interested in this feature. I am writing an application where packets are retrieved from a network or serialized into one, so endianness is actually an issue for me.

How difficult or easy would it be to make the target endianness configurable for individual operations? I would currently need two libraries to define custom packets as conveniently as possible: one that lets me serialize in network byte order, and one that uses serde for everything else. The first one would be zerocopy right now. For example, I have the following code:

use postcard::to_stdvec;
use serde::{Deserialize, Serialize};
use zerocopy::byteorder::{I32, U16};
use zerocopy::{AsBytes, FromBytes, NetworkEndian, Unaligned};

// Network-endian (big-endian) layout, written out byte-for-byte by zerocopy.
#[derive(AsBytes, FromBytes, Unaligned, Debug, Eq, PartialEq)]
#[repr(C)]
struct ZeroCopyTest {
    some_bool: u8,
    some_u16: U16<NetworkEndian>,
    some_i32: I32<NetworkEndian>,
    // The float is stored as raw bytes that are already in big-endian order.
    some_float: [u8; 4],
}

// The same fields as plain Rust types, serialized by postcard via serde.
#[derive(Serialize, Deserialize, Debug, PartialEq)]
struct PostcardTest {
    some_bool: u8,
    some_u16: u16,
    some_i32: i32,
    some_float: f32,
}

fn main() {
    let pc_test = PostcardTest {
        some_bool: true as u8,
        some_u16: 0x42,
        some_i32: -200,
        some_float: 7.7_f32,
    };

    // Serialize with postcard and print the resulting bytes.
    let out = to_stdvec(&pc_test).unwrap();
    println!("{:#04x?}", out);

    let sample_hk = ZeroCopyTest {
        some_bool: true as u8,
        some_u16: U16::from(0x42),
        some_i32: I32::from(-200),
        some_float: 7.7_f32.to_be_bytes(),
    };
    // 1 + 2 + 4 + 4 bytes; every field has alignment 1, so there is no padding.
    let mut buf = [0u8; 11];
    // write_to fails if the buffer length differs from the struct size.
    sample_hk
        .write_to(buf.as_mut_slice())
        .expect("buffer length must equal the struct size");
    println!("{:#04x?}", buf);
}

Output:

[0x01, 0x42, 0x00, 0x38, 0xff, 0xff, 0xff, 0x66, 0x66, 0xf6, 0x40]
[0x01, 0x00, 0x42, 0xff, 0xff, 0xff, 0x38, 0x40, 0xf6, 0x66, 0x66]

Process finished with exit code 0

It would probably be possible to auto-generate both structs in some way, but it would be ideal if I could omit the first structure. Let me know if this is already possible or if there is a better solution. I am still relatively new to Rust as well.

jamesmunns commented 2 years ago

Hey @robamu, with the 1.0 changes all integers are varint encoded, and varints specifically need to be LSB first to achieve byte compression, so the answer to whether postcard will ever support a big-endian wire format is likely to be "no".
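To illustrate why a varint is naturally LSB first, here is a minimal sketch of a LEB128-style encoder. It is written from the general technique and is not taken from postcard's source; the name `encode_varint_u32` is an assumption for illustration.

```rust
/// Hypothetical helper (not postcard's API): LEB128-style varint
/// encoding of a u32. Each output byte carries 7 payload bits, lowest
/// bits first; the top bit is set when more bytes follow.
pub fn encode_varint_u32(mut value: u32) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let low7 = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            // No higher bits remain, so this is the final byte.
            out.push(low7);
            return out;
        }
        // More bits remain: set the continuation flag.
        out.push(low7 | 0x80);
    }
}
```

Emitting the low bits first lets the encoder stop as soon as the remaining high bits are all zero, so `0x42` fits in a single byte while `u32::MAX` needs five.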

HOWEVER, there are already some cases where it may be necessary/preferable to use "not varint encoded" integers. One case that came up was folks using fp16 types, which convert to u16s on the wire. This causes problems because these values hit the "worst case" encoding size, often taking 3 bytes on the wire rather than 2.
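The 3-byte worst case follows from each varint byte carrying only 7 payload bits. The sketch below computes the encoded length of a u16 under that assumption; `varint_len_u16` is a hypothetical helper, not a postcard function.

```rust
/// Hypothetical helper: number of bytes a u16 occupies as a
/// LEB128-style varint (7 payload bits per byte).
pub fn varint_len_u16(value: u16) -> usize {
    // Bits needed to represent the value (0 for the value zero).
    let significant_bits = (16 - value.leading_zeros()) as usize;
    // Round up to whole 7-bit groups; zero still takes one byte.
    std::cmp::max(1, (significant_bits + 6) / 7)
}
```

Any value of 0x4000 or above needs 15 or 16 bits and therefore three bytes. For half-precision floats that includes the bit pattern of 2.0 (`0x4000`) and every negative number, since the sign bit is bit 15 (for example -1.0 is `0xbc00`).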

For this reason, I am likely to provide integer types like FixedU16 that ALWAYS are encoded as [u8; 2] (or whatever the correct size is) on the wire. I don't see any reason why there couldn't also be FixedU16BE or FixedU16<BE> types to go along with this. If there are reasonable Into or From traits I could support behind a feature gate (e.g. Into<zerocopy::U16> for FixedU16, with the "zerocopy" feature), I'd be happy to include those.
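As a rough sketch of what such a fixed-width big-endian type could look like, the following uses only the standard library. The type name `FixedU16Be` and its methods `to_wire` and `from_wire` are assumptions for illustration; they are not an existing postcard API, and a real version would additionally implement the serde traits.

```rust
/// Hypothetical type for illustration only: a u16 that always occupies
/// exactly two bytes on the wire, most significant byte first.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FixedU16Be(pub u16);

impl FixedU16Be {
    /// Wire representation: always [u8; 2], big-endian.
    pub fn to_wire(self) -> [u8; 2] {
        self.0.to_be_bytes()
    }

    /// Rebuild the value from its two wire bytes.
    pub fn from_wire(bytes: [u8; 2]) -> Self {
        FixedU16Be(u16::from_be_bytes(bytes))
    }
}

impl From<u16> for FixedU16Be {
    fn from(value: u16) -> Self {
        FixedU16Be(value)
    }
}

impl From<FixedU16Be> for u16 {
    fn from(value: FixedU16Be) -> Self {
        value.0
    }
}
```

With this shape, `FixedU16Be::from(0x0042).to_wire()` yields `[0x00, 0x42]`, and the size on the wire no longer depends on the value.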

Do you think this might be a reasonable solution for your use cases? (also CC @ryan-summers).

robamu commented 2 years ago

That sounds very promising. I guess I will lose some compression for larger structs with more fields (?) but I think it's perfectly fine for this case. I can probably avoid having structs with duplicate functionality in many places in the code.

jamesmunns commented 2 years ago

Yeah, I think you will always have to trade off "compression" for "zero copy", but I'd like to make it possible to make that choice!

The compression is actually sort of a red herring. It's a nice benefit, but it's really a workaround for the fact that serde doesn't have usize/isize as a first-class data model item, which makes them a portability hazard. Using varints for all integers makes it possible to handle this correctly.
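The portability hazard can be sketched as follows: the raw byte width of a `usize` depends on the target, while a varint of the same value does not. This is an illustration under that assumption, with hypothetical helper names, and is not postcard's code.

```rust
/// Hypothetical helper: width of a raw usize on this target
/// (for example 4 bytes on a 32-bit MCU, 8 on a 64-bit desktop).
pub fn raw_usize_width() -> usize {
    std::mem::size_of::<usize>()
}

/// Hypothetical helper: encode a usize as a LEB128-style varint.
/// The output depends only on the value, not on the target's width.
pub fn encode_usize_varint(value: usize) -> Vec<u8> {
    // Widen first so the loop is identical on every target.
    let mut rest = value as u64;
    let mut out = Vec::new();
    while rest >= 0x80 {
        // Low 7 bits plus the continuation flag.
        out.push((rest as u8 & 0x7f) | 0x80);
        rest >>= 7;
    }
    // Final byte has the continuation flag clear.
    out.push(rest as u8);
    out
}
```

A length of 5 encodes to the single byte `05` whether the sender is a 32-bit or a 64-bit system, so both sides agree on the wire format even though their `usize` widths differ.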