hecatia-elegua / bilge

Use bitsized types as if they were a feature of rust.
Apache License 2.0
171 stars 17 forks source link

pack/unpack a `bilge` struct to/from `&[u8]` #33

Open jimy-byerley opened 1 year ago

jimy-byerley commented 1 year ago

Hello, that's me again What is the best way to safely pack/unpack a bilge struct to/from a byte slice ?

I have found the following in the readme, but not exactly what I am looking for:

#[bitsize(N)]
struct MyStruct {}

// safe way to cast from/to integers
let v = MyStruct::new(0_uN)
let n = v.value()
// unsafe way to cast from/to byte array
let v = unsafe{ std::mem::transmute<&[u8; N], &MyStruct>() }
let a = unsafe{ std::mem::transmute<&MyStruct, &[u8; N]>() }

Did I missed something for safely doing the same with byte slices or byte arrays ?

hecatia-elegua commented 1 year ago

Uhh, where in the readme did you find this? Also the example is wrong, it would be more like [u8; N/8]. I didn't play around with transmute yet, so if that fucks up the ordering, please tell me.

Besides that, from can be optimized well (sadly it's not const unless you can use a slightly older nightly version). So you can use this:

let my_struct = MyStruct::from(0);
uN::from(my_struct).to_le_bytes();

(limited to multiples of a byte)

jimy-byerley commented 1 year ago

I might have found this in the tests and examples as well :sweat_smile:

Also the example is wrong, it would be more like [u8; N/8]

Sure

Isn't using uN::to_le_bytes a problem, since bilge already set the bit and byte ordering an its way, converting the resulting uN to some specific ordering might break the things ? Also this code is specific to the actual struct size, since one must know the uN type matching the struct ... This cannot be used in generic code and might change each time the struct size changes

hecatia-elegua commented 1 year ago

on le:

specific to the actual struct size

yeah, not sure how this is supposed to work otherwise... do you want a stream of bytes - when does it stop? should it only be possible for multiples of bytes?

jimy-byerley commented 1 year ago

Hum .. so to_XX_bytes does not convert when the endianness is XX ... :thinking: assuming we have a bilge struct named MyStruct and as a user I do not know whether it is internally little- or big- endian, which one of to_le_bytes or to_be_bytes should I use to the the byte content of the struct ?

Likewise, assuming I have MyStruct and as a user I do not know its byte size, how can I convert it to bytes ?

I'm asking all this coming from crates packed_struct and packing which both implement a packing/unpacking trait for their structs. So every struct has methods pack(Self) -> [u8;N] and unpack([u8; N]) -> Self (or similar) we can call without worrying about the structs layout. It allows to write code that is not aware of the struct's internals. It also enables the use of these structs in generics. Coming from this I can witness it is convenient :) but bilge has a greater syntax and is more efficient, so I'm trying to find how to use bilge in places I used packing before

hecatia-elegua commented 1 year ago

See to_le_bytes and to_le.

and as a user I do not know whether it is internally little- or big- endian

The author of the struct would need to provide a method to do that. For generics we currently have a trait Bitsized, but honestly I haven't played around with this usecase. I'd have to see what pack/unpack does if the struct is not a byte-multiple (edit: you specify exact bit and byte ranges, so everything else is probably zero).

I only found generic usage here: https://github.com/ethercrab-rs/ethercrab/blob/9b55183df07691c00465ebe7e253a4eb71c27b0a/src/slave/mod.rs#L213 So that could be an extension trait to Bitsized which provides byte conversion, or something like that. Bitsized contains the size in BITS and the MAX value of its underlying uN.

https://github.com/ethercrab-rs/ethercrab/blob/9b55183df07691c00465ebe7e253a4eb71c27b0a/src/generate.rs#L28 That as well. I'm thinking bilge should just do as little as possible, just to define bitfields well. If byte-array handling can be moved outside, that would be great.

jimy-byerley commented 1 year ago

only found generic usage here

In ethercrab, There is also the traits PduData and PduRead which are doing the same for other structs. You would find a lot of places where they are used.

I'm thinking bilge should just do as little as possible, just to define bitfields well. If byte-array handling can be moved outside, that would be great.

Well, okay. Do you think I could eventually define such trait with a derive to add aside FromBits and FromDebug in struct attributes :thinking: ?

hecatia-elegua commented 1 year ago

Yes. I also would be up to provide a feature-gated thing to do this. I would invite you to try defining such a trait first and putting up a draft PR, if you'd like. Not sure how the design would be best accomplished, e.g. I would limit the trait to structs and enums with underlying type as multiples of u8 (u24 does have to_le_bytes etc.), but that might not be possible with the current types.

impl ByteArray for Struct
where
    Struct: Bitsized,
{
    fn to_le_bytes(self) -> [u8; Self::BITS >> 3] { // the size needs to be calculated differently for non byte-multiples
        // arbitrary-int just provides these for byte-multiples, otherwise it's not clear how to fill up e.g. u4's 4 missing bits
        Self::ArbitraryInt::from(self).to_le_bytes()
        // self.value.to_le_bytes() //would also work
    }
}
hecatia-elegua commented 1 year ago

We will solve this together with #7.

@jimy-byerley packing/packed_struct works on byte arrays, so only on things which are a multiple of u8. With arbitrary_int and bilge you can define things smaller than u8. That's the main thing I wonder how to solve still.

Since we'll add this based on a derive, we can start by just supporting structs which are a multiple of u8.

jimy-byerley commented 1 year ago

Good news ! For structures smaller than u8, or structures that are not multiple of u8, I imagine filling the rest with zeroes is also acceptable in my opinion.

By the way, I'm using bilge now in a project named etherage Where you can find many places like here where I'm defining bilge structs, and calling a custom macro to add them an implementation of trait PduData for casting them front/to &[u8]. That is my workaround at time moment for dealing with this issue.

hecatia-elegua commented 1 year ago

@jimy-byerley haha nice, I looked into etherage a little already. Filling with zeroes would be the idea, since we convert the whole thing at once anyways. Just had this thought: struct Inner(u4) == u8 and struct Struct(Inner, Inner) == u8