reinerp / varlen-rs

Ergonomic variable-length types in Rust
BSD 3-Clause "New" or "Revised" License
7 stars 1 forks source link

Odd sizing when using smaller index types in new varlen objects #3

Closed Dekker1 closed 1 year ago

Dekker1 commented 1 year ago

In my application I'm dealing with essentially small arrays containing u32 (or at least something similar). There are many (thousands) of them, and each will need some metadata. As such as I was hoping to define it using varlen as:

#[define_varlen]
pub(crate) struct A {
    header: u16,
    // other_header: u8,
    #[varlen]
    data: Array<u32, u8>,
}

However, this seems to result in header (and other_header) in a 32 bit block, then the length of in a separate 32-bit block. I would have assumed that only the first 32-bit block would be required, and that the following test case would succeed.

let x: VBox<A> = VBox::new(a::Init {
    header: 5u16,
    // other_header: 2u8,
    data: Array::try_clone_from_slice(&[1u32]).unwrap(),
});
assert_eq!(
    x.calculate_layout().size,
    size_of::<u32>() + // headers + (maybe alignment) + size
        x.refs().data.len() * size_of::<u32>() // array elements
);

As an alternative, I though I could use the macros to define the type instead to see if that would work:

#[define_varlen]
pub(crate) struct A {
    header: u16,
    // other_header: u8,
    #[controls_layout]
    len: u8,
    #[varlen_array]
    data: [u32; *len],
}

However, this fails because of the type of len:

error[E0308]: mismatched types
  --> src/a.rs:70:14
   |
63 | #[define_varlen]
   | ---------------- expected `usize` because of return type
...
70 |     data: [u32; *len],
   |                 ^^^^ expected `usize`, found `u8`

Any help with a solution would be appreciated

reinerp commented 1 year ago

Explaining this scenario:

#[define_varlen]
pub(crate) struct A {
    header: u16,
    // other_header: u8,
    #[varlen]
    data: Array<u32, u8>,
}

In this case Array<u32, u8> has its own alignment (4 bytes) and size (((len+1)*4) bytes), both of which are independent of the struct that the Aray<u32, u8> is embedded into. This is necessary so that you can pass around an &Array<u32, u8> and it means the same thing regardless of whether it came from struct A or struct B. Unfortunately this wastes some space on padding, similar to how struct boundaries in Rust can also cause some wasted space due to padding.

You're going in the right direction with your second approach. You just need to fix the type error: [u32; *len] needs the length to be of type usize, but in your case it's of type u8. The length can be an arbitrary const-evaluatable expression, so the the solution is simple: just add the cast. The following should work, I believe:

#[define_varlen]
pub(crate) struct A {
    header: u16,
    // other_header: u8,
    #[controls_layout]
    len: u8,
    #[varlen_array]
    data: [u32; *len as u8],
}
Dekker1 commented 1 year ago

Thank you for the explanation. That does indeed solve my problem. I just have to fix the cast to go to usize in your fragment:

    data: [u32; *len as usize],

I had somehow assumed that the #[define_varlen] and #[varlen] would (or at least be able to) split the length from the data, but your explanation about being able to pass (references) around makes a lot of sense.

Maybe you could add some warnings when creating Array types where the Len type is smaller than the element type. It that constructing these combination would never make sense from a memory saving perspective.

Maybe the use-case where the length is packed with other data would also be a good (additional) example for the documentation.

Thanks for the help and for the great crate. It is exactly what I needed!