Lokathor / bytemuck

A crate for mucking around with piles of bytes
https://docs.rs/bytemuck
Apache License 2.0
697 stars 77 forks source link

the `FooBits` type generated by `#[derive(CheckedBitPattern)]` should implement `NoUninit` if `Self` is `NoUninit` #215

Open nerditation opened 9 months ago

nerditation commented 9 months ago

summary

currently, the #[derive(CheckedBitPattern)] on a struct Foo expands to code like this:

#[repr(C)]
#[derive(Clone, Copy, ::bytemuck::AnyBitPattern)]
#[cfg_attr(not(target_arch = "spirv"), derive(Debug))]
pub struct FooBits {
    field: <Field as ::bytemuck::CheckedBitPattern>::Bits,
}
unsafe impl ::bytemuck::CheckedBitPattern for Foo {
    type Bits = FooBits;
    #[inline]
    #[allow(clippy::double_comparisons)]
    fn is_valid_bit_pattern(bits: &FooBits) -> bool {
        <Field as ::bytemuck::CheckedBitPattern>::is_valid_bit_pattern(&{ bits.field})
            && true
    }
}

however, the generated FooBits type cannot be used as a buffer which will be later read into, because bytes_of_mut requires the type to be NoUninit + AnyBitPattern, while FooBits doesn't implement NoUninit, even if Foo is NoUninit.

I suggest the derive macro for CheckedBitPattern to add an implmentation of NoUninit for the generated "Bits" type, something like:

unsafe impl NoUninit for FooBits where Foo: NoUninit {}

context

I was using CheckedBitPattern as a validator for very simple binary file format. I could write something like this:

    let mut file = File::open("/path/to/file")?;
    let mut buffer = [0u8; mem::size_of::<Header>()];
    file.read_exact(&mut buffer))?;
    let header: Header = try_cast(buffer)?;

but I would like to use the HeaderBits as buffer, like this:

    let mut file = File::open("/path/to/file")?;
    let mut header = HeaderBits::zeroed();
    file.read_exact(bytes_of_mut(&mut header))?;
    let header: Header = try_cast(header)?;

but the code won't compile, unless I manually add:

unsafe impl NoUninit for HeaderBits {}

my Header type looks like this:

#[derive(Debug, Clone, Copy, NoUninit)]
#[repr(transparent)]
pub struct Magic([u8; 4]);
unsafe impl CheckedBitPattern for Magic {
    type Bits = [u8; 4];
    fn is_valid_bit_pattern(bits: &Self::Bits) -> bool {
        bits == b"\x27smp"
    }
}
#[derive(Debug, Clone, Copy, NoUninit)]
#[repr(C, packed)]
pub struct Version {
    pub major: u8,
    pub minor: u8,
}
unsafe impl CheckedBitPattern for Version {
    type Bits = [u8; 2];
    fn is_valid_bit_pattern(bits: &Self::Bits) -> bool {
        let &[major, minor] = bits;
        major == 1 || (major == 2 && minor <= 2)
    }
}
#[derive(Debug, Clone, Copy, NoUninit, CheckedBitPattern)]
#[repr(C, packed)]
pub struct Header {
    pub magic: Magic,
    pub version: Version,
    pub flags: u16,
}

btw, it would also be convenient if the CheckedBitPattern trait could have a conversion function, e.g. try_from_bits(), something like:

pub unsafe trait CheckedBitPattern: Copy {
    type Bits: AnyBitPattern;
    // Required method
    fn is_valid_bit_pattern(bits: &Self::Bits) -> bool;
    // Provided method
    fn try_from_bits(bits: Self::Bits) -> Result<Self, CheckedCastError> {
        try_cast(bits)
    }
}

or, alternatively, we can add an try_into_checked() method for the generated "Bits" type.

Lokathor commented 9 months ago

@fu5ha this seems like something for you since you're the proc-macro guardian

fu5ha commented 9 months ago

👍 agree with the analysis, feel free to PR and I will review or ping me in a couple weeks and I can hopefully implement it

nerditation commented 9 months ago

I tried to take a look at the code. I find it more complicated then expected. the main issue is, derive macros don't receive the derive attribute itself. for example, given the definition:

#[derive(Foo, Bar)]
struct MyStruct {}

the procedural macro of Foo cannot see Bar is present; same for Bar.

in this case, CheckedBitPattern cannot determine whether the type derives NoUninit or not. I can think of two possible solutions here:

  1. let the user annotate with a helper attribute for the CheckedBitPattern derive macro, something like:

    #[derive(Clone, Copy, NoUninit, CheckedBitPattern)]
    #[checked_bits_type(additional_traits(NoUninit))]
    struct MyStruct {}
    • pros: can derive other traits for the generated Bits type, not limited to NoUninit;
    • cons: traits like NoUninit need to be repeated
  2. use the unstable #![allow_trivial_constraints] feature and put the implementation behind a feature gate.

    the implementation is just as simple as:

    unsafe impl NoUninit for FooBits where Foo: NoUninit {}
    • pros: easy to implement, also covers the case when the type doesn't #derive but manually unsafe impl NoUninit.
    • cons: only works on nightly, and rfc2056 isn't likely to be stablized in the near future.

there's also another "non-solution": in document, ask the user to write multiple #[derive()] separately, and in specific order. for example, from my testing, in the following code:

#[derive(Foo)]
#[derive(Bar)]
struct MyStruct {}

the input token stream of Foo contains the attribute #[derive(Bar)], but not the other way around. so if we write #[derive(NoUninit)] after #[derive(CheckedBitPattern)], the derive macro of CheckedBitPattern will see the NoUninit derive attribute.

but this is a "non-solution", not only because it is delicate and feels "hacky", but most importantly, rustfmt will merge separate #[derive()] attribtes into one single attribute, so it doesn't actually work at all.

at this point, I feel like it's probably just not worth it to add this feature.

anyway, I'm not an expert on procedural macros, maybe I'm missing something obvious. @fu5ha any opinions or suggestions? thanks.