google / zerocopy

https://discord.gg/MAvWH2R6zk
Apache License 2.0
1.58k stars 104 forks source link

Soundness hole in `#[derive(IntoBytes)]` on types with `#[repr(align)]` #1748

Closed joshlf closed 1 month ago

joshlf commented 1 month ago

Consider the following type:

#[derive(IntoBytes)]
#[repr(C, align(8))]
struct Foo<T> {
    t: T,
}

#[derive(IntoBytes)] emits an IntoBytes impl for Foo with a T: Unaligned bound. The reasoning is based on the repr(C) layout algorithm, but this reasoning is unsound in the presence of #[repr(align(8))], which #[derive(IntoBytes)] spuriously ignores.

In particular, Foo<u8> satisfies u8: Unaligned, but has size 8 (7 bytes of padding) in order to satisfy its alignment requirement.

We need to either ban #[repr(align(...))] in #[derive(IntoBytes)] or at least ban it when generics are present.

joshlf commented 1 month ago

I took a stab at fixing this:

fn derive_into_bytes_struct(ast: &DeriveInput, strct: &DataStruct) -> proc_macro2::TokenStream {
    let reprs = try_or_print!(STRUCT_UNION_INTO_BYTES_CFG.validate_reprs(ast));
    let is_transparent = reprs.contains(&StructRepr::Transparent);
    let is_packed = reprs.contains(&StructRepr::Packed) || reprs.contains(&StructRepr::PackedN(1));
    let is_aligned = reprs.iter().any(|r| matches!(r, StructRepr::Align(_)));
    let num_fields = strct.fields().len();

    let (padding_check, require_unaligned_fields) = if is_transparent || is_packed {
        // No padding check needed.
        // - repr(transparent): The layout and ABI of the whole struct is the
        //   same as its only non-ZST field (meaning there's no padding outside
        //   of that field) and we require that field to be `IntoBytes` (meaning
        //   there's no padding in that field).
        // - repr(packed): Any inter-field padding bytes are removed, meaning
        //   that any padding bytes would need to come from the fields, all of
        //   which we require to be `IntoBytes` (meaning they don't have any
        //   padding).
        (None, false)
    } else if reprs.contains(&StructRepr::C) && !is_aligned && num_fields <= 1 {
        // No padding check needed. A repr(C) struct with zero or one field has
        // no padding so long as it doesn't have an explicit repr(align)
        // attribute.
        (None, false)
    } else if ast.generics.params.is_empty() {
        // Since there are no generics, we can emit a padding check. This is
        // more permissive than the next case, which requires that all field
        // types implement `Unaligned`.
        (Some(PaddingCheck::Struct), false)
    } else if !is_aligned {
        // Based on the allowed reprs, we know that this type must be repr(C) by
        // the time we get here, but the soundness of this impl relies on it, so
        // let's double-check.
        assert!(reprs.contains(&StructRepr::C));
        // We can't use a padding check since there are generic type arguments.
        // Instead, we require all field types to implement `Unaligned`. This
        // ensures that the `repr(C)` layout algorithm will not insert any
        // padding so long as the type doesn't have an explicit repr(align)
        // attribute.
        //
        // TODO(#10): Support type parameters for non-transparent, non-packed
        // structs without requiring `Unaligned`.
        (None, true)
    } else {
        assert!(is_aligned, "this branch is unreachable without a `#[repr(align)]` attribute");

        return Error::new(
            Span::call_site(),
            "`#[repr(align)]` might introduce padding; consider removing it",
        )
        .to_compile_error();
    };

Relative to fbb0f8bcd3ff5989ccd9e640427cdb014da95168, here's the diff:

 fn derive_into_bytes_struct(ast: &DeriveInput, strct: &DataStruct) -> proc_macro2::TokenStream {
     let reprs = try_or_print!(STRUCT_UNION_INTO_BYTES_CFG.validate_reprs(ast));
     let is_transparent = reprs.contains(&StructRepr::Transparent);
-    let is_packed = reprs.contains(&StructRepr::Packed);
+    let is_packed = reprs.contains(&StructRepr::Packed) || reprs.contains(&StructRepr::PackedN(1));
+    let is_aligned = reprs.iter().any(|r| matches!(r, StructRepr::Align(_)));
     let num_fields = strct.fields().len();

     let (padding_check, require_unaligned_fields) = if is_transparent || is_packed {
@@ -806,16 +807,17 @@ fn derive_into_bytes_struct(ast: &DeriveInput, strct: &DataStruct) -> proc_macro
         //   which we require to be `IntoBytes` (meaning they don't have any
         //   padding).
         (None, false)
-    } else if reprs.contains(&StructRepr::C) && num_fields <= 1 {
+    } else if reprs.contains(&StructRepr::C) && !is_aligned && num_fields <= 1 {
         // No padding check needed. A repr(C) struct with zero or one field has
-        // no padding.
+        // no padding so long as it doesn't have an explicit repr(align)
+        // attribute.
         (None, false)
     } else if ast.generics.params.is_empty() {
         // Since there are no generics, we can emit a padding check. This is
         // more permissive than the next case, which requires that all field
         // types implement `Unaligned`.
         (Some(PaddingCheck::Struct), false)
-    } else {
+    } else if !is_aligned {
         // Based on the allowed reprs, we know that this type must be repr(C) by
         // the time we get here, but the soundness of this impl relies on it, so
         // let's double-check.
@@ -823,11 +825,20 @@ fn derive_into_bytes_struct(ast: &DeriveInput, strct: &DataStruct) -> proc_macro
         // We can't use a padding check since there are generic type arguments.
         // Instead, we require all field types to implement `Unaligned`. This
         // ensures that the `repr(C)` layout algorithm will not insert any
-        // padding.
+        // padding so long as the type doesn't have an explicit repr(align)
+        // attribute.
         //
         // TODO(#10): Support type parameters for non-transparent, non-packed
         // structs without requiring `Unaligned`.
         (None, true)
+    } else {
+        assert!(is_aligned, "this branch is unreachable without a `#[repr(align)]` attribute");
+
+        return Error::new(
+            Span::call_site(),
+            "`#[repr(align)]` might introduce padding; consider removing it",
+        )
+        .to_compile_error();
     };

Unfortunately, that doesn't work as written: validate_reprs strips out any align reprs before returning on the theory that "no callers care about them":

https://github.com/google/zerocopy/blob/fbb0f8bcd3ff5989ccd9e640427cdb014da95168/zerocopy-derive/src/repr.rs#L51-L54

We should probably take this opportunity to refactor our repr handling logic since many of our derives now do care about alignment more than they used to. We could also use the opportunity to get rid of this brittle special-cased logic for #[derive(Unaligned), which currently lives inside validate_reprs:

https://github.com/google/zerocopy/blob/fbb0f8bcd3ff5989ccd9e640427cdb014da95168/zerocopy-derive/src/repr.rs#L58-L67

joshlf commented 1 month ago

I've begun to draft an overhaul of repr parsing and in-memory representation in https://github.com/google/zerocopy/pull/1752. My plan is to follow up with a fix to this issue once that lands.