near / borsh-rs

Rust implementation of Binary Object Representation Serializer for Hashing
https://borsh.io/
Apache License 2.0
287 stars 62 forks source link

BorshSerialize derive fails for structs with packed attribute #261

Open Fuuzetsu opened 7 months ago

Fuuzetsu commented 7 months ago

src/main.rs:

#[repr(C, packed)]
#[derive(borsh::BorshSerialize)]
struct Foo(pub u64);

fn main() {
    println!("Hello, world!");
}

Cargo.toml:

[package]
name = "borsh-repro"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
borsh = { version = "1.2.0", default-features = false, features = ["derive"] }

Trying to build this fails with:

[nix-develop]$ cargo check
    Checking borsh-repro v0.1.0 (/tmp/borsh-repro)
error[E0793]: reference to packed field is unaligned
 --> src/main.rs:2:10
  |
2 | #[derive(borsh::BorshSerialize)]
  |          ^^^^^^^^^^^^^^^^^^^^^
  |
  = note: packed structs are only aligned by one byte, and many modern architectures penalize unaligned field accesses
  = note: creating a misaligned reference is undefined behavior (even if that reference is never dereferenced)
  = help: copy the field contents to a local variable, or replace the reference with a raw pointer and use `read_unaligned`/`write_unaligned` (loads and stores via `*p` must be properly aligned even when using raw pointers)
  = note: this error originates in the derive macro `borsh::BorshSerialize` (in Nightly builds, run with -Z macro-backtrace for more info)

For more information about this error, try `rustc --explain E0793`.
error: could not compile `borsh-repro` (bin "borsh-repro") due to previous error

The macro itself expands to:

// Recursive expansion of borsh::BorshSerialize macro
// ===================================================

impl borsh::ser::BorshSerialize for Foo {
    fn serialize<W: borsh::io::Write>(
        &self,
        writer: &mut W,
    ) -> ::core::result::Result<(), borsh::io::Error> {
        borsh::BorshSerialize::serialize(&self.0, writer)?;
        Ok(())
    }
}

The work-around is to write a manual implementation that copies the underlying value first and uses a reference to that, like this:

impl borsh::BorshSerialize for Foo {
    fn serialize<W: borsh::io::Write>(&self, writer: &mut W) -> borsh::io::Result<()> {
        let v = self.0;
        borsh::BorshSerialize::serialize(&v, writer)
    }
}

I'm not sure what the best way forward/workaround is, but I thought I'd open a ticket anyway.

frol commented 7 months ago

@Fuuzetsu Does serde support it? I feel it is a rare case and users may just implement an internal helper struct for serializing/deserializing (kind of like this)

Fuuzetsu commented 7 months ago

@Fuuzetsu Does serde support it? I feel it is a rare case and users may just implement an internal helper struct for serializing/deserializing (kind of like this)

Seems to work out of the box in serde for this example: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=2005b735ca5ca80559abf53f03b36475

dj8yfo commented 4 months ago

serde achieves #[repr(packed)] partially working by surrounding (all) referenced fields with brackets { ... }, which somewhat implicitly copies the field's value.

(it would be borsh::BorshSerialize::serialize(&{self.0}, writer)?; in borsh-s case

It works fine for Copy fields:

#[derive(serde::Serialize)]
#[repr(packed)]
struct Bar {
    x: u8,
    y: u64,
}

and is a compilation error for non-Copy types:

#[derive(serde::Serialize)]
#[repr(packed)]
struct Foo {
    x: RefCell<u8>,
    y: u64,
}
 1  error[E0507]: cannot move out of `self.x` which is behind a shared reference
   --> src/main.rs:12:10
    |
 12 | #[derive(serde::Serialize)]
    |          ^^^^^^^^^^^^^^^^ move occurs because `self.x` has type `RefCell<u8>`, which does not implement the `Copy` trait
    |
    = note: `#[derive(serde::Serialize)]` triggers a move because taking references to the fields of a packed struct is undefined behaviour
    = note: this error originates in the derive macro `serde::Serialize` (in Nightly builds, run with -Z macro-backtrace for more info)

imo replicating serde-s behaviour with packed structs is worse than letting a user transparently work with unaligned references.

dj8yfo commented 4 months ago

a more versatile template for derives, allowing to work with non-Copy types, might look like the following:

use core::mem;
use core::ops::Deref;
use core::ptr;
use std::rc::Rc;

#[repr(packed)]
struct Foo {
    x: Rc<u8>,
    y: u64,
}

impl borsh::ser::BorshSerialize for Foo {
    fn serialize<W: borsh::io::Write>(
        &self,
        writer: &mut W,
    ) -> ::core::result::Result<(), borsh::io::Error> {
        borsh::BorshSerialize::serialize(
            mem::ManuallyDrop::new(unsafe { ptr::read_unaligned(ptr::addr_of!(self.x)) }).deref(),
            writer,
        )?;
        borsh::BorshSerialize::serialize(
            mem::ManuallyDrop::new(unsafe { ptr::read_unaligned(ptr::addr_of!(self.y)) }).deref(),
            writer,
        )?;
        Ok(())
    }
}

fn main() {
    let foo = Foo {
        x: Rc::new(42),
        y: 182,
    };
    println!("{:?}", borsh::to_vec(&foo).unwrap());
}