dermesser / integer-encoding-rs

Integer encoding for primitive integer types: Supports varint/varint+zigzag and fixed-length integer encoding and decoding, and provides synchronous and asynchronous Write/Read types for easily writing/reading integers.
Other
66 stars 16 forks source link

Consider sealing `FixedInt` #29

Open jorgecarleitao opened 2 years ago

jorgecarleitao commented 2 years ago

I was reviewing why we could not use forbid(unsafe_code) as part of #28, and I think that we can trigger UB in safe Rust by an incorrect implementation of FixedInt:

#[test]
fn test_switch() {
    impl FixedInt for i128 {
        // the invariant Self <-> [u8; size_of<Self>] is violated
        const REQUIRED_SPACE: usize = 256;
        fn required_space() -> usize {
            todo!()
        }

        fn encode_fixed(self, dst: &mut [u8]) {
            todo!()
        }

        fn decode_fixed(src: &[u8]) -> Self {
            todo!()
        }

        fn encode_fixed_light<'a>(&'a self) -> &'a [u8] {
            todo!()
        }
    }

    let int = -32767i128;
    let int = int.switch_endianness();  // 💥
}

This case is specific to an invalid REQUIRED_SPACE and we can fix it assert_eq!(size_of::<Self>(), Self::REQUIRED_SPACE); on switch_endianness, but the notion more general - AFAIK the transmutes happening here are only valid "plain old data" types (bytesmuck::Pod).

A more exotic example is #[derive(Clone, Copy, Debug)] struct A<'a>(&'a [u8]); we allow impl FixedInt for A<'_> (even a correct REQUIRED_SPACE of 16 results in UB).

AFAIK we fundamentally can't have the current implementation of switch_endianness without assuming that the implementer fulfills the plain old data invariant.

I can't think of a way of mitigating this without breaking backward compatibility. Specifically, FixedInt must require more invariants than it currently requires, which is a backward incompatible change.

Assuming that we must break backward compatibility and none of the public APIs expects users to implement FixedInt, my preference is to seal the trait FixedInt - we only use to expose functions to some fundamental types (arguably because Rust std does not have traits for to_le_bytes, to_be_bytes, etc.).

This would give us the full flexibility to use it within the constraints we have specifically in the crate (the types that implement the trait). I am confident that it would allow us to remove all unsafe in the crate.

Alternatively, we could use FixedInt to FixedInt: bytesmuck::Pod (which introduces a new dependency :/)

EDIT: encode_fixed_light has an equivalent problem - transmute of structs to byte slices is only valid for bytesmuck::Zeroable structs - Rust allows uninitialized padding bytes, which transmuting to &[u8] results in UB (&[u8] assumes initialized bytes).

dermesser commented 2 years ago

I see the issue, and part of it might already be fixed with this: At least the specific demo problem with i128/switch_endianness():

diff --git a/src/fixed.rs b/src/fixed.rs
index 8866c9e..76afae1 100644
--- a/src/fixed.rs
+++ b/src/fixed.rs
@@ -34,7 +34,7 @@ pub trait FixedInt: Sized + Copy {
     fn switch_endianness(self) -> Self {
         // Switch to intrinsic bswap when out of nightly.
         unsafe {
-            let sl = std::slice::from_raw_parts_mut(transmute::<&Self, *mut u8>(&self), Self::REQUIRED_SPACE);
+            let sl = std::slice::from_raw_parts_mut(transmute::<&Self, *mut u8>(&self), size_of::<Self>());
             sl.reverse();
             *transmute::<*const u8, &Self>(sl.as_ptr())
         }

IIRC I introduced the associated constant REQUIRED_SPACE back when size_of() wasn't a const function yet, to (preliminarily) save some time. In theory this might be replaced with size_of() calls in every place.

I agree that sealing the trait might be a good idea anyway. Do you know specifically if that breaks the backwards compatibility with clients using the trait in its intended form? I.e. not implementing the trait for their own types.

jorgecarleitao commented 2 years ago

I am playing around with the crate - #30 is one option to address this

Yes, not being able to implement the trait is the backward incompatible change if we seal it.