Lokathor / bytemuck

A crate for mucking around with piles of bytes
https://docs.rs/bytemuck
Apache License 2.0
725 stars 79 forks source link

Add generic way of converting the item type of a container #184

Open Jarcho opened 1 year ago

Jarcho commented 1 year ago

This is capable of replacing all the cast_* functions except the cast itself. No more need for cast_ref, cast_slice cast_box, cast_vec and such and any mutable variants. Also moves some errors to post-mono compiler errors rather than runtime errors.

This is not implementable with GAT's, so this has an msrv of 1.65.

Docs and testing still need to be done, as well as implementing all the containers in alloc.

fixes #133 fixes #167 fixes #136

Lokathor commented 1 year ago

CI failed on the 1.34 version. This whole deal would have to go behind a cargo feature, given the crate's MSRV. Then we can have that feature enabled only for the "stable" and later CI runs.

That said, I'm extremely intrigued by this proposal. I was initially skeptical how you'd have it handle the difference between owned containers and borrowed containers, but that seems to be captured by the pair of "CastPtr" and "CastMetadata". I'm going to link this in the #dark-arts channel of the community discord and perhaps other places to get a lot of eyes on it, a whole new 500 line unsafe mini-framework is a lot to go over for just one person.

Jarcho commented 1 year ago

Ideally CastMetadata would only need to exist for Vec<_> and CastPtr would handle slice pointers, but there's currently no safe way to get the length from them. This would remove all the duplication the comes from having to treat Container<_> and Container<[_]> differently.

There difference between owned and borrowed containers is handled by the Ctor and Container traits. They essentially abstracts away unwrapping a container to it's raw parts, and re-wrapping it with a different item type.

Ctor essentially emulates HKT's by providing a different implementation for each container. This lets us refer to just the container (e.g. Box), rather than a container with a specific item type (e.g. Box<u32>).


I'm going to work on getting Option<Container<T>> and Cow<T> and what kind of changes need to be made for those. Box<str> to Box<[u8]> would be cool to if it's possible.

zachs18 commented 1 year ago

Wow! Great concept here! My old draft PR #134 was similar but only handled private API and didn't go this in-depth, but could still perhaps be useful to look at.


I got nerd-sniped a bit, so I added Option<C>, Box/Arc/Rc<T>/<[T]>, and Vec<T> support on my fork and added some tests (Feel free to merge/cherry-pick. See below for why I added the *Ptr types to implement CastPtr).

Currently the PodCastError kinds (at least in my fork, e.g. definitely for Vec) may not be the same as those returned by the existing cast_* free functions, which needs to be checked/fixed/have tests added for.


Minor naming nit: In the rest of bytemuck, there's usually a cast_foo (that succeeds or panics) and a try_cast_foo (that returns a Result), but cast_in_place returns a Result. I'd propose renaming to try_cast_in_place and having cast_in_place succeed or panic.


Since the CastPtr trait is where the alignment and type-safety checking happens, I believe there may need to be three different implementors, e.g. SharedBorrowingPtr<T>, UniqueBorrowingPtr<T>, OwningPtr<T>:

(Edit: actually, I guess the alignment check could just be put in CastMetadata, but that would then require having separate metadata types for &[T] and Box<[T]>.)


I don't know that Cow<T> could work with this API. For T: Sized, it probably can't work because Cow::Owned owns a T directly, not behind a (castable) pointer. For Cow<[T]> it probably can't work (as-is, at least) because there's no place to put the where T: Clone bound needed to even make Cow<[T]> a well-formed type for type Create<T: 'a>.

Edit 2: Actually, perhaps the GATs could just have <T: Copy + 'a>, since all current bytemuck-castable types are Copy anyway (since NoUninit and AnyBitPattern have Copy as supertraits). I've implemented this in my latest commit in my fork.

Jarcho commented 1 year ago

Ended up rewriting this.

There's no longer a separation between casting the data and the metadata. It didn't really help with anything and added the complication of splitting and combining it in the first place. It's now been replaced with a single CastRaw trait.

The interface has been changed into a falliable/infalliable version. The infalliable version handles only the cases which can't fail and the falliable version handles all casts. This differs from how the current split works, but it's matches how From/TryFrom works and draws a more meaningful distinction between the two. A pair of traits for falliable/infalliable by value conversions was also added to match.

GAT's are no longer required, lowering the msrv to 1.57 (required for panic! in constants). We only ever need to refer to the two instances we already had as types so the lack of a type constructor can be worked around.

The following types are now supported: &[mut] T, *const/mut T, NonNull<T>, AtomicPtr<T>, Pin<Container<T>>, Option<Container<T>>, Box<T>, Rc[Weak]<T>, Arc[Weak]<T>, Cow<T> and Vec<T> plus slice versions where applicable. Casting between T and [U] and from &mut T to &T.

Jarcho commented 1 year ago

The container shared-owns the value (e.g. Rc/Arc), so no writes can occur unless the container has guaranteed that it is uniquely owned, in which case the writes cannot be observed as the source type (Note: This would be relaxing bytemuck's current bounds. see also: https://github.com/Lokathor/bytemuck/pull/132#issuecomment-1234690814 . If this relaxation is not wanted, then another CastPtr type would be necessary).

By the sounds of the tracking issue it's likely to be the case that the bound can be relaxed. Still better to wait for that to be resolved.

Jarcho commented 1 year ago

Should be the last big change. Added a RawPointer trait to normalize the differences between regular pointers and dst pointers.

This also adds str to the list of supported types. Though that is technically not defined, it must be layout compatible with [u8] for std's interface to work.

Still need tests for alloc types and compile tests for everything.

Lokathor commented 1 year ago

I'm happy to see this making interesting progress, though I'll admit I haven't looked closely at the details yet.

Jarcho commented 1 year ago

Fun thing about panic in associated constants, cargo check doesn't evaluate them. This means actually building the code is required for the errors to appear.

trybuild will need to be changed to use cargo build instead of cargo check to test for them.

Lokathor commented 1 year ago

ah.

Probably this will get worse and worse with time, and rustc will have like.. cargo check --consts eventually or something. But until then a CI change is totally fine.

Jarcho commented 1 year ago

Figured out how to make this work. An undocumented feature implementation detail of trybuild to get it to run cargo build, but it works.

The tests need to run as a separate crate in the same workspace. trybuild has a higher msrv than 1.36 and dev dependencies can't be optional.

Jarcho commented 1 year ago

Should be in a working state now. A uitest workspace member was added and ci has a change to run it only on the stable channel. Hopefully the error messages don't change between versions too much, but diagnostic messages aren't stable from release to release.

Some more changes have been done to make sure the error message points to the call to [try_]reinterpret[_inner] and not somewhere in bytemuck's code.

Lokathor commented 1 year ago

are we ready to transition into the review phase?

Jarcho commented 1 year ago

I'll try to fix the error message for using ReinterpretInner with incompatible types. It's currently a monstrosity that doesn't even point out the issue:

error[E0599]: the method `reinterpret_inner` exists for reference `&Padded`, but its trait bounds were not satisfied
  --> tests/ui/reinterpret_incompatible.rs:15:33
   |
7  | struct Padded(u16, u8);
   | -------------
   | |
   | doesn't satisfy `<_ as Container<'_>>::Class = _`
   | doesn't satisfy `Padded: ReinterpretInner<'_, _>`
   | doesn't satisfy `Padded: bytemuck::cast::Container<'_>`
   | doesn't satisfy `Padded: std::marker::Copy`
...
15 |   let _: &u32 = (&Padded(0, 0)).reinterpret_inner();
   |                                 ^^^^^^^^^^^^^^^^^ method cannot be called on `&Padded` due to unsatisfied trait bounds
   |
   = note: the following trait bounds were not satisfied:
           `<&Padded as bytemuck::cast::Container<'_>>::Class = _`
           which is required by `&Padded: ReinterpretInner<'_, _>`
           `&Padded: bytemuck::cast::Container<'_>`
           which is required by `&Padded: ReinterpretInner<'_, _>`
           `*const Padded: bytemuck::cast::RawPtr`
           which is required by `&Padded: ReinterpretInner<'_, _>`
           `Padded: std::marker::Copy`
           which is required by `&Padded: ReinterpretInner<'_, _>`
           `*const Padded: bytemuck::cast::CastRaw<_>`
           which is required by `&Padded: ReinterpretInner<'_, _>`
           `*const &Padded: bytemuck::cast::CastRaw<_>`
           which is required by `&&Padded: ReinterpretInner<'_, _>`
           `*mut &Padded: bytemuck::cast::CastRaw<_>`
           which is required by `&mut &Padded: ReinterpretInner<'_, _>`
           `<Padded as bytemuck::cast::Container<'_>>::Class = _`
           which is required by `Padded: ReinterpretInner<'_, _>`
           `Padded: bytemuck::cast::Container<'_>`
           which is required by `Padded: ReinterpretInner<'_, _>`
           `<&mut Padded as bytemuck::cast::Container<'_>>::Class = _`
           which is required by `&mut Padded: ReinterpretInner<'_, _>`
           `&mut Padded: bytemuck::cast::Container<'_>`
           which is required by `&mut Padded: ReinterpretInner<'_, _>`
           `*mut Padded: bytemuck::cast::RawPtr`
           which is required by `&mut Padded: ReinterpretInner<'_, _>`
           `Padded: std::marker::Copy`
           which is required by `&mut Padded: ReinterpretInner<'_, _>`
           `*mut Padded: bytemuck::cast::CastRaw<_>`
           which is required by `&mut Padded: ReinterpretInner<'_, _>`
note: the trait `bytemuck::cast::Container` must be implemented
  --> $WORKSPACE/src/cast.rs
   |
   | pub unsafe trait Container<'a>: Sized {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: consider annotating `Padded` with `#[derive(Clone, Copy)]`
    |
7   | #[derive(Clone, Copy)]
    |

For reference the error message for Reinterpret is both shorter and points to NoUninit not being implemented:

error[E0599]: the method `reinterpret` exists for struct `Padded`, but its trait bounds were not satisfied
  --> tests/ui/reinterpret_incompatible.rs:10:29
   |
7  | struct Padded(u16, u8);
   | -------------
   | |
   | method `reinterpret` not found for this struct
   | doesn't satisfy `Padded: NoUninit`
   | doesn't satisfy `Padded: Reinterpret<_>`
...
10 |   let _: u32 = Padded(0, 0).reinterpret();
   |                             ^^^^^^^^^^^ method cannot be called on `Padded` due to unsatisfied trait bounds
   |
   = note: the following trait bounds were not satisfied:
           `Padded: NoUninit`
           which is required by `Padded: Reinterpret<_>`
           `&Padded: NoUninit`
           which is required by `&Padded: Reinterpret<_>`
           `&mut Padded: NoUninit`
           which is required by `&mut Padded: Reinterpret<_>`
note: the trait `NoUninit` must be implemented
  --> $WORKSPACE/src/no_uninit.rs
   |
   | pub unsafe trait NoUninit: Sized + Copy + 'static {}
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Lokathor commented 1 year ago

...once again the "note" on an error message is way too much until it's anti-helpful.

But sure, if you think you can somehow adjust it to make the error messages better please do.