Closed kupiakos closed 6 months ago
This will be a breaking change, so we'll need to make sure we either address it as part of 0.8 or punt it to a future release. I've added it to #671 to make sure we make a decision one way or another before we release 0.8.
Heads up about https://github.com/google/zerocopy/issues/1051, which relates to this, and about https://github.com/google/zerocopy/pull/1059, which resolves #1051 in the opposite direction from this request.
Unless we want even more method duplication (which we'd really like to avoid), it seems that we'll need to just pick one of the following two approaches:
- `FromBytes` methods drop the suffix/prefix, and users who want them need to use `Ref`
- `FromBytes` methods return the suffix/prefix, and users who don't need them need to explicitly discard them

`Ref` API equivalents.

@kupiakos want to make sure you're aware that this is something we're considering and give you a chance to chime in.
> From a survey of uses in the ecosystem

How was this survey performed? Via usage of the current unstable `FromBytes` APIs in open-source code? Did you also include usage of the exact-size APIs that are preceded by a `.get(..SIZE)`?

> it seems that needing the suffix/prefix is more common than not needing it

I don't doubt that keeping the suffix is a common operation, especially in networking code. Firmware, unfortunately, tends to not be public. We need something ergonomic to translate something like this C code, which is extremely prevalent, into fully safe Rust:
```c
int foo(char *data, size_t size) {
    if (size < sizeof(struct Foo)) {
        return ERROR_TOO_SMALL;
    }
    // Foo is a packed struct.
    struct Foo *foo = (struct Foo *)data;
    ...
}
```
Essentially, we want the safe version of a bounds check followed by a pointer cast (also checking alignment if necessary). Requiring the buffer to be exactly the right size, rather than merely large enough, is rare and unnecessarily restrictive. When working with embedded code bases, it's pretty common to have a single scratch buffer that may be larger than a given operation needs, or that is reused as part of the response, such as with TPM.
Doing extra work and then throwing it away is antithetical to good embedded software design. Not only are methods that return more than two pointers a risk for code size bloat (even with inlining), but splitting currently introduces a panic path. The optimizer has an extremely hard job, and it is ideal to limit how much it needs to consider in order to produce optimal results, especially compared to the original C.
The `mut_from_prefix` method as-is currently uses `Ref::new_from_prefix`, but that puts extra burden on the optimizer. I plan on rewriting it to instead start with a `bytes.get(..size_of::<Self>())`. Splitting the bytes is an extra operation, and we should only do that if we actually need to do it.
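As a rough illustration of the `bytes.get(..SIZE)` idea above, here is a minimal std-only sketch. It is not zerocopy's implementation: `prefix_array` and `read_len_prefix` are hypothetical names, and a fixed-size byte array stands in for a `FromBytes` type.

```rust
use std::convert::TryFrom;

/// Returns the first `N` bytes of `bytes` as a fixed-size array reference,
/// or `None` if `bytes` is shorter than `N`. The slice is never split.
/// (Hypothetical helper, not part of zerocopy.)
fn prefix_array<const N: usize>(bytes: &[u8]) -> Option<&[u8; N]> {
    // Bounds check: `get` returns `None` instead of panicking.
    let head = bytes.get(..N)?;
    // `head.len() == N` at this point, so the conversion succeeds.
    <&[u8; N]>::try_from(head).ok()
}

/// Hypothetical example: read a little-endian length field from the front
/// of a scratch buffer that may be larger than needed.
fn read_len_prefix(bytes: &[u8]) -> Option<u32> {
    prefix_array::<4>(bytes).map(|b| u32::from_le_bytes(*b))
}
```

Here the only fallible step is the `get` call, and the bytes after the prefix are simply ignored rather than returned.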
I don't think we can promise anything regarding what code we write internal to zerocopy. We currently make heavy use of the optimizer so that we can have reasonable internal abstractions, and it would be a big blow to the quality and reliability of our codebase to have to start reimplementing things by hand, especially when that involves adding rather than removing instances of the `unsafe` keyword.
That said, I think there's a third way. Would the following work for your use case? It uses unrelated APIs in zerocopy and provides much clearer visibility into exactly which operations are being performed. It would presumably be straightforward to wrap in a utility macro/method if desired, which would allow you to write roughly the code you currently write, but with the guarantees you're looking for.
```rust
// runtime check: do we have sufficient bytes?
let Ok(bytes) = <&[u8; size_of::<Foo>()]>::try_from(buffer) else {
    panic!("wrong number of bytes");
};
// no runtime checks
let des: &Unalign<Foo> = transmute_ref!(bytes);
// runtime check: is the deserialization well-aligned?
let Some(des) = des.try_deref() else {
    panic!("wrong alignment");
};
```
In particular:
- `TryFrom<&[u8]> for &[u8; N]` is in the standard library
- `transmute_ref!` desugars to a single transmute
- `Unalign::try_deref` performs a single alignment check

Closing in favor of https://github.com/google/zerocopy/issues/1051, which was implemented in https://github.com/google/zerocopy/pull/1059.
- `slice_from_prefix` returns `Option<(&[Self], &[u8])>`
- `slice_from_suffix` returns `Option<(&[u8], &[Self])>`
- `mut_slice_from_prefix` returns `Option<(&mut [Self], &mut [u8])>`
- `mut_slice_from_suffix` returns `Option<(&mut [u8], &mut [Self])>`

However, all of the other methods drop the prefix/suffix:
- `FromBytes::read_from_prefix` returns `Option<Self>`
- `FromBytes::read_from_suffix` returns `Option<Self>`
- `FromBytes::ref_from_prefix` returns `Option<&Self>`
- `FromBytes::ref_from_suffix` returns `Option<&Self>`
- `FromBytes::mut_from_prefix` returns `Option<&mut Self>`
- `FromBytes::mut_from_suffix` returns `Option<&mut Self>`

This is desirable: the majority of the time I drop the prefix anyways, and these are convenience methods for methods in `Ref`. `[ref|mut]_from_[prefix|suffix]` in their docs both point to using `Ref` to preserve the prefix/suffix.
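
To make the two return shapes listed above concrete, here is a small std-only sketch. The function names are hypothetical and a fixed-size byte array stands in for `Self`; this only illustrates the signatures under discussion, not zerocopy's actual code.

```rust
use std::convert::TryFrom;

/// Shape used by the `ref_from_prefix`-style methods: only the parsed
/// prefix is returned and the remaining bytes are dropped.
/// (Hypothetical stand-in, not zerocopy's API.)
fn prefix_dropping_rest<const N: usize>(bytes: &[u8]) -> Option<&[u8; N]> {
    <&[u8; N]>::try_from(bytes.get(..N)?).ok()
}

/// Shape used by the `slice_from_prefix`-style methods: the parsed prefix
/// is returned together with the remaining bytes.
/// (Hypothetical stand-in, not zerocopy's API.)
fn prefix_with_rest<const N: usize>(bytes: &[u8]) -> Option<(&[u8; N], &[u8])> {
    if bytes.len() < N {
        return None;
    }
    // The length check above guarantees `split_at` does not panic here.
    let (head, rest) = bytes.split_at(N);
    Some((<&[u8; N]>::try_from(head).ok()?, rest))
}
```

A caller of the second shape who does not need the remainder has to discard it explicitly, for example with `let (head, _) = prefix_with_rest::<2>(buf)?;`.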