Open kupiakos opened 5 months ago
See also: #590
@jswrenn and I met and discussed potential designs. He has his own design proposal that he'll share at some point. Here's mine.
This design is based on the observation that some users may want to not only validate safety conditions, but actively transform their type - for example to construct a new witness wrapper type. It permits a distinction between the type being constructed and its "raw" equivalent.
This design is also explicitly meant to support custom length fields (#1289). In order to do that, validation can return a rich error message which might be required when a length field cannot be parsed.
We didn't have time during our discussion to think deeply about how this composes with a `#[length]` attribute. For example, is the length extracted before or after calling the user's custom validator? Can the `#[length]` attribute itself provide a "custom extractor" that also has error cases? These will need to be thought through.
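To make the raw-versus-validated distinction concrete, here is a minimal standalone sketch. It does not use zerocopy or `Ptr` at all: `SpanRaw`, `Span`, `SpanError`, `TryFromRaw`, and `span_from_bytes` are hypothetical names invented for illustration, and the trait is a safe, by-value simplification of the idea rather than the proposed API.

```rust
/// Hypothetical raw form: any two bytes are acceptable.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SpanRaw {
    start: u8,
    end: u8,
}

/// Hypothetical witness wrapper: holding a `Span` means `start <= end` was checked.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span(SpanRaw);

/// Rich error type, playing the role of `ValidationError`.
#[derive(Debug, PartialEq)]
enum SpanError {
    WrongLength { expected: usize, got: usize },
    Inverted { start: u8, end: u8 },
}

/// Simplified stand-in for the trait idea (assumption: safe and by-value).
trait TryFromRaw: Sized {
    type Raw;
    type ValidationError;

    fn try_from_raw(raw: Self::Raw) -> Result<Self, Self::ValidationError>;
}

impl TryFromRaw for Span {
    type Raw = SpanRaw;
    type ValidationError = SpanError;

    fn try_from_raw(raw: SpanRaw) -> Result<Self, SpanError> {
        // The user-controlled invariant: the first field must not exceed the second.
        if raw.start <= raw.end {
            Ok(Span(raw))
        } else {
            Err(SpanError::Inverted { start: raw.start, end: raw.end })
        }
    }
}

/// Parse the raw form first, then run the custom validation step.
fn span_from_bytes(bytes: &[u8]) -> Result<Span, SpanError> {
    let raw = match bytes {
        [start, end] => SpanRaw { start: *start, end: *end },
        _ => return Err(SpanError::WrongLength { expected: 2, got: bytes.len() }),
    };
    Span::try_from_raw(raw)
}
```

With these definitions, `span_from_bytes(&[1, 5])` succeeds, while `span_from_bytes(&[5, 1])` fails with `SpanError::Inverted`, which carries more information than a bare `bool` could.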
// Used in the following examples
type MaybeValid<T, A> = Ptr<T, (A, Any, AsInitialized)>;
type MaybeAligned<T, A> = Ptr<T, (A, Any, Valid)>;

unsafe trait TryFromBytes {
    // Set by `#[zerocopy(raw = ...)]`, defaults to `Self`.
    #[doc(hidden)]
    type Raw: TryFromBytes;

    // Set by `#[zerocopy(error = ...)]`, defaults to
    // `()` or similar.
    //
    // Not doc(hidden)!
    type ValidationError;

    // Replaces `is_bit_valid`.
    fn try_from_maybe_valid_raw<A>(
        maybe_raw: MaybeValid<Self::Raw, A>,
    ) -> Result<MaybeAligned<Self, A>, Self::ValidationError>;
}
Here's how this would be used by a hypothetical user:
#[derive(TryFromBytes)]
#[zerocopy(raw = FooRaw, error = FooError, validator = Foo::try_from_raw)]
struct Foo(...);

#[derive(TryFromBytes)]
struct FooRaw(...);

struct FooError(...);

impl Foo {
    fn try_from_raw<A: Aliasing>(
        r: Result<MaybeAligned<Self::Raw, A>, Self::Raw::ValidationError>
    ) -> Result<MaybeAligned<Self, A>, Self::ValidationError> {
        ...
    }
}
Our derive would emit the following impl:
unsafe impl TryFromBytes for Foo {
    type Raw = FooRaw;
    type ValidationError = FooError;

    fn try_from_maybe_valid_raw<A>(
        maybe_raw: MaybeValid<FooRaw, A>,
    ) -> Result<MaybeAligned<Self, A>, Self::ValidationError> {
        let raw_result = FooRaw::is_bit_valid(maybe_raw);
        Foo::try_from_raw(raw_result)
    }
}
As written, this gives the user full power: they are responsible for converting `Self::Raw::ValidationError` into `Self::ValidationError` and `MaybeAligned<Self::Raw>` into `MaybeAligned<Self>`. However, the user may not want to deal with all of these details. Thus, we can abstract somewhat and support the following simplifications:
- The validator takes a `MaybeAligned<Self::Raw>` rather than a `Result`; the framework will handle propagating errors
- The validator returns a `bool` rather than a `Result`; the framework will handle generating `Ok` or `Err` values as appropriate

First, we introduce the following trait and implement it for different function types. Each impl (except for the most general one, which puts all of the onus on the user) carries restrictions:

- If the validator takes a `MaybeAligned<T::Raw, A>` rather than a `Result`, we add a `T::ValidationError: From<<T::Raw as TryFromBytes>::ValidationError>` bound to ensure that, if an error is encountered, the framework can convert the error
- If the validator returns `bool`, we additionally require `T::ValidationError: Default` to ensure that our framework can synthesize new errors when the validator returns `false`
trait Validator<T: TryFromBytes, A, Disambiguator> {
    fn try_from_raw(
        self,
        r: Result<
            MaybeAligned<<T as TryFromBytes>::Raw, A>,
            <<T as TryFromBytes>::Raw as TryFromBytes>::ValidationError,
        >,
    ) -> Result<MaybeAligned<T, A>, T::ValidationError>;
}

// Most general form: the validator handles the raw `Result` itself.
impl<T, F, A> Validator<T, A, ()> for F
where
    T: TryFromBytes,
    F: FnOnce(
        Result<MaybeAligned<T::Raw, A>, <T::Raw as TryFromBytes>::ValidationError>,
    ) -> Result<MaybeAligned<T, A>, T::ValidationError>,
{
    fn try_from_raw(
        self,
        r: Result<MaybeAligned<T::Raw, A>, <T::Raw as TryFromBytes>::ValidationError>,
    ) -> Result<MaybeAligned<T, A>, T::ValidationError> {
        self(r)
    }
}

// The validator only sees a valid raw value; raw errors are converted via `From`.
impl<T, F, A> Validator<T, A, ((),)> for F
where
    T: TryFromBytes,
    T::ValidationError: From<<T::Raw as TryFromBytes>::ValidationError>,
    F: FnOnce(MaybeAligned<T::Raw, A>) -> Result<MaybeAligned<T, A>, T::ValidationError>,
{
    fn try_from_raw(
        self,
        r: Result<MaybeAligned<T::Raw, A>, <T::Raw as TryFromBytes>::ValidationError>,
    ) -> Result<MaybeAligned<T, A>, T::ValidationError> {
        match r {
            Ok(r) => self(r),
            Err(err) => Err(err.into()),
        }
    }
}

// The validator returns `bool`; a `false` result becomes a default error.
impl<T, F, A> Validator<T, A, (((),),)> for F
where
    T: TryFromBytes<Raw = T>,
    T::ValidationError: Default + From<<T::Raw as TryFromBytes>::ValidationError>,
    F: FnOnce(MaybeAligned<T::Raw, A>) -> bool,
{
    fn try_from_raw(
        self,
        r: Result<MaybeAligned<T, A>, <T::Raw as TryFromBytes>::ValidationError>,
    ) -> Result<MaybeAligned<T, A>, T::ValidationError> {
        match r {
            Ok(r) => if self(r) {
                Ok(r)
            } else {
                Err(Default::default())
            },
            Err(err) => Err(err.into()),
        }
    }
}
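The disambiguator parameter is what lets several blanket impls over function types coexist without overlapping. Below is a minimal standalone sketch of that trick, stripped of `Ptr` and `TryFromBytes`. All names (`OutOfRange`, `ResultMarker`, `BoolMarker`, `Validate`, `run`, `is_small`, `halve_even`) are hypothetical, and the marker structs stand in for the `()` / `((),)` tuple markers used above. The sketch uses named functions only; un-annotated closures may need explicit parameter types before the compiler can pick an impl.

```rust
/// Hypothetical error type; `Default` lets the framework synthesize it.
#[derive(Debug, Default, PartialEq)]
struct OutOfRange;

// Marker types that keep the two blanket impls from overlapping.
struct ResultMarker;
struct BoolMarker;

/// Simplified validator trait: `Marker` exists only to select an impl.
trait Validate<T, E, Marker> {
    fn validate(self, value: T) -> Result<T, E>;
}

/// Full-control validators return the `Result` themselves.
impl<T, E, F> Validate<T, E, ResultMarker> for F
where
    F: FnOnce(T) -> Result<T, E>,
{
    fn validate(self, value: T) -> Result<T, E> {
        self(value)
    }
}

/// `bool` validators get `Ok`/`Err` generated by the framework.
impl<T, E, F> Validate<T, E, BoolMarker> for F
where
    E: Default,
    F: FnOnce(&T) -> bool,
{
    fn validate(self, value: T) -> Result<T, E> {
        if self(&value) {
            Ok(value)
        } else {
            Err(E::default())
        }
    }
}

/// Stand-in for the derive-generated call site; `M` is inferred.
fn run<V, M>(validator: V, value: u32) -> Result<u32, OutOfRange>
where
    V: Validate<u32, OutOfRange, M>,
{
    validator.validate(value)
}

/// A `bool`-returning validator.
fn is_small(x: &u32) -> bool {
    *x < 10
}

/// A `Result`-returning validator that also transforms the value.
fn halve_even(x: u32) -> Result<u32, OutOfRange> {
    if x % 2 == 0 {
        Ok(x / 2)
    } else {
        Err(OutOfRange)
    }
}
```

Both `run(is_small, 5)` and `run(halve_even, 8)` compile against the same `run` signature, because only one impl's `FnOnce` bound is satisfiable for each function and that fixes the marker.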
Finally, we can change the derive-generated code like so:
fn try_from_maybe_valid_raw<A>(
    maybe_raw: MaybeValid<FooRaw, A>,
) -> Result<MaybeAligned<Self, A>, Self::ValidationError> {
    let raw_result = FooRaw::is_bit_valid(maybe_raw);
    Validator::try_from_raw(Foo::try_from_raw, raw_result)
}
Note that this also supports closures rather than named validators. For example, if the user specifies `#[zerocopy(validator = |foo| foo.0.is_valid())]`, this desugars as expected:
fn try_from_maybe_valid_raw<A>(
    maybe_raw: MaybeValid<FooRaw, A>,
) -> Result<MaybeAligned<Self, A>, Self::ValidationError> {
    let raw_result = FooRaw::is_bit_valid(maybe_raw);
    Validator::try_from_raw(|foo| foo.0.is_valid(), raw_result)
}
@djkoloski if we added validation context to this design, would it support your rkyv use case? In particular, is the ability to perform mutation in the validator enough to do your fix-up operation?
We'd at a minimum need a slight tweak: we'd have to have a way of signaling that a `TryFromBytes` impl only works on mutable input in order to do the fix-up. But let's assume we've done that for the sake of this question.
These are kinds of validity that users may need to have checked before transmutation from `&[u8]` to `&T`:

- Bit-level validity of `T`, e.g. a `bool` must be either `0` or `1`. This is implemented by the `derive`.
- Library-level validity of `T`, e.g. an invariant that the first field is less than the second. This can be referenced by the `derive` but inherently must be user-controlled.
- Library-level validity of the fields of `T`, based on the above library-level check applied to each field. This is also implemented by the `derive`.
- Length validity: the `length` field is equal to the size of the tail slice. This only applies to dynamically sized structs ending in a slice.

The plan discussed in #5 and #372 is to support the concept of a custom validator, a function or closure provided to `derive(TryFromBytes)` that will always be called before allowing a `TryFromBytes` transmute to succeed.

Open Questions
- […] `TryFromBytes` APIs?
- What if there is both a `#[length]` attribute and a custom validator function and they disagree?
- Given `struct Outer { x: u8, y: Inner }` / `struct Inner { a: [u8; 4], b: [u8] }`, is the validator for `Outer` allowed to communicate a maximum length for the tail slice located inside of `Inner`? What if `Inner` has a custom validator that returns "valid if the tail slice is truncated to N", but `Outer` has a different return from its custom validator?
- What should the validator return: `bool`, `Result<(), CustomError>`, or `Result<Option<usize>, CustomError>` (to communicate a required length)?
- […] `+` overflow checking), or always reject the input (could miss bugs in validators).
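To illustrate the `Result<Option<usize>, CustomError>` option from the list above, here is a rough standalone sketch. It assumes a one-byte length header followed by a tail slice; `LenError`, `validate_len`, and `parse_packet` are hypothetical names, and reading `Ok(Some(n))` as "valid if the tail is truncated to `n`" is only one possible interpretation of that return type.

```rust
/// Hypothetical custom error for length validation.
#[derive(Debug, PartialEq)]
enum LenError {
    MissingHeader,
    TailTooShort { need: usize, have: usize },
}

/// Hypothetical validator.
/// `Ok(None)` means the tail is valid as-is.
/// `Ok(Some(n))` means it is valid only if truncated to `n` bytes.
fn validate_len(header: u8, tail: &[u8]) -> Result<Option<usize>, LenError> {
    let need = usize::from(header);
    if need > tail.len() {
        Err(LenError::TailTooShort { need, have: tail.len() })
    } else if need == tail.len() {
        Ok(None)
    } else {
        Ok(Some(need))
    }
}

/// Stand-in for the framework: runs the validator, then applies the truncation.
fn parse_packet(bytes: &[u8]) -> Result<&[u8], LenError> {
    let (&header, tail) = bytes.split_first().ok_or(LenError::MissingHeader)?;
    match validate_len(header, tail)? {
        Some(n) => Ok(&tail[..n]),
        None => Ok(tail),
    }
}
```

Here `parse_packet(&[2, 10, 20, 30])` yields the two-byte tail `[10, 20]`, while `parse_packet(&[3, 1])` is rejected with `TailTooShort`. The nested `Outer`/`Inner` case is not addressed by this sketch.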