Avoid constructing a specific variant of an enum when using `DefaultMutator`

teymour-aldridge commented 2 years ago

Is it possible to instruct Fuzzcheck not to use a specific variant of an enum?

e.g.

#[derive(DefaultMutator)]
pub enum X {
  // fuzzcheck should construct this one
  SimpleType(String),
  // fuzzcheck should also construct this one
  AlsoSimpleType(i32),
  // fuzzcheck should not construct this one
  DoNotConstruct(HashMap<String, Vec<(usize, i32, String)>>
}

If it isn't, I'd be happy to implement this.

loiclec commented 2 years ago

Technically, yes, but there is no good API to construct such a mutator yet. The way enum mutators are created is as follows (the information below is slightly simplified):

A mutator that can only generate one variant of the enum is created. It has the type:
```
enum XSingleVariant<M1, M2, M3> {
SimpleType(M1),
AlsoSimpleType(M2),
DoNotConstruct(M3),
}
```
If you construct a single-variant mutator with XSingleVariant::SingleType(String::default_mutator()) then it will only produce values like X::SimpleType("some_string").
Then we create an AlternationMutator<XSingleVariant<M1, M2, M3>>. AlternationMutator is a generic kind of mutator that alternates between different submutators (fun fact: it is also used in grammar-based mutators for grammars like "a|b").
```
fn create_mutator(m1: M1, m2: M2, m3: M3) {
let mutator = AlternationMutator::new(vec![XSingleVariant::SimpleType(m1), XSingleVariant::AlsoSimpleType(m2), XSingleVariant::DoNotConstruct(m3)]);
}
```
Then we wrap this mutator into a new type XMutator<M1, M2, M3> and make it public.

So the way to avoid constructing a specific variant is “simply” not to pass one of the single-variant mutators into the AlternationMutator and replace its corresponding generic type (e.g. M3 in this case) by NeverMutator. A NeverMutator is a mutator that can't be constructed and which implements Mutator<T> for any T. So in the end, we need something like this:

struct XMutator<M1, M2> {
    mutator: AlternationMutator<XSingleVariant<M2, M2, NeverMutator>>,
}
impl<M1, M2> XMutator<M1, M2> {
    fn new(m1: M1, m2: M2) -> Self {
        Self {
            mutator: AlternationMutator::new(vec![XSingleVariant::SimpleType(m1), XSingleVariant::AlsoSimpleType(m2)])
        }
    }
}

But the explanation above is simplified because the types M1, M2, M3 don't actually implement the Mutator trait. Instead, they implement TupleMutator, which is exactly the same conceptually but can mutate tuples instead of single values. This is necessary because enums can have more than one field per variant.

So it is not too complicated, but it involves work on the fuzzcheck_mutators_derive crate, which is a bit difficult to get into for someone unfamiliar with it. And I am going on holidays next week, so I am not sure when I can implement it. Let me know if you're interested in implementing it, then I can look into it and guide you through it :-) But no pressure, I can write it myself otherwise (... eventually 😅).

teymour-aldridge commented 2 years ago

I'm interesting in implementing it!

loiclec commented 2 years ago

Great, thank you! I am going to bed now but I'll send you more information tomorrow :)

teymour-aldridge commented 2 years ago

Thanks! Sleep well xD

loiclec commented 2 years ago

Good morning!

As promised, here is more information about how to modify the procedural macro to avoid certain variants :)

First, I would expect that only the fuzzcheck_mutators_derive crate should be modified. I don't think anything else needs changing (but I could be wrong). However, you may want to add a test in fuzzcheck/tests/expansions to make sure everything still works.

Below is a short explanation of the code:

decent_synquote_alternative

You probably aren't going to like this, but the macros don't use syn nor quote. Instead, it uses decent_synquote_alternative, which is a crate I wrote about a year ago. It provides a subset of the functionality of syn and quote. The key difference that you need to know is that TokenStreams aren't built with a quote! macro but instead with a ts! or join_ts! macro, like so:

let stream = ts!(
    "#[derive(Clone)]
    struct" my_data.identifier "{"
        join_ts!(my_data.fields.iter(), (field_ident, ty), {
            field_ident ":" ty
        })
    "}"
);

The grammar for the ts! macro is simply a space-separated list of things that can be converted into a TokenStream. That includes strings, other token streams, but also parsed Rust code, such as types, identifiers, attributes, where clauses, structs, enums, etc. For example:

fn print_enum(e: &Enum) -> TokenStream {
    ts!(e)
}

Arbitrary expressions also work:

fn print_enum(ident: Option<Ident>) -> TokenStream {
    ts!(
        "enum"
        if let Some(ident) = ident {
            ts!(ident)
        } else {
            ts!("SomeEnum")
        }
        "{

        }"
    )
}

join_ts! is useful to concatenate a list of token streams given by an iterator. For example, if you have a vector such as ["u8", "bool", "Vec<bool>"] and you'd like to write the following function signature:

fn foo(x1: u8, x2: bool, x3: Vec<bool>)

then you can write:

ts!("
    fn foo("
    join_ts!(data.iter().enumerate(), (idx, ty),
        ident!("x" idx) ":" ty
        , separator: ","
    )
    ")
")

The first argument is the iterator. The second argument is the same pattern you'd use in a for loop. The third argument is a space-separated list of things that can be converted to TokenStream (same as ts!). Optionally, you can add a fourth argument with separator:, to specify the strings that should separate each element.

ident! constructs an identifier. Its argument is a space-separated list of things that can be converted to strings. For example:

let id = ident!("hello" 2 "world"); // Ident("hello2world")

fuzzcheck_mutators_derive

The crate is split into four modules:

lib.rs is the starting point of the procedural macros. For example, for the make_mutator! macro, we have:

#[proc_macro]
pub fn make_mutator(item: proc_macro::TokenStream) -> proc_macro::TokenStream {
let (settings, parser) = MakeMutatorSettings::from(item.into());
derive_default_mutator_(parser, settings).into()
// Note: I know realise some of the names in this crate are terrible. 
// e.g. derive_default_mutator_ doesn't necessarily derive a DefaultMutator impl
// depending on the settings
}

This file also contains a big struct called Common. It looks like this:

struct Common {
fastrand_Rng: TokenStream,
Option: TokenStream,
Some: TokenStream,
// etc.
}

The idea behind it is to extract common token streams into one place to avoid repeating myself. I prefer when everything is written out as non-ambiguously as possible because we don't know the context in which the macros are run. But that can be cumbersome. So instead of writing:

fn foo() {
let stream = ts!("
    #[derive(std::clone::Clone)]
    struct X {
        x: std::option::Option<std::vec::Vec<bool>>
    }
");
}

we write:

fn foo(cm: &Common) {
let stream = ts!("
    #[derive(" cm.Clone ")]
    struct X {
        x: " cm.Option "<" cm.Vec "<bool>>
    }
");
}

single_variant.rs creates a SingleVariant type and writes its Mutator implementation. tuples.rs has a macro to define tuple mutators of arbitrary length and also makes mutators for structs. I don't expect any of those files to change.
Within enums.rs, you'll be interested in fn impl_default_mutator_for_enum. Its role is to define the inner mutator that is used to mutate enums and then pass it to crate::structs_and_enums::make_mutator_type_and_impl to wrap that mutator into a nicely named mutator with a good API. So for the following enum:

#[derive(DefaultMutator)]
enum X {
    Integer(#[field_mutator(U8Mutator) = { u8::default_mutator() }] u8)
    String(String, bool)
}

Then impl_default_mutator_for_enum does the following:

Look at every field and construct a FieldMutator for it. The FieldMutators for this enum are:

vec![
    vec![
        FieldMutator {
            i: 0,       // the index of the enum item
            j: Some(0), // the index of the field within the item
            field: StructField { .. } // the parsed struct field, here it corresponds to the unnamed field `u8`,
            kind: FieldMutatorKind::Prescribed( ~~U8Mutator~ as Ty~~ , Some(ts!("u8::default_mutator()")) )
            // the mutator for this field is prescribed through `field_mutator`. Its type is `U8Mutator` and it is initialised with `u8::default_mutator()`
        }
    ],
    vec![
        FieldMutator {
            i: 1,
            j: Some(0),
            field: StructField { .. }, // unnamed field `String`
            kind: FieldMutatorKind::Generic,
        },
        FieldMutator {
            i: 1,
            j: Some(1),
            field: StructField { .. }, // unnamed field `bool`
            kind: FieldMutatorKind::Generic,
        }
    ]
]

Note that to know whether a field is Generic or Prescribed, the attribute of the field needs to be manually parsed (see read_field_default_mutator_attribute), since decent_synquote_alternative doesn't parse the attribute itself (it is just an opaque TokenStream). You'll need to do something similar to parse the enum variant attribute that says the variant should not be constructed (I'll let you decide on the name 🙂).

Construct the inner mutator. For now, the inner mutator always construct something of the shape:
```
AlternationMutator<
XSingleVariant<
    Tuple1Mutator<U8Mutator>,
    Tuple2Mutator<M1_0, M1_1>
>
>
```
but we now need to know whether an item shouldn't be constructed and then replace TupleNMutator<..> with NeverMutator.

Tell how to construct the wrapper mutator. This is now done as:

pub fn new(mutator_Integer_0: U8Mutator, mutator_String_0: M1_0, mutator_String_1: M1_1) -> Self {
Self {
    mutator: AlternationMutator::new(vec![
        XSingleVariant::Integer(Tuple1Mutator::new(mutator_Integer_0)),
        XSingleVariant::String(Tuple2Mutator::new(mutator_String_0, mutator_String_1)),
    ])
}
}

but will need to be changed not to include the forbidden variants.

All this information is given through a CreateWrapperMutatorParams to crate::structs_and_enums::make_mutator_type_and_impl, which finally creates the type XMutator and implements DefaultMutator for X.

Ideally, nothing would be changed within structs_and_enums.rs, but I have just realised that we will have a problem when the forbidden variant contains a type parameter that (1) doesn't appear anywhere else, and (2) doesn't implement DefaultMutator. For example:

#[derive(DefaultMutator)]
enum SomeEnum<T> {
    A { x: bool, y: bool },
    B(u8),
    #[do_no_construct]
    C(T) 
}

Ideally we'd like to impl DefaultMutator for SomeEnum<T> regardless of what T is. But I have no way to know if T is used or not within a parsed type. The derive(DefaultMutator) implementation will write:

impl<T> DefaultMutator for SomeEnum<T> where T: DefaultMutator { .. }

There are three ways to solve this:

don't solve it! 😅
change the where clause such that it asks that the fields implement DefaultMutator instead of the generic type param and then skip the fields that are within a forbidden item. That is what I used to do before but I vaguely remember running into some problems. So we'd have, for example, where Vec<T>: DefaultMutator instead of T: DefaultMutator.
have some way to annotate what type parameter should be included in the where clause

Testing

To test the proc macro, you can write a test file in fuzzcheck/tests/expansions that uses both make_mutator! and derive(DefaultMutator) and then try to use the mutator API to see if it works. You can look at the other tests to see what it looks like. When you use make_mutator!, you can ask rust-analyzer and VSCode to expand the macro, which is super useful to know what's going wrong. Type cmd+shift+P with the cursor on make_mutator! an select Rust Analyzer: expand macro recursively. Then copy the result and paste it back into the test file. There may be some errors with lifetimes, such as &'amut instead of &'a mut, that you will need to fix manually. Also, features such as box_syntax may be used to expand vec!, and you'll need to either add #![feature(box_syntax)] or replace their uses manually with vec!. Things like panic! and unreachable! are also expanded into invalid Rust, so you'll need to fix those too. But in the end, it can all be done very quickly.

That's it, I think? I probably missed some stuff and maybe forgot to take some problems into consideration. The medium-hard part is in keeping track of which item is forbidden. The really-hard part is in properly handling DefaultMutator in all edge cases. But I think that can be ignored if it's too difficult.

Don't hesitate to reply here with any comment or question :)

loiclec commented 2 years ago

Closed via https://github.com/loiclec/fuzzcheck-rs/pull/24

Thanks again :)

loiclec / fuzzcheck-rs