Use `Index` for dependent ranges?

sunshowers commented 1 year ago

Hi there --

Thanks for writing test-strategy! As a long-time proptest user I've been using it for a bit and really love it.

One thing I noticed is that currently, dependent strategies compile down to prop_flat_map. As you likely know, prop_flat_map (monadic bind) is a bit weird because it makes shrinking quite chaotic.

Proptest does provide an important special case for dependent distributions that are generated uniformly at random: sample::Index. (In my experience this solves most of the use cases which folks use flat_map for). I'm wondering if it would make sense to optimize some cases into indexes.

For example, for the input:

#[derive(Debug, Arbitrary)]
#[arbitrary(dump)]
struct Args {
    main: u64,
    #[strategy(0..=#main))]
    dependent: u64,
}

The code that's generated uses a prop_flat_map. Instead, however, the implementation could be an automated version of:

impl Arbitrary for Args {
    type Parameters = ();
    type Strategy = BoxedStrategy<Self>;
    fn arbitrary_with(
        args: <Self as proptest::arbitrary::Arbitrary>::Parameters,
    ) -> Self::Strategy {
        (any::<u64>(), any::<proptest::sample::Index>())
            .prop_map(|main, dependent_index| {
                let dependent = dependent_index.index(main + 1);
                Self { main, dependent }
            })
            .boxed()
    }
}

I'm not sure how error-prone it might be to analyze existing expressions, so maybe we could potentially introduce a special annotation for it, e.g. #[strategy(@dependent_range(0, #main + 1))].

Would love to hear your thoughts!

frozenlib commented 1 year ago

Thanks for the suggestion.

If I wanted to use sample::Index, I would intuitively write something like the following, but the current implementation doesn't correctly determine the dependency.

#[derive(Debug, Arbitrary)]
struct Args {
    main: usize,
    #[strategy(any::<proptest::sample::Index>().prop_map(move |i| i.index(#main)))]
    dependent: usize,
}

In the case of prop_filter, using #[filter] instead of prop_filter allows the macro to correctly understand the dependencies.

In the same way, I think it would be easier to use sample::Index by allowing #[map] to be used instead of prop_map as shown below, so that the macro can understand the dependencies.

#[derive(Debug, Arbitrary)]
struct Args {
    main: usize,
    #[strategy(any::<proptest::sample::Index>())]
    #[map(move |i: proptest::sample::Index| i.index(#main))]
    dependent: usize,
}

#[derive(Debug, Arbitrary)]
struct Args {
    main: usize,
    #[map(move |i: proptest::sample::Index| i.index(#main))]
    dependent: usize,
}

#[derive(Debug, Arbitrary)]
struct Args {
    main: usize,
    #[map(dependent_range(0, #main + 1))]
    dependent: usize,
}

fn dependent_range(start: usize, end: usize) -> impl Fn(proptest::sample::Index) -> usize {
    move |i| start + i.index(end - start)
}

frozenlib commented 1 year ago

I made #[map(...)] available. (ba368b672bb66c19348c2d99772878b111336296)

sunshowers commented 1 year ago

Thank you, I love that approach!

frozenlib / test-strategy

Use `Index` for dependent ranges? #7