fscheck / FsCheck

Random Testing for .NET
https://fscheck.github.io/FsCheck/
BSD 3-Clause "New" or "Revised" License
1.17k stars 156 forks source link

Question: how to create dependent properties? #628

Closed egil closed 1 year ago

egil commented 1 year ago

Hi all,

I want to write a test for a method that takes a non-empty collection of positive numbers double[], and a double targetSum, and finds the shortest combination of numbers in the collection, whose sum is >= targetSum * 0.95 and <= targetSum * 1.15.

My attempt looks like this:

public static class Generators
{
    public static Arbitrary<double[]> Doubles()
    {
        return Gen.NonEmptyListOf(Arb.Generate<double>()
            .Where(x => x >= 2 && x <= 20))
            .Select(x => x.ToArray()).ToArbitrary();
    }
}

public class Tests
{
    [Property(Arbitrary = new[] { typeof(Generators) })]
    public void FindBestSessionsTest(double[] values, double targetSum)
    {
        var result = FindBestCombination(values, targetSum);

        result.Should().NotBeEmpty();
        var sum = result.Sum();
        sum.Should().BeGreaterThanOrEqualTo(targetSum * 0.95)
            .And.BeLessThanOrEqualTo(targetSum * 1.15);
    }
}

I cannot figure out how to create a generator that will create a targetSum. The strategy I think should work is to pick one or more numbers from the values input array, sum them together, and then take the resulting exact targetSum and shift it up/down within the legal range targetSum * 0.95 and targetSum * 1.15. My idea is that FsCheck should try to find a targetSum that should be possible for FindBestCombination to find but cannot.

Any suggestions would be much appreciated by an FsCheck newbie!

kurtschelfthout commented 1 year ago

First of all, sorry for posting unchecked code - my visual studio seems to be stuck in some kind of update loop. :( But hopefully I can point you in the right direction.

The strategy I think should work is to pick one or more numbers from the values input array, sum them together, and then take the resulting exact targetSum and shift it up/down within the legal range targetSum 0.95 and targetSum 1.15

You have the right idea. There's two ways I can think of to implement it.

First, you can write a generator that generates both the array and the target sum - sorta like this:

public static class Generators
{
    public static Arbitrary<(double[], double)> Doubles()
    {
        var arrayGen = Gen.NonEmptyListOf(Arb.Generate<double>()
            .Where(x => x >= 2 && x <= 20))
            .Select(x => x.ToArray());
        // generate a double between 0.95 and 1.15. Sorry for not providing a Gen.FloatBetween or something.
        // I'd use Gen.Choose(95, 115) to generate an int, and then divide by 100.
        var factorGen = ...;
        // zip 'm up
        return Gen.Zip(arrayGen, factorGen, (values, factor) => (values, values.Sum() * factor));
    }
}

Second, you can just plonk that in the property itself:

public class Tests
{
    [Property(Arbitrary = new[] { typeof(Generators) })]  // assuming you add a Generators.Factor() so 0.95 <= factor <= 1.15
    public void FindBestSessionsTest(double[] values, double factor)
    {
        var targetSum = values.Sum() * factor;
        var result = FindBestCombination(values, targetSum);

        result.Should().NotBeEmpty();
        var sum = result.Sum(x => x.Power);
        totalPower.Should().BeGreaterThanOrEqualTo(targetSum * 0.95)
            .And.BeLessThanOrEqualTo(targetSum * 1.15);
    }
}

This is perhaps simpler, but some people don't like it because now they prefer to keep preconditions in the generators.

kurtschelfthout commented 1 year ago

Oh, one more thing. As you can see I used Zip here, so your generator doesn't really need "dependent properties" unless I missed something. But, FsCheck does support that kind of thing through SelectMany. A (rather useless) example is generating an int that is smaller than another randomly generated int:

var ex = Gen.Choose(10, 20).SelectMany(i => Gen.Choose(0, i));

This, you wouldn't be able to write using Zip (try it if not immediately clear...) so SelectMany is more powerful than Zip is more powerful than Select. (in lofty functional programming parlance resp. monad > applicative > functor).

For performance reasons, it's always better to choose the least powerful operator you can get away with.

egil commented 1 year ago

Thanks @kurtschelfthout. I think I wasn't clear enough.

In the generator, I would like to pick one or more values and sum them up to create a targetSum I know for certain is possible, so NOT values.Sum().

Is there a way for the a generator to chose a subset (at least one) from the values array, sum them up, apply the factor, and then pass that to the method under test?

ploeh commented 1 year ago

This question is really interesting on at least three different levels:

Writing that up is enough work that I'd be happy to do it, but probably use it as a springboard for a few blog posts instead. Could, you, perhaps, post a minimal reproducible example?

Right now, I don't know how FindBestCombination is implemented, and totalPower only appears once... there might be other issues as well...

ploeh commented 1 year ago

I'd also be interested to know if this is 'just' an exercise, or whether there's a 'real' use case behind this. To be clear, both are fine, but how you'd eventually want to approach the problem depends on the problem you're trying to solve - 'learning how FsCheck works' is an entirely different problem than 'real business problem'. Both are valid problems, but they have wildly different 'solutions'.

egil commented 1 year ago

Hi Mark, it is a real business problem, but I am also treating it as a learning opportunity to get familiar with FsCheck. I did attempt to simplify the problem in my original post but did forget to clean up a little of the code, hence the reference to "totalPower" (which is a double). I am not sure how much I am allowed to share, but I can share what FindBestCombination is supposed to do in more detail and leave the field/business out:

  1. Given a collection of (double Score, double Power) tuples and a double targetPower
  2. FindBestCombination must return the tuples from the collection with the highest possible combined Score whose combined Power passes the predicate targetPower * 0.95 <= sum of power <= targetPower * 1.15.

For example, given:

Trying all possible combinations naively would mean O(2^n) complexity, and when the possible tuple collection can have 1000s of tuples if not more, that's not an option. So I've implemented an optimistic version that is worst-case O(n^2), and it seems like a good chance to let FsCheck that implementation.

kurtschelfthout commented 1 year ago

Is there a way for the a generator to chose a subset (at least one) from the values array, sum them up, apply the factor, and then pass that to the method under test?

Then you definitely need SelectMany! Is this closer to what you mean?

public static class Generators
{
    public static Arbitrary<(double[], double)> Doubles()
    {
        var arrayGen = Gen.NonEmptyListOf(Arb.Generate<double>()
            .Where(x => x >= 2 && x <= 20))
            .Select(x => x.ToArray());
        // generate a double between 0.95 and 1.15. Sorry for not providing a Gen.FloatBetween or something.
        // I'd use Gen.Choose(95, 115) to generate an int, and then divide by 100.
        var factorGen = ...;
        return arrayGen.SelectMany(values => 
                 Gen.Zip(Gen.SubListOf(values).Where(v => v.Count > 0), factorGen, (subValues, factor) => 
                    (values, subValues.Sum() * factor)));
    }
}

And as for my second approach, you could do that too, using Prop.ForAll:

public class Tests
{
    [Property(Arbitrary = new[] { typeof(Generators) })] 
    public void FindBestSessionsTest(double[] values)
    {
       var targetSums = Gen.Zip(Gen.SubListOf(values).Where(v => v.Count > 0), factorGen, (subValues, factor) => 
                    subValues.Sum() * factor)).ToArbitrary();
        Prop.ForAll(targetSums, targetSum => 
           {
               var result = FindBestCombination(values, targetSum);

               result.Should().NotBeEmpty();
               var sum = result.Sum(x => x.Power);
               totalPower.Should().BeGreaterThanOrEqualTo(targetSum * 0.95)
                  .And.BeLessThanOrEqualTo(targetSum * 1.15);
          }
    }
}
egil commented 1 year ago

Thanks, @kurtschelfthout. What are the pros/cons of each of the solutions you suggest here? Are they essentially doing the same thing in different ways?

ploeh commented 1 year ago

[...] it is a real business problem

That does make it bit more difficult to discuss, since (I understand) you can't share all details.

As you're likely aware, test-driven development is famous for making one re-evaluate one's API design decisions. Property-based testing tends to amplify that tendency.

In this case, I'm wondering if there's a way to model the API so that it better captures the intent. This might also make it easier to test...

For example, I understand that the input collection is assumed to be non-empty, and (perhaps?) contain only positive tuples. Do the values fall in a particular range? How about the size of the collection? Can it be arbitrarily large? Can it be a singleton collection?

One question that immediately come to my mind is that you could have a collection with only a single element (a singleton collection) and then ask for a target number that's not satisfied by that single element. In that case, as I understand your problem statement, there's no solution.

If so, there are (theoretically) infinitely many such configurations.

Thus, I'm thinking that as a general-purpose API, a function like the one you've described seems practically useless, because your chance of getting an answer out of it, if you treat it as a black box, is essentially zero.

So I'm wondering if there are more to be said about that function than just that?

(Really, what I'm looking for are properties...)

kurtschelfthout commented 1 year ago

@Egil mostly a question of preference? Sometimes (as in the first iteration) one or the other turns out much simpler. I haven’t put much deep thought in it. I think @ploeh might have a much more considered opinion or alternative.

egil commented 1 year ago

@ploeh I'll try to be more specific:

  1. Score and Power are both positive numbers.
  2. Power is between 2 and 20 currently.
  3. There could in theory be zero tuples to choose from.
  4. Normally there will be 1000+ tuples.
  5. Tuples are sorted by Score in descending order.
  6. The input data (tuples) is changed at least every 30 seconds.
  7. The input data is also changed every time the FindBestCombination has been invoked. The selected tuples are removed from the set.
  8. The change range of the input data means that it can be hard to precompute a bunch of solutions.

In production, we may not always be able to find a combination. What I am attempting with the tests is to ensure that my implementation is such that if there is a possible combination, then it is found, ideally with the best possible score, if there are more than one combination.

My current solution is based on @kurtschelfthout suggestions. I did have to drop the factor offset, which didn't add value and resulted in false positives at times.

public static class Generators
{
    public static Arbitrary<((double Score, double Power)[] Data, (double Score, double Power)[] TargetPowers)> ValidSessions()
    {
        var sessionsGen = Gen.NonEmptyListOf(Arb.Generate<(double Score, double Power)>()
            .Where(x => x.Score >= 1 && x.Score <= 100 && x.Power >= 2 && x.Power <= 20))
            .Select(x => x.OrderByDescending(x => x.Score).ToArray());

        var factorGen = Gen.Choose(95, 115).Select(x => (double)x / 100);

        return sessionsGen.SelectMany(data=>
            Gen.SubListOf(data)
                .Where(v => v.Count > 0)
                .Select(targetPowers => (data, targetPowers.ToArray())))
            .ToArbitrary();
    }
}

public class Tests
{
    [Property(Arbitrary = new[] { typeof(Generators) })]
    public void FindBestSessionsTest(((double Score, double Power)[] Data, (double Score, double Power)[] TargetPowers) input)
    {
        var targetPower = input.TargetPowers.Sum(x => x.Power);
        var minPower = targetPower * 0.95;
        var maxPower = targetPower * 1.15;

        var result = Finder.FindBestSessions(input.Data, minPower, maxPower);

        result.Should().NotBeEmpty();
        var totalPower = result.Sum(x => x.Power);
        totalPower.Should().BeGreaterThanOrEqualTo(minPower).And.BeLessThanOrEqualTo(maxPower);
    }
}
ploeh commented 1 year ago

I don't think there's anything wrong with the specific solution you've arrived at. I just find problems like these interesting because they seem to expose that there are more layers beneath the first one.

The approach that you're currently taking is the first layer, and, again, there's nothing wrong with it. Frequently, I stop there. Essentially, what you're doing is that you've identified an equivalence class and now the problem has become: How to draw inputs from that equivalence class as faithfully as possible.

Sometimes, a next level is available. This is where you identify properties that hold for the entire domain of the function. This relieves you from the often awkward arrangement of the equivalence class, so it makes the tests simpler and also tends to make them more robust to changes. My blog has an example: Property-based testing is not the same as partition testing.

A third layer then sometimes follows where you begin to realise that (part of) what remains in terms of expressing the domain of the function can be expressed as types rather than FsCheck combinators. (Trivial example: Use a (hypothetical) NonEmptyCollection<T> instead of an array.) This once again makes the tests simpler, and also the SUT safer (à la poka-yoke).

Based on what you can divulge, I can't see what opportunities exist, so we can just leave it at that.

egil commented 1 year ago

Thanks @ploeh, appreciate your insights.