AnthonyLloyd / CsCheck

Random testing library for C#
Apache License 2.0
162 stars 4 forks source link

Unbiased choice with nested OneOf? #20

Closed benjamin-hodgson closed 2 months ago

benjamin-hodgson commented 3 months ago

It appears that each individual argument to Gen.OneOf is equally likely, even when the argument itself is a OneOf generator.

So in the JSON example, the odds of getting a given structure out of genJsonNode are:

Is there a clean way to "flatten" the nested OneOf generators? I suppose I could store the individual components in a list and do it manually,

static readonly Gen<string> genString = Gen.String[Gen.Char.AlphaNumeric, 2, 5];
static readonly Gen<JsonNode>[] jsonValueGens = new[]
{
    Gen.Bool.Select(x => JsonValue.Create(x)),
    Gen.Byte.Select(x => JsonValue.Create(x)),
    Gen.Char.AlphaNumeric.Select(x => JsonValue.Create(x)),
    // ...
};
static readonly Gen<JsonNode> genJsonNode = Gen.Recursive<JsonNode>((depth, genJsonNode) =>
{
    if (depth == 5)
    {
        return Gen.OneOf<JsonNode>(jsonValueGens);
    }
    var genJsonObject = Gen.Dictionary(genString, genJsonNode.Null())[0, 5].Select(d => new JsonObject(d));
    var genJsonArray = genJsonNode.Null().Array[0, 5].Select(i => new JsonArray(i));
    return Gen.OneOf(jsonValueGens.Concat(new[] { genJsonObject, genJsonArray }).ToArray());
});

but I think that's a little ugly.

Here's a failing test:

[Fact]
public void GenOneOf()
{
    var gen = Gen.OneOf(Gen.Const(1), Gen.OneOfConst(2, 3));

    var count1 = 0;
    var count2 = 0;
    var count3 = 0;
    for (var i = 0; i < 1000; i++)
    {
        switch (gen.Single())
        {
            case 1:
                count1++;
                break;
            case 2:
                count2++;
                break;
            case 3:
                count3++;
                break;
        }
    }

    // We actually get ~500 1s and 250 each of 2s and 3s
    Assert.InRange(count1, 325, 350);
    Assert.InRange(count2, 325, 350);
    Assert.InRange(count3, 325, 350);
}
AnthonyLloyd commented 3 months ago

Yes I think the best way is to give OneOf the full list if you want the values as often. You could also use Gen.Frequency to specify it explicitly (1, 1, jsonValueGens.Length).