AutoFakerContext fixed after first Generate call

jacob7395 commented 2 months ago

I've discovered an edge case where rule sets are being ignored or overridden when the rulesets change in subsequent generation calls for the same AutoFixture instance, resulting in unexpected outputs. The following example test cases replicate the issue:

public sealed class ExampleClass
{
    public string? Value { get; init; }
    public string AlwaysSet { get; init; }
}

[Fact]
public void Generate_RuleSet_Should_Generate_With_Custom_Faker()
{
    var faker = new ExampleClassFaker();

    var noRuleSets = faker.Generate();
    var justSetNull = faker.Generate("setnull");
    var withSetNull = faker.Generate("setnull,default");

    noRuleSets.AlwaysSet.Should().NotBeEmpty();
    noRuleSets.Value.Should().NotBeEmpty();

    justSetNull.AlwaysSet.Should().BeNull(); // <-- This is why I use both the default and setnull rulesets
    justSetNull.Value.Should().BeNull();

    withSetNull.AlwaysSet.Should().NotBeEmpty();
    withSetNull.Value.Should().BeNull();
}

public class StringOverride : AutoFakerOverride<string>
{
    public override void Generate(AutoFakerOverrideContext context)
    {
        context.Instance = BuildStringWithPrefix(context.GenerateName);
    }

    public static string BuildStringWithPrefix(string prefix) =>
        $"{prefix}-{Guid.NewGuid()}";
}

public sealed class ExampleClassFaker : AutoFaker<ExampleClass>
{
    public ExampleClassFaker()
    {
        Config.Overrides ??= new List<AutoFakerGeneratorOverride>();
        Config.Overrides.Add(new StringOverride());

        RuleSet("setnull", set => set.RuleFor(property => property.Value, () => null));
    }
}

This issue still occurs even when using a non-custom class, as shown below:

[Fact]
public void Generate_RuleSet_Should_Generate()
{
    var faker = new AutoFaker<ExampleClass>();

    faker.Config.Overrides ??= new List<AutoFakerGeneratorOverride>();
    faker.Config.Overrides.Add(new StringOverride());

    faker.RuleSet("setnull", set => set.RuleFor(property => property.Value, () => null));

    var noRuleSets = faker.Generate();
    var setNull = faker.Generate("setnull");
    var setNullAndDefault = faker.Generate("setnull,default");

    noRuleSets.AlwaysSet.Should().NotBeEmpty();
    noRuleSets.Value.Should().NotBeEmpty();

    setNull.AlwaysSet.Should().BeNull(); // <-- This is why I use both the default and setnull rulesets
    setNull.Value.Should().BeNull();

    setNullAndDefault.AlwaysSet.Should().NotBeEmpty();
    setNullAndDefault.Value.Should().BeNull();
}

Problem Description

The issue when Generate() is called with multiple times with different rulesets. After investigating the code, I identified that the problem lies in how the context is initialized and used to bind callbacks for the FinalizeActions and CreateActions. The issue arises because these actions are instantiated during the initial Generate call and reference the context created at that time.

When a subsequent Generate call is made with a different ruleset, the FinalizeActions and CreateActions are not updated, so they still reference the now outdated context. In this example, it results in the variable setNullAndDefault.Value being overwritten by my StringOverride. Although this is a specific issue, it could also lead to the FinalizeActions and CreateActions for specific rulesets being ignored.

In particular, the method AutoFaker{T}.PrepareFinish.GetRuleSetsMemberNames fails to return the Value member as it's not part of the default ruleset. This results in the property being set even though my ruleset is supposed to handle it.

Possible Solutions

I see three possible solutions:

Don't skip the initialization in PrepareFinish even if it's already initialized. The main challenge here is handling the already present FinalizeActions. However, since there can only be one FinishWith per ruleset, this approach might work. The key is ensuring that the pre-existing FinishWith is properly captured before being overwritten.
Store the context as a variable in the class, and then overwrite it when Generate() is called. This approach would allow FinishWith to retrieve the correct context.

The major issue with both solutions is that they are not thread-safe. If thread safety is required, a more complex approach might be necessary. One idea is to create a generation GUID and store this on the context. This GUID could then be used in the callbacks for FinishWith and CreateActions. By storing the context associated with that GUID in a thread-safe collection (or by locking the collection), we can ensure the correct context is used. However, I’m unsure how to pass the generation GUID to the callbacks, as I don't fully understand the underlying faker library.

It might be necessary to resort to either option 1 or 2 and potentially lock the entire Generate method. While I don't favor that idea, it could be an option in AutoFakerConfig to disable locking, with documentation explaining that this issue may occur in that case.

I tested the second option, and my tests passed. All other tests in the project also still pass, but this might be because we don't have a test that checks parallel calls where the context differs for each call.

I'm unsure what the best option is to fix this, or if it's even worth fixing versus just documenting the issue. However, it does behave differently from the base Bogus library, so perhaps it should be addressed. I feel like we are fighting with a limitation of Bogues and may need to put a PR to pass the current rulesets to the callbacks for FinishWith and CreateActions, then we can just generate the context when that request is made. However that is a non trivial potential breaking change so that probably wont happen. Could do something with the faker context maybe, tracking the rulesets used for a specific context then use that in the callbacks to identify the correct AutoFakerContext...

@soenneker got any ideas for fixing this?

PS: Thanks for merging my previous PR! Now I can remove my workaround in my own project.

jacob7395 commented 2 months ago

Don't like the idea but could carte a AutoFakerHub that inherits from Faker, then overwrite the FakerHub property of Faker<T> we could then add some extra context to this class that tracks the generated instance somehow mapping that to a AutoFakerContext. I still think there would be issues with concurrency, and just don't know what ruleset to use just from the Faker class passed into the call-backs even if we have overwritten it. Maybe the GlobalUniqueIndex could be used but not sure yet.

Feels like the write direction but not there yet!

soenneker commented 1 month ago

Thanks for putting this together.

It doesn't surprise me there's stuff like this happening..

One goal of this lib is to be fast, but I also do want to solve this because I see no reason why this shouldn't be valid usage. I also want to maintain the ability for tests to run concurrently so thread safety is going to be important.

I have some initial thoughts but none worth sharing at this time. Let me get back to you.

jacob7395 commented 1 month ago

@soenneker hey, I would like to see this issue and #263. This isn't my highest priority but would be able to help with design/implementation if you want to bounce ideas I will try to put some code together. Or just be someone you can talk though some design with, it helps me writing/talking though stuff sometimes :)

Either way let know how I can help.

soenneker / soenneker.utils.autobogus

AutoFakerContext fixed after first Generate call #261

Problem Description

Possible Solutions