nickdodd79 / AutoBogus

A C# library complementing the Bogus generator by adding auto creation and population capabilities.
MIT License
431 stars 50 forks source link

DateTimeOffset population does not respect seed #46

Open flakey-bit opened 3 years ago

flakey-bit commented 3 years ago

Population of DateTimeOffset properties does not respect the seed.

Failing test case:

[Fact]
public async Task DateTimeOffsetShouldBeIdenticalWhenSeedUsed()
{
    int seed = 1;

    var faker1 = new AutoFaker<MyEntity>().UseSeed(seed);
    var entity1 = faker1.Generate();

    await Task.Delay(TimeSpan.FromSeconds(5));

    var faker2 = new AutoFaker<MyEntity>().UseSeed(seed);
    var entity2 = faker2.Generate();

    Assert.Equal(entity2.Name, entity1.Name); // Passes
    Assert.Equal(entity2.DeprecationDate, entity1.DeprecationDate); // Fails
}
Xunit.Sdk.EqualException
Assert.Equal() Failure
Expected: 2020-09-23T16:53:07.5638001+12:00
Actual:   2020-09-23T16:53:02.5600874+12:00
public class MyEntity
{
    public DateTimeOffset DeprecationDate { get; set; }
    public string Name { get; set; }
}
nickdodd79 commented 3 years ago

Hey @flakey-bit

Under the hood, the auto generate handlers create a new instances of the Bogus Faker, which is why the seed is being lost. I will look at hooking the UseSeed on the AutoFaker up so it propagates to the underling faker instance. In the meantime, a UseFakerHub configuration handler was added a few months back that allows you to create and provide your own faker instance. This will allow the seed you set to be honoured.

Nick.

nickdodd79 commented 3 years ago

Hey @flakey-bit

I had a further look into this and all seeding is proxied through to the underlying Bogus.Faker instance. If a seed is not used then all the dates generated are random. However with a seed each faker instance generates a date based on the DateTime.Now, so there are some slight variations (milliseconds) but still based on the provided seed.

I couldn't see any changes that were needed for AutoBogus so may be it is something to raise with the Bogus project.

Nick.

flakey-bit commented 3 years ago

Hi @nickdodd79,

Thanks for getting back to me. I've tested it using Bogus.Faker directly and it works just fine I have been able to reproduce the problem:

public async Task DateTimeOffsetShouldBeIdenticalWhenSeedUsed()
{
    int seed = 1;

    var faker1 = new Faker<MyEntity>().UseSeed(seed);
        faker1.RuleFor(e => e.DeprecationDate, (f, e) => f.Date.Future());
    var entity1 = faker1.Generate();

    await Task.Delay(TimeSpan.FromSeconds(5));

    var faker2 = new Faker<MyEntity>().UseSeed(seed);
        faker2.RuleFor(e => e.DeprecationDate, (f, e) => f.Date.Future());
    var entity2 = faker2.Generate();

    Assert.Equal(entity2.Name, entity1.Name); // Passes
    Console.Out.WriteLine(entity2.DeprecationDate.ToUnixTimeMilliseconds());
    Console.Out.WriteLine(entity2.DeprecationDate);
    Assert.Equal(entity2.DeprecationDate, entity1.DeprecationDate); // Fails
}

See these notes from Bogus:

Bogus can generate deterministic dates and times. However, generating deterministic dates and times requires the following:

  1. Setting up a local or global seed value.
  2. Setting up a global anchor source of time in Bogus.DataSets.Date.SystemClock.

Anyway, I found that adding the line

    Bogus.DataSets.Date.SystemClock = () => DateTime.Parse("8/8/2019 2:00 PM");

to the test makes it pass. I guess my question is, should this API be wrapped (and exposed) by AutoBogus?

nickdodd79 commented 3 years ago

Hi @flakey-bit

Thanks for the update. I can see that switching AutoFaker for Faker does make the tests pass, which is certainly interesting. It is also good to see that you have managed to replicate the issue with Faker instances.

In terms of setting up the anchor time, one consideration is that it is a static, so we can't set that on a 'per generate' basis. I can certainly add a WithSystemClock config handler to AutoFaker.Configure. It would then basically wrap the above SystemClock setting. Would that help with your use case?

Nick.

flakey-bit commented 3 years ago

Thanks for the update. I can see that switching AutoFaker for Faker does make the tests pass, which is certainly interesting

That's because (without the rule), Bogus.Faker will leave the values as DateTime.MinValue - which was why I updated the second test case to include the line (for each faker)

faker1.RuleFor(e => e.DeprecationDate, (f, e) => f.Date.Future());

I wonder if it's worth bringing this issue up w/ the maintainer of Bogus.Faker and seeing if they'd consider adding something to help deal with it better. Feels like it should be settable on the faker instance itself (falling back to the static version otherwise).

Will leave it with you to do what you feel is appropriate. Feel free to close this issue - thanks for taking the time to investigate.

nickdodd79 commented 3 years ago

Thanks @flakey-bit.

@bchavez do you have any input into this. Could we add SystemClock configuration at a FakerHub level so we can isolate and control baseline date generation?

bchavez commented 3 years ago

Hey @nickdodd79,

Sorry for the late reply! Been super busy with the day job stuff.

I'm not totally sure yet; I think it might take some re-work. I think it comes down to:


Currently, we can anchor the date/time generations in two ways:

By convention, if refDate is specified, SystemClock is ignored.

image

https://github.com/bchavez/Bogus/blob/7caf7e0db504ac4a3796587f67a4e3b78c416c7c/Source/Bogus/DataSets/Date.cs#L127-L129

I think there's a special case where new Person() only uses the SystemClock.

In summary, currently, the order of precedence for getting an anchor in time is,

Essentially, (I think) what we're proposing is changing the precedence order to:


Overall, I would say this is probably a medium-lift depending on how it goes. Seed propagation throughout Faker<T> -> FakerHub -> DataSets is kind of a complex process to get right.

Backwards compatibility might be okay.

Where possible, I'd probably also take the opportunity to re-work .UseSeed(int val) to include an optional parameter to anchor time too. ie: .UseSeed(int val, DateTime? anchor = null) because semantically, when you set any seed, you'd probably also most likely want to set an anchor in time for deterministic date/time behavior too. This would at least surface the "time anchor" issue closer to where the developer would have expected it.


Let me know your thoughts.

Thanks, Brian

flakey-bit commented 3 years ago

Where possible, I'd probably also take the opportunity to re-work .UseSeed(int val) to include an optional parameter to anchor time too. ie: .UseSeed(int val, DateTime? anchor = null) because semantically, when you set any seed, you'd probably also most likely want to set an anchor in time for deterministic date/time behavior too. This would at least surface the "time anchor" issue closer to where the developer would have expected it.

💯