Closed sander1095 closed 1 year ago
I took some time today to make this issue easier to reproduce by not requiring EF Core.
Run this code and you will see that the first item is generated equally by both faker instances, but all other instances will generate difference results:
https://dotnetfiddle.net/2vfSmr
If that link doesn't work, run this code with .NET 6 and Bogus 34.0.2
// See https://aka.ms/new-console-template for more information
using Bogus;
using System;
using System.Linq;
var itemsToGenerate = 1000;
var productFaker = new Faker<Product>()
.StrictMode(false)
.RuleFor(x => x.Name, x => x.Commerce.ProductName())
.RuleFor(x => x.CreationDate, x => x.Date.FutureOffset(refDate: new DateTimeOffset(2023, 1, 1, 1, 1, 0, TimeSpan.Zero)));
var secondProductFaker = new Faker<Product>()
.StrictMode(false)
.RuleFor(x => x.Name, x => x.Commerce.ProductName())
.RuleFor(x => x.CreationDate, x => x.Date.FutureOffset(refDate: new DateTimeOffset(2023, 1, 1, 1, 1, 0, TimeSpan.Zero)))
.RuleFor(x => x.Description, x => x.Commerce.ProductDescription());
// Expectation:
// Both fakers should generate the same name and creationDate.
// Description is only added at the end of the 2nd faker.
// According to according to Bogus's README.md this should not change existing data when using a local seed.
var firstRun = /* */productFaker.UseSeed(1338).Generate(itemsToGenerate).Select(x => new { x.Name, x.CreationDate }).ToList();
var secondRun = secondProductFaker.UseSeed(1338).Generate(itemsToGenerate).Select(x => new { x.Name, x.CreationDate }).ToList();
// Reality:
// The first item is the same, all other ones are different
var successes = 0;
var failures = 0;
for (int i = 0; i < itemsToGenerate; i++)
{
var firstRunItem = firstRun[i];
var secondRunItem = secondRun[i];
if (firstRunItem.ToString() == secondRunItem.ToString())
{
successes++;
}
else
{
failures++;
}
}
Console.WriteLine($"Successes: {successes}");
Console.WriteLine($"Failures: {failures}");
Console.ReadKey();
class Product
{
public string Name { get; set; } = null!;
public DateTimeOffset CreationDate { get; set; }
public string Description { get; set; } = null!;
public override string ToString() => $"{Name}, {CreationDate}, {Description}";
}
Funnily enough, if you were to change secondProductFaker
like this:
var secondProductFaker = productFaker.RuleFor(x => x.Description, x => x.Commerce.ProductDescription());
We end up with 1000 successes.
This is sadly not good enough for me, because as you can see in my bug report, I use this with EF Core between runs, so I can't re-use an existing Faker instance.
Hi @sander1095,
Thank you for your post. The main problem you're running into is that you're only applying the .UseSeed(1338)
once for the entire generation.
The way you should think about it is "setting the seed value for the entire row". To illustrate:
And in other words, consider the following code in LinqPad 7:
void Main()
{
var itemsToGenerate = 10;
var productFaker = new Faker<Product>()
.StrictMode(false)
.RuleFor(x => x.Name, x => x.Commerce.ProductName())
.RuleFor(x => x.CreationDate, x => x.Date.FutureOffset(refDate: new DateTimeOffset(2023, 1, 1, 1, 1, 0, TimeSpan.Zero)));
var secondProductFaker = new Faker<Product>()
.StrictMode(false)
.RuleFor(x => x.Name, x => x.Commerce.ProductName())
.RuleFor(x => x.CreationDate, x => x.Date.FutureOffset(refDate: new DateTimeOffset(2023, 1, 1, 1, 1, 0, TimeSpan.Zero)))
.RuleFor(x => x.Description, x => x.Commerce.ProductDescription());
// Expectation:
// Both fakers should generate the same name and creationDate.
// Description is only added at the end of the 2nd faker.
T SeedRow<T>(Faker<T> faker, int rowId) where T : class
{
var recordRow = faker.UseSeed(rowId).Generate();
return recordRow;
};
var firstRun = Enumerable.Range(1, itemsToGenerate)
.Select(i => SeedRow(productFaker, i))
.Select(x => new {x.Name, x.CreationDate})
.ToList();
var secondRun = Enumerable.Range(1, itemsToGenerate)
.Select(i => SeedRow(secondProductFaker, i))
.Select(x => new { x.Name, x.CreationDate })
.ToList();
// Reality:
// The first run is the same as the second run
var successes = 0;
var failures = 0;
for (int i = 0; i < itemsToGenerate; i++)
{
var firstRunItem = firstRun[i];
var secondRunItem = secondRun[i];
if (firstRunItem.ToString() == secondRunItem.ToString())
{
successes++;
}
else
{
failures++;
Util.Dif(firstRunItem, secondRunItem).Dump();
}
}
Console.WriteLine($"Successes: {successes}");
Console.WriteLine($"Failures: {failures}");
}
class Product
{
public string Name { get; set; } = null!;
public DateTimeOffset CreationDate { get; set; }
public string Description { get; set; } = null!;
public override string ToString() => $"{Name}, {CreationDate}, {Description}";
}
Also, thank you for creating a simplified example. That always helps a lot.
Hope this helps. I'm going to close this issue now, but feel free to carry on the conversation if you have any more questions or feedback and I'll try to help!
Thanks, Brian Chavez
Version Information
What locale are you using with Bogus?
en
What is the expected behavior?
The new
Description
property should get new values and existing properties should keep the same values when generating a list of fake data. I expect this behaviour when I am following the Readme about using local seed data and determinism to have existing properties keep the same data while generating new data for new properties:What is the actual behavior?
The data gets re-generated with values for the new
Description
property. The first item keeps the same values for itsName
andCreationDate
properties (which is good and expected 👍) but the other generated fake data generates new data for theName
andCreationDate
properties, even though I follow each step of the Readme precisely. See the screenshot below for a more visualized example of what I mean.Please provide a stack trace.
There is none because there isn't an exception.
Any possible solutions?
None that I can think of.
How do you reproduce the issue?
Check out my specific branch here: https://github.com/sander1095/bogus-efcore/tree/bogus-determinism-bug-report
In commit
f49fa33502cf11a33c81b97cc354ebea4705e4d2
I added the seed data migration which generated lots of data for myProduct
class and other classes.In the next commit
6ad0b8c39072118d56bf2020ecb6a9aa29972be5
I added theDescription
column to myProduct
class, EF Core configuration and the Product Faker configuration as the final line:https://github.com/sander1095/bogus-efcore/blob/bogus-determinism-bug-report/BogusContext.cs#LL37C8-L37C8
In the final commit
8c4abf80398cbee807fc52b28b58f15a72b3b76b
I simply randotnet ef migrations add AddDescriptionColumn
to update existing data, expecting only the newDescription
property to be updated with new values.But as you can see, it doesn't only insert new values for the
Description
property, it also updates the existingName
andCreationDate
properties, but only for the 2nd until the last item, which is quite odd. The first item is done correctly!Reproducing locally
To reproduce the results of the migration locally, do the following:
appsettings.json
works for you. Currently it's configured for SQL Server, but if you don't (want) to use SQL server, you can easily switch to sqlite or other providers. The Bogus EF Core example uses sqlite, so you can install those nuget packages and change the connectionstring inappsettings.json
to whatever you like.git checkout f49fa33502cf11a33c81b97cc354ebea4705e4d2
dotnet ef migrations add AddDescriptionColumn
in the root of the project. Or you can useAdd-Migration AddDescriptionColumn
in the Package Manager console in Visual Studio.dotnet ef
if you haven't done so usingdotnet tool install --global dotnet-ef
Do you have a unit test that can demonstrate the bug?
No, the code, links and images above should be sufficient to demonstrate
Can you identify the location in Bogus' source code where the problem exists?
No.
If the bug is confirmed, would you be willing to submit a PR?
Yes