bchavez / Bogus

A simple fake data generator for C#, F#, and VB.NET. Based on and ported from the famed faker.js.

Use existing value to "seed" another. #456

Open MelGrubb opened 1 year ago

MelGrubb commented 1 year ago

Please describe why you are requesting a feature

Apologies if this exists already, but I'm looking for a way to use Bogus to anonymize data. I would like to anonymize production data for use in a QA environment, but it would be nice if the data came out the same way each time. In other words, I would like to use "Bob Smith" as input and have it come out as "Fred Jones" each time. This is only an example; I don't literally mean those specific names. It would be helpful for QA if the anonymized data were stable, so that when we refresh the data, the example person they were looking at last week still has the same name.

tl;dr - I would like a way to pass a "seed" value to individual rules to ensure that the same random value is generated each time, based on an input value so that, for example, using "Bob" as the seed value always results in "Fred" being generated.

Please provide a code example of what you are trying to achieve

Something like this:

var testUsers = new Faker<User>()
    .RuleFor(u => u.FirstName, (f, u) => f.Name.FirstName(u.Gender, seed: "Some string value"));  // Suggested syntax, not working

Ideally, "Some string value" would be automatically derived from an input value based on the real-world data, such as the existing record's FirstName property.

Please answer any or all of the questions below

If the feature request is approved, would you be willing to submit a PR?

No. I wish I had the time, but I don't. Maybe if I get to retire from the day job someday.

Crossbow78 commented 1 year ago

Other use-cases could be:

Generally speaking, how would we teach Bogus about (basic) dependencies between properties in our data models?

I could imagine a syntax like this:

.RuleFor(x => x.StartDate, f => f.Date.Past())
.RuleFor(x => x.EndDate, f => f.Date.Future(relativeToProperty: x => x.StartDate).OrNull(f))  // Suggested syntax, not working

or:

.RuleFor(x => x.IsCancelled, f => f.Random.Bool())
.RuleFor(x => x.CancellationReason, f => f.Random.Words().OrNullWhen(x => !x.IsCancelled, f))  // Suggested syntax, not working
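
For comparison, the `(f, u) => ...` overload of `RuleFor` that appears in the original request already hands the rule the instance being built, so basic dependencies like these may be expressible today. A minimal sketch, assuming rules are applied in the order they are declared (so `StartDate` and `IsCancelled` are set before the rules that read them); the `Booking` class and `BookingFakes` helper are hypothetical names used only for illustration, and `EndDate` is kept non-nullable here for simplicity:

```csharp
using System;
using Bogus;

// Hypothetical model, for illustration only.
public class Booking
{
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
    public bool IsCancelled { get; set; }
    public string CancellationReason { get; set; }
}

public static class BookingFakes
{
    public static Faker<Booking> Create() => new Faker<Booking>()
        .RuleFor(x => x.StartDate, f => f.Date.Past())
        // The second lambda parameter is the Booking under construction,
        // so EndDate can be generated relative to the StartDate set above.
        .RuleFor(x => x.EndDate, (f, x) => f.Date.Future(refDate: x.StartDate))
        .RuleFor(x => x.IsCancelled, f => f.Random.Bool())
        // Only produce a reason when the booking is cancelled.
        .RuleFor(x => x.CancellationReason,
                 (f, x) => x.IsCancelled ? f.Random.Words() : null);
}
```

Calling `BookingFakes.Create().Generate()` should then yield an end date that is not before the start date, and a cancellation reason only on cancelled bookings. This does not give the declarative `relativeToProperty` / `OrNullWhen` syntax suggested above; it only shows the same intent with lambdas.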

Pigna commented 1 year ago

I am already using Bogus to do something like what you are doing.

I added the following code and used the ID of the data from the database to fill the seed.

var faker = new Faker
{
    Random = new Randomizer(seed)
};

MelGrubb commented 1 year ago

That was addressed in my second bullet point. Seeding the randomizer based on the Id, or a hash of the Id, works great until you add new properties to the object. To ensure that each output property is stable, you'd have to re-seed the randomizer for each and every individual field, which would be very cumbersome. I'm specifically looking for per-field seeding based on an input value, so that the output is random but stable for each input value.
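
As a possible workaround until something like this exists, the per-field re-seeding can be wrapped in a small helper so it is less cumbersome. A minimal sketch, not a Bogus feature: `StableFake`, `SeedFor`, and `FakerFor` are hypothetical names, and the seed is taken from a SHA-256 hash of the field name plus the input value, because `string.GetHashCode()` is randomized per process on .NET Core and would not be stable between runs. It reuses the `new Faker { Random = new Randomizer(seed) }` pattern shown earlier in the thread.

```csharp
using System;
using System.Buffers.Binary;
using System.Security.Cryptography;
using System.Text;
using Bogus;

// Hypothetical helper, not part of Bogus.
public static class StableFake
{
    // Derives a stable 32-bit seed from the field name and the original value.
    // Including the field name keeps each property's output independent of
    // the others, so adding new properties does not shift existing ones.
    public static int SeedFor(string fieldName, string input)
    {
        using var sha = SHA256.Create();
        byte[] hash = sha.ComputeHash(
            Encoding.UTF8.GetBytes(fieldName + "\u001F" + input));
        // Fixed endianness so the seed is the same on every platform.
        return BinaryPrimitives.ReadInt32LittleEndian(hash);
    }

    // Returns a Faker whose randomizer is seeded for this field and input only.
    public static Faker FakerFor(string fieldName, string input)
        => new Faker { Random = new Randomizer(SeedFor(fieldName, input)) };
}
```

With this in place, `StableFake.FakerFor("FirstName", "Bob").Name.FirstName()` should return the same generated name on every run, and a different field name or input value gives an unrelated seed. The output is presumably only stable as long as the Bogus version and locale data stay the same, since a change to the underlying name lists would change which value a given seed selects.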