qubitz closed this issue 2 years ago
Bogus does not guarantee uniqueness. You'll have to decide what uniqueness means for you and your application based on your specific needs. There are fundamental mathematical limits and characteristics of pseudo-random number generators that make some repetition almost unavoidable. You can find more information here: https://github.com/bchavez/Bogus/issues/251#issuecomment-526354033
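To make the mathematical limit concrete, here is a rough birthday-problem estimate. It is only a sketch under stated assumptions: it treats each word as an independent, uniform draw from a pool of N words, which may not match how Bogus picks words internally, and the pool size N = 1000 is a hypothetical figure, not the actual size of the Bogus word list.

```latex
% k independent, uniform draws from a pool of N distinct words (assumed model)
P(\text{at least one duplicate})
  = 1 - \prod_{i=0}^{k-1}\left(1 - \frac{i}{N}\right)
  \approx 1 - \exp\!\left(-\frac{k(k-1)}{2N}\right)

% Illustration with k = 15 and a hypothetical N = 1000:
% k(k-1)/2 = 105, so P \approx 1 - e^{-0.105} \approx 0.10
```

Under these assumptions, about one seed in ten would show a duplicate within the first 15 words, so hitting a few such seeds by chance is expected rather than a sign of a bug.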
Your only real practical solution in O(n) runtime and space is:
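// LINQPad script: assumes the Bogus and Bogus.DataSets namespaces are imported.
void Main()
{
    // Fixed seed so the run is repeatable.
    var f = new Faker { Random = new Randomizer(1177614182) };
    Enumerable.Range(0, 7).Select(_ => f.Lorem.GetUniqueWord()).Dump();
}

public static class ExtensionsForBogus
{
    // Shared by all callers; not thread-safe as written.
    private static ulong UniqueWordCounter = 0;

    // Appends a running counter to each word so repeated words still yield distinct strings.
    public static string GetUniqueWord(this Lorem dataset)
    {
        return $"{dataset.Word()}{UniqueWordCounter++}";
    }
}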
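If the counter might be shared across threads, the same counter-suffix idea could be sketched with an atomic increment. This is a minimal sketch, not part of Bogus: `UniqueSuffixer` and `MakeUnique` are hypothetical names introduced for illustration, and the result is only unique as long as the input values do not themselves end in digits (which should hold for lorem words).

```csharp
using System;
using System.Threading;

// Hypothetical helper (not a Bogus type): appends a strictly increasing
// counter to each value so repeated inputs still produce distinct strings.
public sealed class UniqueSuffixer
{
    // Starts at -1 so the first Interlocked.Increment yields 0,
    // matching the post-increment counter in the extension method above.
    private long _counter = -1;

    public string MakeUnique(string value)
    {
        // Interlocked.Increment is atomic, so concurrent callers
        // never observe the same counter value.
        long n = Interlocked.Increment(ref _counter);
        return $"{value}{n}";
    }
}
```

With Bogus, it could be called as `suffixer.MakeUnique(f.Lorem.Word())`. Keeping the counter on an instance rather than in a static field also lets each test or Faker start again from zero.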
I only searched issues for "duplicates", not "unique" 🤦‍♂️. Thanks for the blazing fast response and references.
Version Information
What locale are you using with Bogus?
The default locale. I've never specified the locale.
What is the expected behavior?
I expected new Bogus.DataSets.Lorem().Words(15) to not produce duplicates. That being said, I didn't see any guarantees in the documentation, so this is most likely an assumption on my part.
What is the actual behavior?
I am seeing duplicate words produced, sometimes even adjacent to one another. For example, the seed 1177614182 produces the sequence
How do you reproduce the issue?
So far I have found two seeds that produce duplicates within the first 15 elements: 1177614182 and 1283823404. I'm sure there are more; these are just some I stumbled upon.
Do you have a unit test that can demonstrate the bug?
If the bug is confirmed, would you be willing to submit a PR?
Yeah, sure, but I would need to be pointed in the right direction.