bchavez / Bogus

:card_index: A simple fake data generator for C#, F#, and VB.NET. Based on and ported from the famed faker.js.
Other
8.66k stars 495 forks source link

#411 - Allow for different PRNGs (random algorithms) #412

Closed DaveRMaltby closed 2 years ago

DaveRMaltby commented 2 years ago

Provides a new IRandom interface that is a sub-set of the methods of the System.Random class.

bchavez commented 2 years ago

Thank you @DaveRMaltby for the PR; I think this PR is ready for review now. I'm pretty sure I can take over from here and carry it across the finishing line (if we move forward).

I will need some time to think about this PR to make sure it's a good fit; I have to weigh the pros/cons, cost/benefits, and value.

I know this PR might be of high value for you; but when I review this PR later, I need to get a better understanding of:

I don't know the answers to these which is mostly why I haven't committed on #411 yet.

Thanks, Brian

bchavez commented 2 years ago

Review notes:

DaveRMaltby commented 2 years ago

Brian, I agree with your assessment here. As the repo owner, I totally understand the need to weigh these aspects. I appreciate you considering my need and thank you for even considering it. I like your suggestion of having DefaultRandom inherit Random. I assumed that System.Random would be sealed, but now see that it is not.

Regards, David

bchavez commented 2 years ago

I had the same exact assumption - I originally thought System.Random was sealed too. It seemed more natural to subclass and tac on IRandom for this particular requirement; then confirmed via decompilation and was happily surprised random was not sealed. I'm not a big fan of wrapper classes :grimacing: so I was just experimenting.

But when I pursued that path; a bunch of tests broke because they assumed Randomizer.Seed<System.Random> and I changed the contract to Randomizer.Seed<IRandom>; and I was like... oh yeah... so, I know have to weigh that too... I'm okay with breaking changes generally if there's enough justification for it and it produces better clarity and maintainability.


Out of curiosity, have you tried to subclass System.Random and override method signatures defined in IRandom?

I just now checked the .NET Full Framework and .NET Core sources, and it seems that System.Next*() are marked virtual. Do you think you could get away with a custom PRNG implementation without this PR by overriding all inherited virtual members from System.Random?

DaveRMaltby commented 2 years ago

Clever thinking. I'll have to check on that. No, I haven't tried subclassing System.Random. I'll try that. My only hesitation there, is that I was trying to avoid the expense of the Random ctor. In my previous testing, the seeded ctor was as expensive as calling Next() about 1.2 million times.. but maybe I can make something work that doesn't kill performance in my situation. I'm potentially constructing thousands of PRNGs here. Thanks again.

bchavez commented 2 years ago

The following seems to work:

void Main()
{
   Randomizer.Seed = new CustomRandom();

   var r = new Randomizer();
   r.Number().Dump();
   r.Number().Dump();
   r.Number().Dump();
   r.Number().Dump();
   r.Number().Dump();
   r.Number().Dump();
}

public class CustomRandom : System.Random
{
   public override int Next()
   {
      return 1;
   }
   public override int Next(int maxValue)
   {
      return 1;
   }
   public override float NextSingle()
   {
      return 1.0f;
   }
   public override long NextInt64()
   {
      return 1;
   }
   public override void NextBytes(byte[] buffer)
   {
      buffer[0] = 1;
   }
   public override int Next(int minValue, int maxValue)
   {
      return 1;
   }
   public override long NextInt64(long maxValue)
   {
      return 1;
   }
   public override double NextDouble()
   {
      return 1.0;
   }
   public override void NextBytes(Span<byte> buffer)
   {
      base.NextBytes(buffer);
   }
   public override long NextInt64(long minValue, long maxValue)
   {
      return 1;
   }
}
Output:
1
1
1
1
1
1

The only disadvantage right now would be setting a localSeed for Randomizer by System.Random instance; which could be just an easy change as opening up a new constructor overload for new Randomizer(System.Random).

bchavez commented 2 years ago

Ah, okay, cool. Let me know what you find.

DaveRMaltby commented 2 years ago

Yes, this approach of inheriting from Random would suffice for me. I would only need the new ctor overload that you mentioned of new Randomizer(System.Random)... In my work, I'm constructing so many different Randomizer instances that the public static Random Seed would be hard to use.

DaveRMaltby commented 2 years ago

Please let me know if you'd like for me to submit a PR for just the added Randomizer ctor that takes a System.Random. Thanks!

bchavez commented 2 years ago

@DaveRMaltby I think there are two possibilities:

  1. Create a new Randomizer(System.Random) constructor.
  2. Also, we can make localSeed protected; so that it allows inheriting of Randomizer with the ability to set your own localSeed with a custom constructor; eg:
      public class CustomRandomizer : Randomizer
      {
         public CustomRandomizer(System.Random myRandom)
         {
            this.localSeed = myRandom;
         }
      }

    https://github.com/bchavez/Bogus/blob/59985ee11fb2a34b89d940d5da8b34617f6f9f6f/Source/Bogus/Randomizer.cs#L44

IMHO, I think both of the options above are better than our IRandom abstraction layer.

I'm 50/50; but slightly leaning on No.2 if that one works for you... - typically, in situations like this; I've opened up internal or private members - usually, it keeps public "surface" APIs simple and ergonomic for 99% of use cases, and for those situations that need to take the extension to the next level - the protected non-public members are usually a way to wire-in extensibility or different behavior under the hood.

If you're up for it, you can close this PR and begin a different approach; just as always, please be sure to include unit tests; I rely heavily on the unit tests as a "go/no-go/release-ready/breaking-change/software-quality" signal. Additionally, any new unit tests should be tied to the github issue that raised the change so I can easily remember the context about why that change was needed since this change was initiated by a customer/user. Eg Issue411.cs:

DaveRMaltby commented 2 years ago

@bchavez ,

2 it is. Both options work for me. I've got some important business this evening, I'll close this PR now and tomorrow create a new one with unit tests in an Issue411.cs file. Appreciate again, you working to accommodate me here!