josdejong / mathjs

An extensive math library for JavaScript and Node.js
https://mathjs.org
Apache License 2.0
14.41k stars 1.24k forks source link

Support for normal random distribution #472

Closed levithomason closed 2 years ago

levithomason commented 9 years ago

It looks like distribution() was removed for rethinking. However, it still exists in the docs.

Curious, is there a work around for doing random number generation with normal distribution without this method? Thanks much.

pawsong commented 9 years ago

+1

FSMaxB commented 9 years ago

This is indeed quite strange. It's still in the code, has been in parts ported over to v2's new infrastructure and there exist unit tests for it, but it isn't wired up.

Also it is used with uniform distribution in random and randomInt

In my opinion it's a bug in one of two ways:

@josdejong: Any thoughts?

josdejong commented 9 years ago

Sorry for the confusion, it shouldn't have been in the docs.

The code of distribution itself can use some love, and I removed it for v2.0 because I wanted to rethink the function (and the breaking v2 release was the only opportunity to change it one way or another).

koosvanderkolk commented 7 years ago

Any news on this front?

koosvanderkolk commented 7 years ago

My $0.02:

/**
 * Returns a random number with a Normal distribution
 * @param {float} min
 * @param {float} max
 * @param {int} precision Optional, will increase the number of iterations.
 **/
function normalRandom(min, max, precision) {
  min = min ? min : 0;
  max = max ? max : 1;
  precision = precision ? precision : 6;
  var rand = 0;

  for (var i = 0; i < precision; i += 1) {
    rand += Math.random();
  }

  return (rand / precision)*(max-min)+min;
}

math.import({
  "normalRandom": normalRandom
});

Credits for the function go to http://stackoverflow.com/questions/25582882/javascript-math-random-normal-distribution-gaussian-bell-curve/39187274#39187274

ericman314 commented 7 years ago

Personally, I prefer the Box-Muller transform. It is more exact, and produces two independent normally distributed variables at a time, There is a good implementation in Numerical Recipes which avoids the evaluation of cosine and sine, and caches the second calculated variable for the next time the function is called. Really quite neat!

koosvanderkolk commented 7 years ago

Thanks!

Please do forgive me my ignorance, but what do you mean by Numerical Recipes? http://numerical.recipes?

The same StackOverflow page gave me this:

function normalRandom() {
    var u = 1 - Math.random(); // Subtraction to flip [0, 1) to (0, 1].
    var v = 1 - Math.random();
    return Math.sqrt( -2.0 * Math.log( u ) ) * Math.cos( 2.0 * Math.PI * v );
}

I don't see how anything could be cached in this function.

ericman314 commented 7 years ago

Numerical Recipes is a collection of highly optimized algorithms for C and Fortran, but I just realized you need a license to use the programs directly. I think our best option would be to refer to this article on Wikipedia, which gives the polar form of the Box-Muller transform (which is also the form used in numerical recipes): https://en.wikipedia.org/wiki/Marsaglia_polar_method

The code sample from StackOverflow, normalRandom(), uses the original form of the Box-Muller transform, which requires a cos (and a sin, if you want the second normal variable). Some time later, a polar form was invented that doesn't require the cos or sin.

Essentially, you choose a random point in the unit circle (by choosing a point in a square and rejecting it if it is outside the circle). Then, by some clever transformation, the x and y values of that point are converted into two normally distributed random variables. You can use caching with either the original form or the polar form, since both forms produce two variables that are independent, or not related upon each other.

For reference, here is the code sample from Wikipedia:

double generateGaussianNoise(const double& mean, const double &stdDev)
{
  static bool hasSpare = false;
  static double spare;

  if(hasSpare)
  {
    hasSpare = false;
    return mean + stdDev * spare;
  }

  hasSpare = true;
  static double u, v, s;
  do
  {
    u = (rand() / ((double) RAND_MAX)) * 2.0 - 1.0;
    v = (rand() / ((double) RAND_MAX)) * 2.0 - 1.0;
    s = u * u + v * v;
  }
  while( (s >= 1.0) || (s == 0.0) );

  s = sqrt(-2.0 * log(s) / s);
  spare = v * s;
  return mean + stdDev * u * s;
}
koosvanderkolk commented 7 years ago

Thanks again. What about this?

/**
 * Returns a random number with a Normal distribution
 * @param {float} min The minimum value (default: 0)
 * @param {float} max The maximum value (default: 1)
 * @param {float} stdDev The standard deviation (default: 10% of max-min).
 **/
function normalRandom(min, max, stdDev) {
  min = min ? min : 0;
  max = max ? max : 1;
  stdDev = stdDev ? stdDev : (max-min)*0.10;

  var mean = (min + max) / 2;  
  var spare;
  var u, v, s;
  var randomValue;

  if(typeof window.mathjs_normalRandom_spare !== 'undefined')  {      
    randomValue = mean + stdDev * window.mathjs_normalRandom_spare;    
    delete window.mathjs_normalRandom_spare;
  }else{       
    do {
      u = Math.random() * 2 - 1;
      v = Math.random() * 2 - 1;
      s = u * u + v * v;
    }while((s >= 1.0) || (s === 0));

    s = Math.sqrt(-2.0 * Math.log(s) / s);
    window.mathjs_normalRandom_spare = v * s;
    randomValue = mean + stdDev * u * s;
  }  

  return randomValue;    
}

I'm not sure about keeping min, max and about the default standard deviation.

For my own use case I just need a min and max and in a way its nice to keep the function similar to the 'normal' random function. What do you think?

ericman314 commented 7 years ago

That's looking good. I'll let @josdejong comment on the use of global (static) variables; I'm not sure what his preference is on that.

I wonder if it would be better to use mean and stdDev as arguments to the function, rather than min and max, since theoretically the normal distribution doesn't have a minimum or a maximum value. I think it's what most people would be used to. Then you could convert your own max and min to a mean and stddev before calling normalRandom.

Also, be careful using max = max ? max : 1; -- if a user passes 0 for max, it will be reset to the default of 1.

josdejong commented 7 years ago

Thanks @koosvanderkolk! The usage of a global window.mathjs_normalRandom_spare is indeed a problem, though it's perfectly fine to keep some state in a function like normalRandom. Each instance of math.js should have it's own state though. For an example, see the random variable of the seededRNG function here, defined inside the factory function.

@koosvanderkolk would you be interested in working your solution out in a pull request? (that would be awesome)

koosvanderkolk commented 7 years ago

Thanks for your response! What about the arguments provided to normalRandom? Should these become

function normalRandom(mean, stdDev) {

And is 'normalRandom' the right function name?

I had a look at https://github.com/josdejong/mathjs/blob/master/lib/function/probability/seededRNG.js, but I'm not entirely sure whether I understand what is happening there.... And isn't this singletonRandom a global var as well?

josdejong commented 7 years ago

normalRandom(mean, stdDev) sounds like a good name and signature to me. It may make sense to look at the API's of other random libraries of JS or maybe Python for inspiration.

The ugly singletonRandom thing in seededRNG.js is to defeat a rare bug in seed-random, causing an infinite circular loop when it generates random entropy. I've tried to explain in the comments on top of seededRNG.js and link to the actual issue at the seed-random library. If you find an alternative solution that makes this construction redundant just go ahead :). Maybe a different random seed library doesn't have this issue, that could also be a solution.

koosvanderkolk commented 7 years ago

Thanks! About this pull request: I do have some time left, but could you provide me some guidance on the path to take? Mathjs is quite a overwhelming piece of code :-).

E.g. my current plan would be:

josdejong commented 7 years ago

It may be easiest to copy for example the random.js file, clear it, and start implementing normalRandom there. The new file should be added to probability/index.js to get it loaded in math.js.

Docs for a function are automatically generated from the code comments: just keep the structure that's there in random.js.

For unit tests, you can copy random.test.js and implement new tests. Run via npm test (all) or mocha test/function/probability/random.test.js to run just a single test file.

Please don't hesitate to ask if you have more questions!

troflog commented 4 years ago

What is the status of this?

josdejong commented 4 years ago

This issue is stil open, I think Koos didn't manage to implement this, I haven't heard back.

Anyone interested in picking this up?

koosvanderkolk commented 4 years ago

The project for which this was needed was suddenly cancelled, so I had to move on.

I would be happy to put some effort in it, but would still be struggling on how to implement the global var ("mathjs_normalRandom_spare" in my script) in a mathjs-way...

josdejong commented 4 years ago

Thanks for your update Koos. Please don't feel obligated to pick this up, only if you enjoy it :).

Since we had this discussion, we have had quite a deep change in the dependency injection solution of mathjs (since v6.0.0), it's good to be aware of that. I don't know the exact details of window.mathjs_normalRandom_spare again, but generally we should not rely on a global window with certain functions, but inject the dependencies when creating the function from a factory function. Maybe it helps to have a look at existing functions like factorial.js to understand the idea of factory functions in mathjs, and there is documentation here: extension.html#factory-functions.