kyonifer / koma

A scientific computing library for Kotlin. https://kyonifer.github.io/koma
Other
270 stars 23 forks source link

Better random number generator #80

Closed peastman closed 5 years ago

peastman commented 5 years ago

Random number generation is implemented with java.util.Random. That's a linear congruential generator, which isn't suitable for scientific applications. Or just about any other application where you care how the numbers are distributed. It would be good to replace it with something more robust.

The most popular choice is Mersenne Twister, but that's mainly for historical reasons. It has some statistical weaknesses, and it's generally inferior to newer algorithms. PCG has been getting a lot of attention lately. It's fast, and it passes every statistical test that's been thrown at it. It's also very easy to implement.

drmoose commented 5 years ago

See also #36

kyonifer commented 5 years ago

I've played around with alternative RNG methods a bit, in particular by piping a few twister implementations into some Gaussian generators such as fast-rng-java. Note that in addition to entropy concerns one must also consider speed, as for e.g. monte-carlo applications speed of the generator can be a large factor in overall performance.

Using or implementing our own thing in kotlin could also close #36. As mentioned on the youtrack ticket for kotlin.math.Random, Nat Pryce has an implementation of twister in pure kotlin. The author claims that "MersenneTwisterFast achieve[s] well over twice the speed of MersenneTwister. java.util.Random is about 1/3 slower than MersenneTwisterFast." I haven't verified the validity of the output distribution or run it through any statistical test suites, but it seems promising and was where I was going to start for closing out #36.

We've also got an implementation of polar form Box-Muller in pure kotlin for normal distributed numbers, although ziggurat would be better.

kyonifer commented 5 years ago

A search also turns up a partial implementation of PCG in kotlin. Looks like its already designed to be used as multiplatform for jvm/js. I've not tried it yet though.

peastman commented 5 years ago

The complete PCG algorithm is basically five lines of code. Here's the whole thing. It's also a lot faster than MT.

The only thing that makes java.util.Random slow is that it's thread safe. Other than that, it's completely trivial. The corollary is that the only way any RNG is going to be faster is to give up on thread safety, which isn't necessarily a good idea.

Hmmm. Although now that I think about it, you could optimize it for batch generation. If your main use case is filling matrices with random variables, you could acquire a lock, generate all the values, and then release it. That would be very fast. The downside is that generating single random numbers might be slower than using atomics, as java.util.Random does. (But not necessarily, at least if there's no thread contention. Biased locking in the JVM makes that pretty fast.)

kyonifer commented 5 years ago

The corollary is that the only way any RNG is going to be faster is to give up on thread safety, which isn't necessarily a good idea.

Not necessarily-- I've been able to get very fast performance out of a fork join pool-based rng that is still thread safe. Each call to randn used up available workers to mine random number sequences on free cores. Each call to randn uses different workers from the pool, hence it was still perfectly thread-safe (i.e. calls to rand from more than one thread was safe). You run into some seeding issues (similar to those in ThreadLocalRandom, which doesn't have the locking issues you mentioned) but those can be overcome with some initial synchronization and careful sequence weaving during the thread joining. Such an approach is much faster than naively using java.util.Random as it is single-threaded.

The complete PCG algorithm is basically five lines of code. Here's the whole thing. It's also a lot faster than MT.

To be clear, I'm not against PCG at all. Whatever is the easiest, most correct and fastest solution is good with me.

peastman commented 5 years ago

Right, I was assuming you wanted a single random sequence. If you're ok with multiple independent generators, you could just use a ThreadLocal to get a separate generator for every thread. The initial seed value provided by the user can be used to initialize a simple RNG (for example, a java.util.Random) which then produces the seeds for the real generators as they're created.

kyonifer commented 5 years ago

I'd love to scope out a fully concurrent RNG module (my ultimate goal with #36), where the user is able to configure exactly how concurrent it is. There's a lot of interesting things we could do. For example, we could have (disableable) low-priority background threads generate randomness into a buffer, so the pre-generated numbers are ready for the user thread when it needs them. The user could specify how many workers to spawn and whether or not they should run as an on-demand batch job or async background task.

However, we could break this up into smaller tasks. Step 1 is to switch over to our own simple solution, a single-threaded implementation that uses our favored algorithms, perhaps starting with PCG and either Box-Muller/Ziggurat.

peastman commented 5 years ago

Ok, I can implement that. Regarding thread safety, how about this? The actual generation of random numbers will be in non-threadsafe internal methods, for example nextDoubleUnsafe(). The public methods like nextDouble() will acquire a lock then call the internal method. So thread safety is guaranteed for all external code. Internal code has the option of acquiring the lock itself and then calling the internal method. That way, creating a random matrix will only involve acquiring it once, not once per element.

kyonifer commented 5 years ago

👍 That seems like a reasonable approach for a first cut, as it would allow us to generate larger matrices without lock spam. I can think of some useful additions (thread-local seeds, generation for Float matrices and ranges over integral types), but we'll save those for the second pass.

kyonifer commented 5 years ago

Closing this as solved by #82, and opened new issues #84 #85 #86 #87 #88 for the improvements discussed here.