art4711 / random-double

An algorithm for generating random doubles.
12 stars 1 forks source link

Relevant prior art: dSFMT #1

Closed pao closed 9 years ago

pao commented 9 years ago

(I tried to post this as a comment to your blog post, but it didn't post, and there was no message telling me it was going into a moderation queue. So I apologize if this is redundant.)

You might want to take a look at dSFMT, which directly generates double-precision random numbers, including uniform on [0, 1): http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/

This algorithm still has undesirable statistical properties, however, as discussed at JuliaLang/julia#6464.

art4711 commented 9 years ago

I've actually seen a paper about it. It spent a lot of time talking about the PRNG which is a quite solved problem (mersenne twister isn't the answer other than some very specialized applications where you care about speed before quality) and then did some hand-waving and suddenly we magically had floating points. It's the hand-wavy bit that I'm actually interested in.

Looking at the code it's completely ignoring the problem of mapping between sets with different number of elements or with different precision. For example, generating perfectly random numbers [1,2) is trivial, it's one of the first things I did in rd.c. Mapping that to [0,1) without losing information can not be done by simply subtracting 1.0, but that's exactly what that code does.

I just committed a file called "some-more-tests.c" that shows why that doesn't work.

pao commented 9 years ago

mersenne twister isn't the answer other than some very specialized applications where you care about speed before quality

Off topic a bit, but those applications aren't that specialized. At any rate, as you identify it's not the bit that's relevant to what you're exploring.

Looking at the code it's completely ignoring the problem of mapping between sets with different number of elements or with different precision. For example, generating perfectly random numbers [1,2) is trivial, it's one of the first things I did in rd.c. Mapping that to [0,1) without losing information can not be done by simply subtracting 1.0, but that's exactly what that code does.

Whee! That is indeed a problem.

If you're looking for help, you may find some sympathetic bit-twiddlers hanging around the Julia community. We've been looking at alternatives to MT as default random number generator (JuliaLang/julia#8786) but that discussion hasn't addressed correctly extending such a generator to floating point.

art4711 commented 9 years ago

We've been looking at alternatives to MT as default random number generator

Please, please, please. Whatever you do, pick a cryptographically secure generator as the default and then maybe make it overridable to something worse if someone really wants it. People screw it up so often.

Look at the recent talks and rants from openbsd about the topic: http://www.openbsd.org/papers/hackfest2014-arc4random/index.html http://www.tedunangst.com/flak/post/random-in-the-wild http://marc.info/?l=openbsd-tech&m=141776286105814&w=2 http://marc.info/?l=openbsd-cvs&m=141807513728073&w=2 http://www.openbsd.org/papers/eurobsdcon2014_arc4random/index.html

pao commented 9 years ago

Choosing a CSPRNG as the default is unlikely--it serves to make choosing random variates (which is what technical computer users are generally doing) slow, to get strength that the majority of the users don't need. (I note that of the list of things that Ron Rivest complained about on our mailing list, this was somewhat surprisingly not one of them.) The trouble is that for the applications that Julia is targeted at, your users (i.e., me) don't need security, they need thousands of Monte Carlo replications completed quickly. Statistical randomness is sufficient for that.

So while your concern is appreciated--and despite my sympathy for "safe by default" choices--this one's hard because it is such a pain point for exactly the group of users Julia attracts.

Feel free to state a case in that issue, though; you'll certainly be more effective than I would at it (since I have decidedly mixed feelings on the matter)!

Anyways, we've strayed quite far from the base topic...rather than use this as my personal message board to you, I'll let this lie in peace as you have in fact seen dSFMT and found it (for good reason) wanting.

art4711 commented 8 years ago

I just remembered this conversation. You probably won't get this (I have not idea how github notifies people), but in case you do, I did a thing: https://github.com/art4711/randbench (regarding the "CSPRNG is slow" meme) and https://github.com/art4711/unpredictable. With the new compiler that will be in go 1.7, I'm down to 25ns per 64 bit number which is just 2.5x slower than the default PRNG in Go.

pao commented 8 years ago

Github does notifications quite well, as it happens. On the other hand, I'm not really strongly coupled to Julia development at the moment. But I'll glom your comment onto https://github.com/JuliaLang/julia/issues/8786. Thanks for letting me know!