CleverRaven / Cataclysm-DDA

Cataclysm - Dark Days Ahead. A turn-based survival game set in a post-apocalyptic world.
http://cataclysmdda.org
Other
10.67k stars 4.18k forks source link

REQ: Use a cryptographic quality random number generator #5002

Closed SiGhTfOrbACQ closed 10 years ago

SiGhTfOrbACQ commented 10 years ago

It is impossible to 'balance' a game, when the random number generators vary in quality and level of randomness across platforms.

By using a cryptographic quality random number generator - we can be sure we are balancing the game on all platforms and not be 'balancing' a 'poor' random number generator on a specific platform.

The 'gold standard' random number generator is a Mersenne Twister. Here are three examples of a Mersenne Twister:

kevingranade commented 10 years ago

I'm not convinced the distribution of Mersenne Twister is significantly more even than system rand calls, we don't care about other cryptographic properties of the rand algorithm, such as cycles. Some light reading on the subject pointed me at http://www.azillionmonkeys.com/qed/random.html however, which points out that we might be introducing a slight bias in our handling of the output of rand(). This would be systemic regardless of the source of randomness, so that might be worth taking a look at.

SiGhTfOrbACQ commented 10 years ago

Except it isn't working. We recently had a player on IRC whom had a game that created 168 canabis in one of his basements. Everybody agreed that we needed to re-balance the game because of this. However, this is not something I have ever encountered. So, its not reproducible - but it obviously happened. From this I can conclude that cataclysm is producing unbalanced games on one platform and not another. This is precisely because:

  1. rand() is not a great source of pseudo-random!
  2. rand() is implemented differently on each of the three platforms, leading to different qualities of pseudo-random.
  3. Seeding periodicity is important to get good pseudo random from pseudo random number generators. With low periodicity you get predictable results.

It is basic science to control variables. To control the variables in this case is to eliminate as much as possible those things that can be sources of variability. If we over engineer the random number generator; we eliminate the problems like the one that we experienced on IRC. It is also good software engineering. By eliminating a contention; we know that efforts to balance the game are efforts that produce expected results as we know that the the source of random is not the issue due to the over engineering.

Additionally, its already built in to the standard C++ libraries, and over thousands of samples it produces much better pseudo-randomness.

kevingranade commented 10 years ago

"Everybody agreed that we needed to re-balance the game because of this.", Sorry I missed the party, but I neither find that a compelling case for re balancing the game, nor an indictment of the quality of random number generation. There IS a compelling case for reworking item spawning, it uses a crazy iterative approach not guaranteed to terminate except by the law of large numbers, no matter the quality of your random number generator. That is most likely the real culprit of the excess cannabis.

Are we reading the same page? "However, the max error value for Mersenne Twister tells that there were situations where standard rand produced significantly better results." Mersenne Twister had SOME better runs, but rand(), when handled properly, had less severe outliers, and the average error was better.

So finally I conclude, that for getting random value in range [0,LIM)
this line:

 random_value = rand() % LIM;

is fine for almost all non scientific applications.

If we were using stl++11, I could strongly consider using its implementation since that would be negligible in development overhead, but due to the fact that we use a wide variety of compilers (VS and mingw are the offenders) and the fact that C++11 doesn't have support for partial adoption (no way to enforce a subset of the language), we aren't currently using it, so it would require importing an implementation, which just isn't compelling.

swwu commented 10 years ago

@dennisgroves, was going to write a response but @kevingranade already hit most of the points I was going to bring up.

I do want to mention that the test methodology in the linked article is probably more a test of the parameters/configuration in each implementation than the actual quality of the PRNG algorithm, since it's purely a test of the evenness of each PRNG's uniform distribution (which varies heavily depending on configuration).

The actual largest differences between a cryptographically secure and insecure PRNG are:

Since both of these qualities depend on the ordering of the outputs, neither of them are actually measured when considering the outputs as a binned unordered whole, like the linked article does. It's also, of course, perfectly fine that rand() possesses neither of these qualities, since the ability of item spawn distributions in CDDA to resist cryptanalysis is not high on the feature list.

Incidentally, the measure of PRNG "quality" in the linked article diverges so strongly from what's normally considered in determining the quality of a sequence of random numbers that I'm not certain the author or his article should really be invoked as any kind of authority on the subject.