HerculesWS / Hercules

Hercules is a collaborative software development project revolving around the creation of a robust massively multiplayer online role playing game (MMORPG) server package. Written in C, the program is very versatile and provides NPCs, warps and modifications. The project is jointly managed by a group of volunteers located around the world as well as a tremendous community providing QA and support. Hercules is a continuation of the original Athena project.
http://herc.ws
GNU General Public License v3.0

pseudo Random Number Generator improvements? #2565

Open IIIlllII opened 4 years ago

IIIlllII commented 4 years ago

First of all, excuse my ad hoc approach. If time allows, I'll come up with a more scientific approach with better documentation.

Is your feature request related to a problem? Please describe.

  1. The following is actually more a question than a request or proposal, so I cannot present any solution or alternatives, since I might be missing something here. If I'm not mistaken, the frequently used method to generate random numbers is genrand_int31() from mt19937ar.c (according to random.c and random.h). I'm wondering why the return value is shifted right by 1 bit, since this removes 2^31 entries from the set of potential numbers.

  2. If I'm not mistaken, the pRNG in mt19937ar.c is a multiplicative congruential generator. This could lead to repetitions and alternations, especially in the lower bits. This by itself can lower the quality of the random numbers, and the effect might be amplified, if not exponentiated, by the way items to be dropped are chosen, for example. According to mob.c, this calculation is used: "if (rnd() % 10000 >= drop_rate)". So only entries of ~0x3F of a random number are used to determine whether to drop an item or not. All of the above might result in a poor distribution.

I ask you for any input or feedback. Please do correct me if anything is wrong or seems to be wrong. I really really appreciate it.

Describe the solution you'd like

One could contemplate using a more sophisticated pRNG overall. To my knowledge there are a lot of very capable generators which, to my understanding, should be free to use, published either as CC0 or under the GPL. They could increase the load on the server, though, which might not be desirable.

Describe alternatives you've considered

If you should choose not to change the pRNG, there is another possible improvement. To my understanding, changing the line "if (rnd() % 10000 >= drop_rate)" to "if (Math.floor((rnd() * 10000) / 2^32) >= drop_rate)" [assuming an int32 architecture and overflows to be carried over] could give the higher bits of the random numbers more significance in the distribution. Please do correct me on any mistakes and errors. Thank you for your time.
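For reference, the scaling idea described above translates to integer C roughly as follows. This is a hedged sketch, not Hercules code: `scale_high_bits` is a hypothetical helper name, and it assumes a full 32-bit uniform input rather than the project's actual rnd() range.

```c
#include <stdint.h>

/* Multiply-then-shift scaling: maps a uniform 32-bit value r to
 * [0, n) using the HIGH bits of r rather than the low ones.
 * Sketch only; `scale_high_bits` is an illustrative name, not an
 * existing Hercules function. */
static uint32_t scale_high_bits(uint32_t r, uint32_t n)
{
    /* The 64-bit product r * n lies in [0, n * 2^32); shifting
     * right by 32 floors it into [0, n). */
    return (uint32_t)(((uint64_t)r * n) >> 32);
}
```

Note that this changes which bits drive the result but does not by itself remove bias: unless 2^32 is a multiple of n, some outputs are still slightly more likely than others.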

MishimaHaruna commented 4 years ago

You mention that the PRNG we're using is a multiplicative congruential generator, but to the best of my knowledge, it's a twisted generalized feedback shift register generator.

It's true that LCG/MCG generators and GFSR/TGFSR generators have several points in common, and that the MT we're using has some flaws (such as a very slow recovery from some initial states - which is resolved in an updated 'SFMT' version that I'm aiming to switch to as soon as I have time to test it), but it is generally considered a good and fast PRNG and very widely used. Details and papers at http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html

Do you have any data that you can share with us, that shows the way we're using it results in a poor, unsuitable distribution?

Concerning the other point you mention, the use of genrand_int31(): the reason is that it returns a positive signed 32 bit integer (in the range [0, 0x7fffffff] - 31 bits plus a sign bit that is always 0), which is what we need most of the time, and is safe to store in an int without overflowing into a negative value. It does so by sacrificing the least significant bit of the original 32 bit result.
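As a sketch of that point (illustrative only; `to_int31` is a hypothetical name, and the real mt19937ar.c derives the value directly from the generator's 32-bit output):

```c
#include <stdint.h>

/* Derives a non-negative 31-bit value from a 32-bit generator
 * output by discarding the least significant bit, as described
 * above. The result is always in [0, 0x7fffffff], so it fits in
 * a signed 32-bit type without becoming negative. */
static long to_int31(uint32_t r32)
{
    return (long)(r32 >> 1);
}
```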

Switching to another PRNG is of course an option, if there's any evidence that the Mersenne Twister isn't suitable for our needs, and if there's an efficient alternative (again, please provide data to back your claims).

peteroupc commented 4 years ago

There are many alternatives to the Mersenne Twister, which is a generator with a number of disadvantages explained in further detail in "It Is High Time We Let Go of the Mersenne Twister" by @vigna.

Perhaps the biggest disadvantage of the Mersenne Twister is its large state (about 2.5 kilobytes) compared to other modern PRNGs, whose states are typically 128 to 256 bits with comparable randomness quality.
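As an illustration of how compact such modern generators are, here is a sketch of xoshiro256** (one of @vigna's generators), whose entire state is four 64-bit words (32 bytes) versus roughly 2.5 KB for MT19937. The constants come from the public-domain reference implementation, and the seeding helper uses splitmix64 as its author recommends; this is a size comparison, not a proposed Hercules patch.

```c
#include <stdint.h>

/* xoshiro256**: entire generator state is 32 bytes. */
typedef struct { uint64_t s[4]; } xoshiro256;

/* splitmix64, used to expand one 64-bit seed into the full state;
 * it never produces an all-zero state from any seed. */
static uint64_t splitmix64_next(uint64_t *x)
{
    uint64_t z = (*x += 0x9e3779b97f4a7c15ULL);
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

static void xoshiro256_seed(xoshiro256 *g, uint64_t seed)
{
    for (int i = 0; i < 4; i++)
        g->s[i] = splitmix64_next(&seed);
}

static uint64_t rotl64(uint64_t x, int k)
{
    return (x << k) | (x >> (64 - k));
}

/* One step of the generator; returns a uniform 64-bit value. */
static uint64_t xoshiro256_next(xoshiro256 *g)
{
    const uint64_t result = rotl64(g->s[1] * 5, 7) * 9;
    const uint64_t t = g->s[1] << 17;
    g->s[2] ^= g->s[0];
    g->s[3] ^= g->s[1];
    g->s[1] ^= g->s[2];
    g->s[0] ^= g->s[3];
    g->s[2] ^= t;
    g->s[3] = rotl64(g->s[3], 45);
    return result;
}
```

Because the state is a plain struct rather than a global array, multiple independent streams are also cheap, which a server might find useful.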


It would be better if this project used a cryptographic RNG in most places, especially since this is an online game. Unfortunately, however, due to the way this project uses a global RNG and allows plugins to seed that global RNG for reproducibility, it may be difficult to change the RNG without affecting compatibility with plugins.

In general, as I explain in my RNG recommendations, an API that "implements an automatically-seeded RNG ... SHOULD NOT allow applications to initialize that same RNG with a seed for reproducible 'randomness'", because doing so "would hamper forward compatibility".

As it is, this project can change the RNG's implementation (given in random.c) without affecting backward compatibility — until some code anywhere in the program (whether in a plugin or elsewhere) calls rnd_seed; from then on, the entire program has to rely on the sequence determined by that seed, and by extension on which implementation the RNG uses.

Moreover, the seed taken by rnd_seed can only be 32 bits long, which, in the case of Mersenne Twister, is a tiny fraction of the number of seeds the PRNG can admit in theory (namely 2^19937 - 1 seeds).

MrKrzYch00 commented 2 months ago

Let me describe a problem some may perceive with the normal, traditional, built-in rand(): finding something strange in the results and blaming the RNG, when the RNG actually works as expected, and finally switching to other solutions which, rather than eliminating the problem completely, merely make it so small that it's practically unnoticeable.

The problem lies in the belief we coders share that applying the modulo operation to a random number drawn from an arbitrary range is simply how it should be done, when it isn't. But if the maximum is high enough, it's fine, right? Closer, but not entirely.

The right solution to the problem of C's built-in RNG giving unexpected results when combined with the modulo operation is to limit the range of the generated random number before applying modulo; otherwise, some residues (the "leftovers") have a higher probability of being chosen. If such a limit cannot be specified directly, as with random.randint(0, 29999) in Python, we have to keep fetching random numbers until one falls in the correct range FIRST, and only then apply the modulo operator to that number. This rejection approach avoids modulo bias.

The simplest way to test for bias is to create a list of numbers. To this list, add 1 million results of rand() % 10000, with RAND_MAX = 32767; then count the residues 0-2767 and 2768-9999 separately and normalize the counts by the sizes of the two ranges. You should notice that the residues 0-2767 have been given a "priority" of roughly 1.33:1 (each value in 0-2767 can be produced by 4 inputs in [0, 32767], the rest by only 3). Alternatively, if we first ensure that the number is in the range 0-29999 and only then perform the modulo operation, the overhead in extra calls to rand() is about 9% (for this particular scenario of modulus = 10000; the lower the modulus, the less overhead there should be), and we get a result on par with solutions that use a much larger upper bound for random number generation (which probably adds unnecessary memory and CPU overhead as well), yet we eliminate the bias completely.
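The rejection loop described above can be sketched in C as follows; this is an illustration of the technique, not code from the project, and `rand_unbiased` is a hypothetical name.

```c
#include <stdlib.h>

/* Unbiased bounded random number via rejection sampling: keep
 * drawing from rand() until the value falls below the largest
 * multiple of `bound` that fits in [0, RAND_MAX + 1), then reduce
 * it with modulo. Assumes 0 < bound <= RAND_MAX. */
static int rand_unbiased(int bound)
{
    /* Example: RAND_MAX = 32767, bound = 10000 gives limit = 30000,
     * matching the 0-29999 acceptance range described above. */
    const unsigned limit = ((unsigned)RAND_MAX + 1u)
                         - (((unsigned)RAND_MAX + 1u) % (unsigned)bound);
    unsigned r;
    do {
        r = (unsigned)rand();   /* discard biased "leftover" values */
    } while (r >= limit);
    return (int)(r % (unsigned)bound);
}
```

Every residue in [0, bound) is now produced by exactly limit / bound accepted inputs, so the distribution is exactly uniform at the cost of occasionally retrying.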

Graphical representation of the problem of running rand() % 10000 with RAND_MAX = 32767, for 1,000,000 tries, without any preparation before applying modulo to the generated random number (no changes in weights, just the final counts from the 1,000,000-element list): (image)

Graphical representation of the same, but with rand() called in a loop, discarding any number that shouldn't have the modulo operation applied to it: (image)

Depending on the final check (the type of comparison as well as the value checked against), unremoved modulo bias can make certain items drop with a higher or lower chance than expected, and likewise make various "has x% chance" effects trigger more often than expected. As long as not every number has the same mathematical probability of occurring in the final result, the bias shrinks as the generator's upper bound grows, but it still exists.

Finally: in almost all places (besides one line in battle.c), this project, as far as I can see, uses an RNG interface backed by genrand_int31, whose maximum value is 2,147,483,647 (2^31 - 1). That value is high enough to reduce the bias considerably, yet not to eliminate it. While the probability of a leftover being preferred is very small, it isn't zero. At this point it's all about the dev team's preference: eliminate the bias completely, keep that negligible imperfection, or pick the most performance-optimal solution (possibly per scenario), keeping in mind that performance would have to be measured with bias elimination included.
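The size of that residual bias can be quantified with a small helper; this is an illustrative sketch, not Hercules code, and `mod_bias_of` is a hypothetical name.

```c
#include <stdint.h>

/* For a generator with `range` equally likely outputs reduced with
 * `mod`, the residues 0..(range % mod - 1) each occur one extra
 * time compared to the others. Assumes mod > 0. */
typedef struct {
    uint64_t favored_residues; /* residues 0..favored_residues-1 get +1 */
    uint64_t base_count;       /* occurrences of every other residue */
} mod_bias;

static mod_bias mod_bias_of(uint64_t range, uint64_t mod)
{
    mod_bias b;
    b.favored_residues = range % mod;
    b.base_count = range / mod;
    return b;
}
```

For genrand_int31's 2^31 possible outputs and modulus 10000, this gives 3648 favored residues occurring 214,749 times each versus 214,748 for the rest, a relative bias of about 1 in 214,748; with RAND_MAX = 32767 the same formula reproduces the 4-vs-3 split described earlier.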