Bubb13 / EEex

An executable extender for Beamdog's Enhanced Edition of the Infinity Engine
50 stars 7 forks source link

request to add support for drake127/randomrolls .exe patch #52

Open ewelsh42 opened 1 year ago

ewelsh42 commented 1 year ago

drake127 has released a mod (https://github.com/drake127/randomrolls) that patches the infinity engine .exe files to replace the rather poorly behaving standard C rand() random number generator with the RDRAND instruction available on more modern processors, which should greatly improve the randomness (no more improbably frequent clusters of high or low rolls, no more repeated sequences of rolls). Currently, EEex and/or InfinityLoader doesn't support the modified binaries, so EEex fails to load. Could you please add support for the randromrolls-patched versions of the various infinity engine binaries, so that we can have EEex with improved random numbers?

Does EEex already replace the random number generator with a better one? If not, could it be modified to do so, so that users with older processors that do not have the RDRAND instruction could benefit from improved random numbers as well?

Bubb13 commented 1 year ago

I have replaced the pattern InfinityLoader uses to find rand() so that patching the function in-exe doesn't break EEex. I'll keep this issue open to remind me to look into improving rand() without RDRAND.

ewelsh42 commented 1 year ago

Awesome! Thanks a lot! I don't know much about Lua, or what version the EE games are using, but it looks like math.random() in Lua versions >= 5.4 uses the xoshiro256** generator, which is, arguably, one of the best around. Lua <= 5.3 used rand(). I know even less about how you dynamically patch new code into the programs, but if the Lua version is new enough, Lua already has a great built-in random number generator without having to implement one yourself?

If you do need to implement one yourself, I'd like to recommend checking out the website of the xoshiro256** authors (https://prng.di.unimi.it/#shootout). Their publications are interesting reads, and the website has well-commented public domain C code that is easy to read. The algorithms are pretty straightforward to implement, mostly involving a few bit-shifts and xors on the relatively small state array.

suy commented 1 year ago

FWIW, with modern C++ it is very easy to have a proper number generator that uses hardware for the seeding, but uses a PRNG for the repeated calls. I've not checked if new language features provide something even simpler/better, but with plain C+11, this is good enough:

#include <random>
#include <iostream>

int main()
{
    // 32 bit Mersenne Twister. No need for distribution as it already generates
    // values between 0 and 2^32 - 1 (if that's what you want).
    static std::mt19937 rand = [](){
        // This immediately executed lambda is a trick so device and engine are
        // created only once (due to the static) since repeated creation would
        // produce the same numbers without a seed, and we don't want to seed on
        // each invocation and waste entropy. And we also don't want the
        // random_device to be kept alive since might keep a file handle open
        // (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0205r0.html).
        // This way only the Mersenne Twister is moved out of the lambda on
        // first use, and the rest should be released.
        std::random_device device;
        std::seed_seq seed{device(), device(), device(), device(),
                           device(), device(), device(), device()};
        return std::mt19937(seed);
    }();
    std::cout << rand() << std::endl;
    std::cout << rand() << std::endl;
}

The std::random_device uses hardware entropy if available, so it is not to be abused since it can be slower. And falls back to software if unavailable. This is just using it to seed the pseudo random number generator (the Mersenne Twister algorithm). But this one is of much better quality than old rand(), and should not produce the usual known problems.

If you want a specific range to be supported for compatibility with rand(), you need to plug a distribution, but it's easy:

int upgradedRand()
{
    static std::mt19937 prng = [](){
        std::random_device device;
        std::seed_seq seed{device(), device(), device(), device(),
                                        device(), device(), device(), device()};
        return std::mt19937(seed);
    }();
    std::uniform_int_distribution<> distribution(0, RAND_MAX);

    return distribution(prng);
}

int main()
{
    for (int i = 0; i < 10; ++i)
        std::cout << upgradedRand() << std::endl;
}

I would need to review if something new/easier has come up in newer C++, but I think this is more than enough for the IE. And, as Bubb knows, I did some research some time ago, and it did not seem like the rolls where that bad for regular gameplay. Sure, it's a trashy random number generator, but it's impossible for players to see patterns when CUtil::UtilRandInt gets called for different things that the player doesn't even see.

ewelsh42 commented 1 year ago

The xoroshiro authors have an interesting paper about several shortcomings of the Mersenne Twister (http://arxiv.org/abs/1910.06437). Although these generally shouldn't be noticeable in an RPG (and were never noticeable in Temple of Elemental Evil), generators with better statistical properties, and better tests for them, have come out since then, so, since they are simple to code up and aren't, to my knowledge, patent or license encumbered, I'd recommend using one of the them instead. See https://prng.di.unimi.it/ for a good survey of current-ish generators and which tests they do or do not fail, along with their corresponding detailed manuscript (https://vigna.di.unimi.it/ftp/papers/ScrambledLinear.pdf).

Anecdotally, Mersenne Twister should still be OK for an RPG. I was part of a forum discussion back in the day for Temple of Elemental Evil, in which the developers stated they used the "Quick and Dirty" generator from Numerical Recipes. The Numerical Recipes authors state, in the book, that this doesn't generate random numbers, just somewhat random numbers -- truly quick and dirty. As such, when I tested it with George Marsaglia's DIEHARD battery of randomness tests, it failed miserably, and wasn't even able to pass one of the simulated card game tests. This resulted in very noticeable problems in ToEE, such as improbably frequent runs of your entire party rolling 1's and 2's in a row, or being able to easily roll up superheroes during character generation due to improbably frequent runs of high rolls. The developers eventually relented and switched to the Mersenne Twister in their final released patch, and it made a HUGE difference in gameplay. I never noticed any issues with randomness in ToEE after the final patch, Mersenne Twister appeared to work just fine for RPG purposes. I'd still recommend the xoroshiro256 generator instead, though. Just seed the small state array with SplitMix64() calls, after seeding SplitMix64() with time or something -- no need for a hardware seed, SplitMix64() is sufficiently random and independent of xoroshiro256 for good state initialization. Why settle for passing most statistical tests of randomness, when we can pass them all, using smaller code, less memory, and fewer CPU cycles per call? :-)

The ToEE generator might have been worse than rand(), and the nature of the game being turn based with all the rolls easily visible made it that much easier for the player to spot that something was up. I, too, have never noticed or heard of people complaining about randomness during IE gameplay, but, maybe the real time with pause nature makes it much less likely anyone would notice the same sorts of issues that originally plagued ToEE? drake127 evidently noticed something was up during character generation, so, even though rand() isn't as glaring of a problem as the original generator in ToEE, I think replacing it with a more modern, statistically robust generator can only be a good thing. It would also serve to squash, once and for all, any potential future questions about the quality of the random numbers in the game :-)