imneme / pcg-cpp

PCG — C++ Implementation
Apache License 2.0

Question regarding usage #69

Closed eike-fokken closed 3 years ago

eike-fokken commented 3 years ago

Hi there!

I'm trying to use PCG for some hopefully cross-platform code. As I don't know where my code will be compiled and used, I would like to make sure that random numbers are as random-like as possible.

Therefore I thought about seeding pcg32 and the like with both a std::random_device and

static_cast<std::random_device::result_type>(
            std::chrono::high_resolution_clock::now()
                .time_since_epoch()
                .count())

but I have no idea how to do that. In addition, I couldn't work out from https://www.pcg-random.org/ how to seed the generators with deterministic data, e.g. to make runs reproducible. Could someone point me in the right direction? Even more appreciated would be some sample code like:

// add time here?
pcg_extras::seed_seq_from<std::random_device> seed_source;

/// add current time here?
pcg64 rng(seed_source);

which uses both std::random_device and the current high-resolution time for entropy or, even better, adds the option of entropy provided by the user.

I'm not sure this is the right place for this, so feel free to send me elsewhere.

lemire commented 3 years ago

You seem to be aware of the API. Won't the following do?

rng.seed(pcg_extras::seed_seq_from<std::random_device>());

eike-fokken commented 3 years ago

Thanks for the quick reply! As for your hint: I don't know. Would that incorporate the current time? I must say that I'm apparently not good enough at C++ to understand the definition of engine::seed:

    template<typename... Args>
    void seed(Args&&... args)
    {
        new (this) engine(std::forward<Args>(args)...);
    }

This looks to me like it throws out the old seed data and replaces it with new seed data from the random_device. Is that correct?
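The seed() quoted above uses placement new to construct a fresh engine on top of the existing object, so the previous state does not survive. Here is a minimal sketch of that idiom, assuming a hypothetical toy_engine (a plain 64-bit LCG written for illustration, not the pcg-cpp engine) and a hypothetical helper reseed_matches_fresh:

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <utility>

// Hypothetical toy generator; NOT the pcg-cpp engine, only the same seed() idiom.
class toy_engine {
    std::uint64_t state_;

public:
    explicit toy_engine(std::uint64_t seed = 42u) : state_(seed) {}

    // Same shape as the seed() quoted above: build a new engine in place,
    // so the constructor overwrites whatever state existed before.
    template <typename... Args>
    void seed(Args&&... args) {
        new (this) toy_engine(std::forward<Args>(args)...);
    }

    // One step of a 64-bit linear congruential generator.
    std::uint64_t operator()() {
        state_ = state_ * 6364136223846793005ULL + 1442695040888963407ULL;
        return state_;
    }
};

// Returns true if a used engine, once reseeded, behaves like a fresh one.
bool reseed_matches_fresh(std::uint64_t seed) {
    toy_engine used(1u);
    used();
    used();                  // advance so the old state differs from the seed
    used.seed(seed);         // placement new: the old state is discarded
    toy_engine fresh(seed);  // reference engine built directly from the seed
    return used() == fresh();
}
```

In this sketch, calling reseed_matches_fresh with any seed returns true, because after seed() the engine holds only what the constructor wrote.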

lemire commented 3 years ago

Would that incorporate the current time?

Why would you seed with the current time when you can seed with std::random_device?

In C++, there are basically two ways to go about it: std::random_device or a fixed seed. You seem to want to do something else. Can you explain what you want to do?

eike-fokken commented 3 years ago

Can you explain what you want to do?

What I actually want to do is this: whoever uses my code should see random values (my application draws samples from some distribution for scientific simulations). I'm afraid that if I rely solely on random_device, someone might run it on a computer that doesn't have a hardware RNG, or, if I understand correctly, every MinGW user would get deterministic values.

My idea for fixing this was to seed with the random_device AND the current time. But maybe it's a bad idea. I guess I can do this by using the randutils.hpp header and this:

    std::random_device ro;
    randutils::auto_seed_256 seed_source{
        ro(),
        ro(),
        ro(),
        ro(),
        ro(),
        ro(),
        ro(),
        ro(),
        ro(),
        ro(),
        static_cast<std::random_device::result_type>(
            std::chrono::high_resolution_clock::now()
                .time_since_epoch()
                .count())};

    pcg64 rng(seed_source);

But I don't know how many times to call ro(), and from the look of the code alone I suspect that I'm misunderstanding something...

eike-fokken commented 3 years ago

I now arrived at this code:

    std::random_device ro;

    std::array<uint32_t, 11> args(
        {static_cast<uint32_t>(std::chrono::high_resolution_clock::now()
                                   .time_since_epoch()
                                   .count()),
         static_cast<uint32_t>(ro()), static_cast<uint32_t>(ro()),
         static_cast<uint32_t>(ro()), static_cast<uint32_t>(ro()),
         static_cast<uint32_t>(ro()), static_cast<uint32_t>(ro()),
         static_cast<uint32_t>(ro()), static_cast<uint32_t>(ro()),
         static_cast<uint32_t>(ro()), static_cast<uint32_t>(ro())});
    randutils::auto_seed_256 seed_source(args);

    pcg64 rng(seed_source);

which hopefully uses the maximum amount of entropy from the truly random source, plus the time for good measure.

eike-fokken commented 3 years ago

Oh... I just realized that this is stupid and I should just use

randutils::auto_seed_256 seed_source;

because the default constructor does all this entropy gathering.

Sorry for taking up your time.

lemire commented 3 years ago

For people reading this, @eike-fokken is following Melissa's blog post https://www.pcg-random.org/posts/simple-portable-cpp-seed-entropy.html and the accompanying code samples.
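For readers without randutils.hpp at hand, the idea discussed in this thread (mix std::random_device output with the clock and optional user-supplied values) can be sketched with the standard library alone. This is not what auto_seed_256 does internally. It is an illustration under stated assumptions: gather_seed_words and make_seeded_engine are hypothetical helper names, eight random_device words is an arbitrary choice, and std::mt19937_64 stands in for pcg64.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: collects seed words from caller-supplied values,
// std::random_device, and the high-resolution clock, in that order.
std::vector<std::uint32_t> gather_seed_words(
        const std::vector<std::uint32_t>& user_entropy = {}) {
    std::vector<std::uint32_t> words(user_entropy);

    // Eight 32-bit words from random_device (an arbitrary count; the device
    // may be deterministic on some platforms, hence the extra sources).
    std::random_device rd;
    for (int i = 0; i < 8; ++i) {
        words.push_back(static_cast<std::uint32_t>(rd()));
    }

    // Clock ticks split into two 32-bit halves so no bits are dropped.
    const auto ticks = static_cast<std::uint64_t>(
        std::chrono::high_resolution_clock::now().time_since_epoch().count());
    words.push_back(static_cast<std::uint32_t>(ticks));
    words.push_back(static_cast<std::uint32_t>(ticks >> 32));
    return words;
}

// Hypothetical helper: seeds an engine from the gathered words.
// std::mt19937_64 is a stand-in here; it is an assumption of this sketch.
std::mt19937_64 make_seeded_engine(
        const std::vector<std::uint32_t>& user_entropy = {}) {
    const std::vector<std::uint32_t> words = gather_seed_words(user_entropy);
    std::seed_seq seq(words.begin(), words.end());
    return std::mt19937_64(seq);
}
```

Since the thread already shows pcg64 being constructed from a seed source, the same std::seed_seq could presumably be handed to pcg64 in place of the stand-in engine.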

eike-fokken commented 3 years ago

Very much so. To also answer my implicit other question: one can pass a number of uint32_t values to the constructor of auto_seed_256 to build the seed. That way one gets a deterministic PRNG.
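The same principle, fixed seed words in and an identical stream out, can be checked with the standard library. In this minimal sketch std::seed_seq and std::mt19937_64 stand in for auto_seed_256 and pcg64, and first_draw is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <random>

// Returns the first value from an engine seeded with the given fixed words.
// std::seed_seq and std::mt19937_64 are stand-ins (an assumption of this
// sketch), not the randutils or PCG types discussed in the thread.
std::uint64_t first_draw(std::initializer_list<std::uint32_t> seed_words) {
    std::seed_seq seq(seed_words);  // deterministic: depends only on the words
    std::mt19937_64 rng(seq);
    return rng();
}
```

Calling first_draw twice with the same words yields the same value on every run, which is the property a reproducible simulation needs.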

imneme commented 3 years ago

Back when I posted about randutils.hpp on reddit, there were various valid critiques of the auto_seeded class. It's not quite as portable as I'd like. Six years on, I might do some things a bit differently, but it's still probably good enough for your needs.

It's worth remembering that auto_seeded is derived from seed_seq_fe, so if you want reproducible results you can call param to get equivalent initialization parameters that you could pass to seed_seq_fe directly (via the constructor or its seed method).
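The save-and-replay pattern described here can be illustrated with std::seed_seq, which also offers size() and param(). This is a sketch under assumptions: std::seed_seq stands in for seed_seq_fe (it returns the original seed words, whereas the comment above says seed_seq_fe returns equivalent parameters), std::mt19937_64 stands in for a PCG engine, and saved_params and replay are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <random>
#include <vector>

// Copies the parameters out of a seed sequence so they can be logged and
// reused later. std::seed_seq is a stand-in for randutils::seed_seq_fe.
std::vector<std::uint32_t> saved_params(const std::seed_seq& seq) {
    std::vector<std::uint32_t> params;
    params.reserve(seq.size());
    seq.param(std::back_inserter(params));
    return params;
}

// Rebuilds a seed sequence from saved parameters and seeds an engine with it.
std::mt19937_64 replay(const std::vector<std::uint32_t>& params) {
    std::seed_seq seq(params.begin(), params.end());
    return std::mt19937_64(seq);
}
```

An engine seeded from the original sequence and one produced by replay(saved_params(original)) generate the same stream, so logging the parameters once is enough to rerun a simulation.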