imneme / pcg-cpp

PCG — C++ Implementation
Apache License 2.0
735 stars 98 forks source link

At least in pcg32 first number of stream is always the same #85

Open timoscs opened 1 year ago

timoscs commented 1 year ago

The apparent reason is that when the engines output_previous template parameter is true, the previous internal state generated with the previous increment is returned first. This is at least unintuitive, and I'd even consider it a bug. Maybe set_stream should bump the internal state when output_previous is true (though I have to admit I don't really see the purpose of outputting the previous state in any case, but I'm sure I'm just missing something?).

Example code:

pcg32 generator1{};
pcg32 generator2(generator1);
generator1.set_stream(0xF00BA8);
generator2.set_stream(0xBAA8F00); 
// Uncomment to "Pump out" the stored internal state and make the assert pass
// generator1();
// generator2();
assert(generator1() != generator2()); // FAILS!

Also, if I use stream numbers 1 and 2, or for example 73, 146 or such simply generated seeds, the first two numbers generated are identical, but I suppose that's what's described shortly in https://www.pcg-random.org/posts/critiquing-pcg-streams.html in the STOP PRESS parentheses statement. I'd be more than happy to read a slightly longer discussion of how to safely initialize a number of streams, initially just given consecutive integers as instance numbers?

tbxfreeware commented 1 month ago

You nailed it. output_previous is what causes this.

Note, however, that the two generators in your example both use the same seed. Rather than using different seeds to generate different sequences, you are using identical seeds, but different streams. A sounder approach would be to use different seeds and different streams.

Three obvious workarounds occur to me.

  1. Use a different seed in every generator.
  2. Use the same seed in every generator (if you must), but then cycle each generator a "small" number of times, after you set its stream.
  3. Define a new engine, named pcg32_current, which outputs the current state, rather than the previous state.

To accomplish number 3, note that the definition for type alias pcg32 takes place at the file-level, but references entities that are defined in two different namespaces.

// From `pcg_random.hpp`:
namespace pcg_detail
{
template <typename xtype, typename itype,
         template<typename XT,typename IT> class output_mixin,
         bool output_previous = (sizeof(itype) <= 8),
         template<typename IT> class multiplier_mixin = default_multiplier>
using setseq_base = engine<xtype, itype,
                         output_mixin<xtype, itype>, output_previous,
                         specific_stream<itype>,
                         multiplier_mixin<itype> >;
}
namespace pcg_engines
{
    using namespace pcg_detail;
    typedef setseq_base<uint32_t, uint64_t, xsh_rr_mixin>  setseq_xsh_rr_64_32;
}
// File-level, "global" scope
typedef pcg_engines::setseq_xsh_rr_64_32        pcg32;

Using these as our guide, we can create a new type alias for PCG engine pcg32_current, an engine in which output_previous is set to false.

#pragma once
// pcg32_current.hpp
#include "pcg_random.hpp"
namespace pcg_detail
{
    // `setseq_base_current` is identical to `setseq_base` from 
    // the file `pcg_random.hpp`, except for one thing:
    // template parameter `output_previous` is set to `false`.

    template <typename xtype, typename itype,
        template<typename XT, typename IT> class output_mixin,
        bool output_previous = false,   // <--------------------------------- false
        template<typename IT> class multiplier_mixin = default_multiplier>
    using setseq_base_current = engine<xtype, itype,
        output_mixin<xtype, itype>, output_previous,
        specific_stream<itype>,
        multiplier_mixin<itype> >;
}

using pcg32_current = pcg_detail::
setseq_base_current<uint32_t, uint64_t, pcg_detail::xsh_rr_mixin>;

// end file: pcg32_current.hpp

Here is the test routine I ran:

#include "pcg32_current.hpp"
// ...
void demo_pcg32_current(std::ostream& log)
{
    log <<
        "===========================\n"
        " Demonstrate pcg32_current \n"
        "===========================\n\n";
    pcg32 p;
    pcg32_current e1, e2;
    std::streamsize const w{ 17 };
    log << "Default constructed, no seed or stream selected.\n"
        << "Note that `pcg32` trails `pcg32_current` by 1 cycle.\n\n"
        << std::setw(w) << "pcg32"         << std::setw(w+1) << "pcg32_current\n"
        << std::setw(w) << "-------------" << std::setw(w+1) << "-------------\n";
    for (int n{ 5 }; n--;) {
        log << std::setw(w) << p()
            << std::setw(w) << e1()
            << '\n';
    }
    log << "\n\n";
    // Set seed and stream simultaneously.
    std::uint64_t const default_seed{ 0xcafef00dd15ea5e5ULL };
    e1.seed(default_seed, 0xF00BA8);
    e2.seed(default_seed, 0xBAA8F00);
    log << "Same seed, different streams\n"
        "Note that output is different in each stream, even on the first call.\n\n"
        << std::setw(w) << "pcg32_current" << std::setw(w+1) << "pcg32_current\n"
        << std::setw(w) << "0xF00BA8"      << std::setw(w+1) << "0xBAA8F00\n"
        << std::setw(w) << "-------------" << std::setw(w+1) << "-------------\n";
    for (int n{ 5 }; n--;) {
        log << std::setw(w) << e1()
            << std::setw(w) << e2()
            << '\n';
    }
    log << "\n\n";
}

And here is the output. Although this test uses the same seed for both engines, in practice, I would use a different seed in each engine.

===========================
 Demonstrate pcg32_current
===========================

Default constructed, no seed or stream selected.
Note that `pcg32` trails `pcg32_current` by 1 cycle.

            pcg32    pcg32_current
    -------------    -------------
        676697322        420258633
        420258633       3418632178
       3418632178       3595600211
       3595600211       3265791279
       3265791279        257272927

Same seed, different streams
Note that output is different in each stream, even on the first call.

    pcg32_current    pcg32_current
         0xF00BA8        0xBAA8F00
    -------------    -------------
       4240486865       3590338339
         76144786       2690060698
       2851464140       2389237566
       3446770307       2123762214
       2676135859        552064717