Open Darendis opened 1 year ago
This is expected. The implementation of operator<< contains the following (include/pcg_random.hpp:592). Most PCG configurations don't modify the multiplier or increment, so the first part of the output will always be the same.
out << rng.multiplier() << space
    << rng.increment() << space
    << rng.state_;
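To make the "same prefix" effect concrete, here is a minimal sketch that does not use pcg_random.hpp at all. `toy_engine` and `serialized_prefix` are hypothetical names, and the two constants are only illustrative values for a multiplier and an increment. The point is simply that when two engines share those two values, everything before the last field of their serialized text is identical.

```cpp
#include <cassert>   // used by the checks in the usage note
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for an engine: only the three serialized fields.
struct toy_engine {
    std::uint64_t multiplier;
    std::uint64_t increment;
    std::uint64_t state;
};

// Same field order as the snippet quoted above: multiplier, increment, state.
std::ostream& operator<<(std::ostream& out, toy_engine const& e) {
    return out << e.multiplier << ' ' << e.increment << ' ' << e.state;
}

// Returns everything before the last space, i.e. "multiplier increment".
std::string serialized_prefix(toy_engine const& e) {
    std::ostringstream oss;
    oss << e;
    std::string const text{ oss.str() };
    return text.substr(0, text.rfind(' '));
}
```

Two `toy_engine` values that differ only in `state` yield the same `serialized_prefix`, which is the behavior reported in this issue.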
O'Neill's implementations of operator>> and operator<< allow one to read and write states between incompatible PCG engines. So long as their multipliers match, for instance, you can write the state from a pcg32_fast engine, and read it back into a pcg32 engine.
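The acceptance rule can be written as a single predicate. This is a sketch of the rule as described in this thread, not code from pcg_random.hpp: `toy_reader` and `accepts` are hypothetical names, and the reader is reduced to the three facts the rule depends on.

```cpp
#include <cassert>   // used by the checks in the usage note
#include <cstdint>

// Hypothetical summary of the reading engine; not a type from pcg_random.hpp.
struct toy_reader {
    std::uint64_t multiplier;           // value returned by multiplier()
    std::uint64_t increment;            // value returned by increment()
    bool          has_specific_stream;  // true when stream_mixin is specific_stream
};

// True when the extraction described above would be accepted:
// the multipliers match, and either the reader can change its stream
// or the increments match as well.
constexpr bool accepts(toy_reader const& reader,
                       std::uint64_t written_multiplier,
                       std::uint64_t written_increment) {
    return written_multiplier == reader.multiplier
        && (reader.has_specific_stream || written_increment == reader.increment);
}
```

Note that a reader with `has_specific_stream == true` accepts a written increment of 0, which is exactly the pcg32_fast to pcg32 case discussed here.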
This is problematic for several reasons that are listed below, in comments taken from my "annotated" version of pcg_random.hpp.
In the case mentioned above, it is also buggy.
A pcg32_fast engine writes 0 as its increment. This can be read by a pcg32 engine because it allows a variable increment, and does not check the new_increment it reads (even though it should!). The pcg32 engine then right-shifts the new_increment to convert it into a stream, and calls function set_stream to install it. The exact call is rng.set_stream(new_increment >> 1). When the new_increment is 0, however, new_increment >> 1 is also 0, so this call is the same as rng.set_stream(0).
The final step is to install the new stream by left-shifting, and or-ing in the required 1. Here is the code:
void set_stream(itype specific_seq)
{
    inc_ = (specific_seq << 1) | 1;
}
This has the effect of setting inc_ to 1.
Whoops! We read the increment as 0, and we installed the increment as 1! That's a bug.
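The arithmetic can be checked in isolation. The sketch below assumes a 64-bit increment and folds the two steps (the right shift done on input, then the shift-and-or done by set_stream) into one hypothetical helper, `installed_increment`. It is not part of pcg_random.hpp.

```cpp
#include <cassert>   // used by the checks in the usage note
#include <cstdint>

// Hypothetical helper: the increment that ends up installed after
// reading `new_increment` from a stream.
constexpr std::uint64_t installed_increment(std::uint64_t new_increment) {
    std::uint64_t const stream = new_increment >> 1;  // argument passed to set_stream
    return (stream << 1) | 1u;                        // value set_stream stores in inc_
}

// An increment of 0 is silently turned into 1.
static_assert(installed_increment(0) == 1, "0 is installed as 1");

// Odd increments survive the round trip unchanged.
static_assert(installed_increment(1442695040888963407ULL) == 1442695040888963407ULL,
              "odd increments are preserved");
```

Any even value is affected in the same way: the low bit is dropped by the right shift and then forced to 1, so `installed_increment(4)` is 5.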
The fix is to require, for specific_stream only, that (new_increment & 1u) == 1u. (For other stream_mixins, new_increment must match the existing increment.) A better fix would be to disallow reading and writing between dissimilar PCG engines.
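A sketch of the first fix, using a toy engine rather than the real pcg32. `toy_setseq_engine` and `try_read` are hypothetical, and the constants are illustrative. The only point is the extra `(new_increment & 1u) == 1u` check that sets failbit before anything is installed.

```cpp
#include <cassert>   // used by the checks in the usage note
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical engine with a settable stream; not the real pcg32.
struct toy_setseq_engine {
    static constexpr std::uint64_t multiplier() { return 6364136223846793005ULL; }
    std::uint64_t inc_   = 1442695040888963407ULL;
    std::uint64_t state_ = 0;
    void set_stream(std::uint64_t specific_seq) { inc_ = (specific_seq << 1) | 1u; }
};

// Extraction that rejects a mismatched multiplier or an even increment.
std::istream& operator>>(std::istream& in, toy_setseq_engine& rng) {
    std::uint64_t new_multiplier{}, new_increment{}, new_state{};
    if (!(in >> new_multiplier >> new_increment >> new_state))
        return in;                                   // stream already failed
    bool const ok = new_multiplier == toy_setseq_engine::multiplier()
                 && (new_increment & 1u) == 1u;      // the proposed check
    if (!ok) {
        in.setstate(std::ios_base::failbit);         // leave rng untouched
        return in;
    }
    rng.set_stream(new_increment >> 1);
    rng.state_ = new_state;
    return in;
}

// Convenience wrapper: true when `text` was read successfully into `rng`.
bool try_read(std::string const& text, toy_setseq_engine& rng) {
    std::istringstream iss{ text };
    return static_cast<bool>(iss >> rng);
}
```

With this check, `try_read("6364136223846793005 0 42", e)` fails and leaves `e` unchanged, instead of quietly installing an increment of 1.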
I have written stream I/O functions that correct these problems, and a few others as well. If you would like to see them, let me know.
Here are my notes from the annotated file:
//======================================================================
// Stream Insertion and Extraction Operators
//======================================================================
// Notes by tbxfreeware:
// For the purposes of I/O stream insertion and extraction, the "state" of
// a PCG engine is comprised of three elements:
//
// 1. multiplier()
// 2. increment()
// 3. state_
//
// For a given PCG engine, operator<< writes all three to the output
// stream.
//
// For a completely different engine, constructed perhaps with a
// different xtype, itype, output_mixin, output_previous,
// multiplier_mixin, and/or stream_mixin, operator>> can successfully
// input the "state" under these conditions:
//
// 1. Its multiplier() member function returns the same value as the
// value read from the input stream.
//
// 2. Either of two things is true:
// a. Its stream_mixin is specific_stream, or
// b. Its increment() member function returns the same value as the
// value read from the input stream.
//
// There are problems with this approach.
//
// O'Neill defines stream I/O operators as though their job were the
// input and output of LCG state. It is not. Their job is to input and
// output the state of a PCG engine. PCG state is not the same thing
// as LCG state.
//
// Suppose we are given two PCG engines, e1 and e2. When the state of
// engine e1 is written to an I/O stream, and is then read by engine e2,
// the two engines should be placed into the same state. They should
// compare as equal, and they should produce the same sequence of
// random variates when operator() is called. If these things cannot
// be made to happen, then the read operation should fail.
//
// What I'm saying here is that xtype, itype, output_mixin, and
// output_previous, at least, are part of the state of a PCG engine.
// Matching multiplier() values is also part of the state,
// and unless the stream_mixin is specific_stream, so is the need
// to have matching increment() values.
//
// As for xtype and itype, we already know that they are unsigned
// integral types, so it is enough to record their sizes as part of
// the state.
//
// output_mixin and stream_mixin could have "id" or "tag" fields added
// to their structs to identify them. Type int would do the job;
// an enumeration constant, more elegantly.
//
// Suppose these "global" enumeration classes were defined in namespace pcg:
//
// namespace pcg
// {
//     enum class stream_model : int
//     {
//         specific, oneseq, unique, no_stream
//     };
//     enum class output_model : int
//     {
//         xsh_rs, xsh_rr, rxs, rxs_m_xs, rxs_m, dxsm, xsl_rr, xsl_rr_rr, xsh, xsl
//     };
// }
//
// Each mixin could then define a function that returned an identifying
// enumeration constant, for example:
//
// class no_stream
// {
// public:
//     static constexpr auto stream_model() { return pcg::stream_model::no_stream; }
//     ...
// };
// class xsh_rs_mixin
// {
// public:
//     static constexpr auto output_model() { return pcg::output_model::xsh_rs; }
//     ...
// };
//
// You could not directly read or write the enumeration constants, but
// they could easily be cast to type int for purposes of stream I/O
// operations when reading or writing the state.
//
// Finally, the boolean output_previous could also be converted to int
// when reading or writing state.
//
// In conclusion, there are eight fields that need to be written to
// record the state of a PCG engine:
//
// 1. sizeof(xtype)
// 2. sizeof(itype)
// 3. static_cast<int>(stream_mixin::stream_model())
// 4. static_cast<int>(output_mixin::output_model())
// 5. static_cast<int>(output_previous)
// 6. +multiplier() // unary plus forces integral promotion
// 7. +increment() // obviating the need for custom operator<<
// 8. +state_
//
// When the state is input, items 6-8 should be read into temporary
// variables of type uintmax_t or pcg128_t, thus obviating the
// need for a custom operator>>.
//
// Successful input should require matching items 1-6 exactly,
// and sometimes item 7 (for all except stream_model::specific).
//
// For stream_model::specific, require (increment & 1u) == 1u,
// thus trapping the bug identified above.
//
// For stream_model::no_stream, fail when the two low-order bits
// of item 8 are not both 1, i.e., require (state & 3u) == 3u.
//======================================================================
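The eight-field layout from these notes can be sketched without touching pcg_random.hpp. Everything below is hypothetical: `engine_record`, `write_record`, and `read_record` are illustrative names, the traits that would come from template parameters are stored as plain data members, and only 64-bit values are handled (the pcg128_t case is left out).

```cpp
#include <cassert>   // used by the checks in the usage note
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

enum class stream_model : int { specific, oneseq, unique, no_stream };
enum class output_model : int {
    xsh_rs, xsh_rr, rxs, rxs_m_xs, rxs_m, dxsm, xsl_rr, xsl_rr_rr, xsh, xsl
};

// Hypothetical description of one engine: its traits plus its runtime values.
struct engine_record {
    std::size_t   xtype_size;       // item 1
    std::size_t   itype_size;       // item 2
    stream_model  stream;           // item 3
    output_model  output;           // item 4
    bool          output_previous;  // item 5
    std::uint64_t multiplier;       // item 6
    std::uint64_t increment;        // item 7
    std::uint64_t state;            // item 8
};

// Writes all eight fields, space separated, in the order listed in the notes.
std::string write_record(engine_record const& e) {
    std::ostringstream out;
    out << e.xtype_size << ' ' << e.itype_size << ' '
        << static_cast<int>(e.stream) << ' ' << static_cast<int>(e.output) << ' '
        << static_cast<int>(e.output_previous) << ' '
        << e.multiplier << ' ' << e.increment << ' ' << e.state;
    return out.str();
}

// Reads a record into `e`. Returns false, leaving `e` untouched, on any mismatch.
bool read_record(std::string const& text, engine_record& e) {
    std::istringstream in{ text };
    std::size_t xs{}, is{};
    int sm{}, om{}, prev{};
    std::uint64_t mult{}, inc{}, state{};
    if (!(in >> xs >> is >> sm >> om >> prev >> mult >> inc >> state))
        return false;
    // Items 1-6 must match exactly.
    if (xs != e.xtype_size || is != e.itype_size
        || sm != static_cast<int>(e.stream) || om != static_cast<int>(e.output)
        || prev != static_cast<int>(e.output_previous) || mult != e.multiplier)
        return false;
    // Item 7: odd for stream_model::specific, an exact match otherwise.
    if (e.stream == stream_model::specific) {
        if ((inc & 1u) != 1u) return false;
    } else if (inc != e.increment) {
        return false;
    }
    // Item 8: stream_model::no_stream needs both low-order bits set.
    if (e.stream == stream_model::no_stream && (state & 3u) != 3u)
        return false;
    e.increment = inc;
    e.state = state;
    return true;
}
```

Under this scheme, a record written by an engine described as `no_stream`/`xsh_rs` is rejected by a reader described as `specific`/`xsh_rr`, because items 3 and 4 differ, so the increment-0 case never reaches the stream-setting step.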
Here is a short program that verifies the bug described above.
It is a complete program, so you should be able to copy, paste, and compile without any need to fiddle. I had my compiler set to C++14.
#include <cassert>
#include <iostream>
#include <sstream>
#include <type_traits>
#include "pcg_random.hpp"
template< typename charT, typename traits >
bool check_whether_mcg_state_can_be_read_by_pcg32(
    std::basic_ostream<charT, traits>& log)
{
    using state_type = typename pcg32::state_type;
    enum : state_type { zero, one };
    pcg32 e;
    pcg32_fast e_fast;
    std::stringstream sst;
    sst << e_fast;
    sst >> e;
    auto const pass1{ !sst.fail() };
    assert(pass1);
    sst.clear();
    sst.str("");
    sst << e;
    state_type multiplier{}, increment{}, state{};
    sst >> multiplier >> increment >> state;
    auto const pass2{ increment == one };
    assert(pass2);
    log << "Can the state of pcg32_fast be read in by pcg32?"
        "\nAnswer: " << (pass1 ? "yes\n\n" : "no\n\n");
    if (pass1)
    {
        log << "Does that change the increment from 0 on pcg32_fast "
            "to 1 on pcg32?"
            "\nAnswer: " << (pass2 ? "yes\n\n" : "no\n\n");
    }
    return pass1 && pass2;
}
int main()
{
    auto& log{ std::cout };
    return check_whether_mcg_state_can_be_read_by_pcg32(log) ? 0 : 1;
}
// end file: main.cpp
Thank you @tbxfreeware, that was a very informative write-up.
I have a situation where I need to seed PCG from a random_device, perform some work, and, if an investigation is needed later on, be able to regenerate the same randomness that was used for that unit of work.
The following demo code does that and verifies the numbers generated are identical once the rng is reseeded - all that is good or at least seems "ok".
What I'm sort of confused with is the following:
Are the above two situations normal or expected behavior?
Example code: