imneme / pcg-cpp

PCG — C++ Implementation
Apache License 2.0

Results from separate seed() and set_stream() differ from two-argument seed() #91

Open rstub opened 7 months ago

rstub commented 7 months ago

When I seed and set the stream separately I get a different result than if I do that with a single call. Is that expected?

Example code:

#include <iostream>
#include "pcg_random.hpp"

int main(void) {
  pcg64 rng;
  rng.seed(20240409);
  std::cout << rng << std::endl;
  rng.set_stream(90404202);
  std::cout << rng << std::endl;
  rng.seed(20240409, 90404202);
  std::cout << rng << std::endl;
  return 0;
}

Output:

47026247687942121848144207491837523525 117397592171526113268558934119004209487 232695090976826000214660542883178056279
47026247687942121848144207491837523525 180808405 232695090976826000214660542883178056279
47026247687942121848144207491837523525 180808405 324486960932314774260885781527366674683

I would have expected the second and third line to be identical. While that is true for the stream indicator (second number), this is not the case for the RNG state (third number).

tbxfreeware commented 2 months ago

The behavior you observe is normal. It is caused by the "conditioning" process used to randomize the seeds you give to PCG.

Seeds are used to set the state of the linear congruential generator (LCG) that lies within PCG. The seeds, however, are not simply copied into the state variable. First, they are conditioned or stirred, a process which modifies them. The "conditioned" seed is what gets installed into the LCG.

pcg32, pcg64, and every other PCG engine except the "fast" ones (such as pcg32_fast and pcg64_fast) use the same conditioning equation. Effectively, it is this:

state_ = (seed + increment()) * multiplier() + increment();

But increment() is just the stream(), in disguise.

increment() == (stream() << 1) | 1u;
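The two formulas above can be modeled with plain 64-bit arithmetic. This is a minimal sketch, not the library's code: `increment_from_stream`, `condition`, and `kMultiplier` are hypothetical names, the multiplier is assumed to be the one used by engines with 64-bit state, and pcg64 itself uses 128-bit state, so the numbers will not match the output in the question.

```cpp
#include <cstdint>

// Assumed multiplier for a 64-bit LCG state; illustrative only.
constexpr std::uint64_t kMultiplier = 6364136223846793005ULL;

// Turn a stream id into an LCG increment: shift left, force the low bit on,
// so the increment is always odd.
constexpr std::uint64_t increment_from_stream(std::uint64_t stream) {
    return (stream << 1) | 1u;
}

// The conditioning step: the increment enters twice, so the same seed
// generally stirs to a different state under a different stream.
// Unsigned arithmetic wraps modulo 2^64.
constexpr std::uint64_t condition(std::uint64_t seed, std::uint64_t increment) {
    return (seed + increment) * kMultiplier + increment;
}
```

With stream 90404202, `increment_from_stream` gives 180808405, which is the second number printed on the last two output lines in the question.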

When you call rng.seed(20240409);, PCG stirs the seed using the stream that was set by the default constructor.

pcg64 rng;            // default constructor chooses "default" stream
rng.seed(20240409);   // function `seed` stirs, using the default stream

When you call rng.seed(20240409, 90404202);, PCG goes through a two-step process.

  1. First, it installs the new stream.
  2. Then, it stirs the seed using the stream that was just installed.

This explains why the resulting states are different.
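The difference between the two call sequences can be reproduced with a toy engine. This is only a sketch under stated assumptions: `ToyEngine` and `kToyMultiplier` are hypothetical, the state is 64 bits wide rather than pcg64's 128, the default constants are assumed, and the one-argument `seed` here keeps whatever stream is installed instead of resetting it.

```cpp
#include <cstdint>

// Assumed 64-bit LCG multiplier; illustrative only.
constexpr std::uint64_t kToyMultiplier = 6364136223846793005ULL;

// Toy stand-in for a PCG engine: just the increment and the LCG state.
struct ToyEngine {
    std::uint64_t inc = 1442695040888963407ULL;  // assumed default increment
    std::uint64_t state = 0;

    // One-argument seed: stir with the increment that is currently installed.
    void seed(std::uint64_t s) { state = (s + inc) * kToyMultiplier + inc; }

    // set_stream: replace the increment and leave the state untouched.
    void set_stream(std::uint64_t stream) { inc = (stream << 1) | 1u; }

    // Two-argument seed: install the stream first, then stir with it.
    void seed(std::uint64_t s, std::uint64_t stream) {
        set_stream(stream);
        seed(s);
    }
};
```

Calling `seed(20240409)` and then `set_stream(90404202)` on one `ToyEngine` leaves the state stirred with the default increment, while `seed(20240409, 90404202)` on another stirs with the new one. The two objects end up with the same `inc` but different `state`, which is the pattern in the second and third output lines.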

If you reverse the order, calling set_stream before seed, the two states should match. In practice they do not, because of a bug described in Issue #94: with pcg32 and pcg64, any stream you have selected is discarded when you call either seed() or seed(itype). So if you set the stream first, that setting is lost as soon as you set the seed.

rstub commented 2 months ago

Thanks for the explanation. I am surprised by the order of the steps taken when rng.seed(20240409, 90404202) is called. I would have expected the reverse order, i.e. first setting the seed and then the stream. The fact that the current stream is used to condition the seed is an argument for the opposite order, though. It would be great if this could be clearly documented.

tbxfreeware commented 2 months ago

> I am surprised by the order of the processes taken when rng.seed(20240409, 90404202) is called. I would have expected the reverse order, i.e. first setting the seed and then the stream.

There is a technical reason why the stream is set before the seed. Stream functions are implemented via a stream_mixin, which is a base class of a PCG engine. Before the body of an engine's constructor runs, all of its base-class subobjects are fully constructed. Stirring happens after that, in the constructor for the engine.

Now, you may be thinking, "What does construction have to do with calling function seed?"

The short answer is this: When you call one of the seed functions, PCG tears down the whole house, and builds a new one at the same address, using the "placement-new" syntax of operator new. When you call function seed, it calls the constructor, and lets the constructor handle the seeding.
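The pattern can be sketched in a few lines. `ToyStreamMixin`, `ToyRng`, and `kMult` are hypothetical stand-ins rather than the library's classes, the 64-bit constants are assumed, and using stream 0 as the default is a simplification.

```cpp
#include <cstdint>
#include <new>

// Assumed 64-bit LCG multiplier; illustrative only.
constexpr std::uint64_t kMult = 6364136223846793005ULL;

// Hypothetical mixin that owns the increment, standing in for a stream mixin.
struct ToyStreamMixin {
    std::uint64_t inc;
    explicit ToyStreamMixin(std::uint64_t stream = 0)
        : inc((stream << 1) | 1u) {}
};

struct ToyRng : ToyStreamMixin {
    std::uint64_t state;

    // The base class is fully constructed before `state` is initialised,
    // so the stirring always sees the stream passed to this constructor.
    explicit ToyRng(std::uint64_t seed_value = 0, std::uint64_t stream = 0)
        : ToyStreamMixin(stream),
          state((seed_value + inc) * kMult + inc) {}

    // Reseed by rebuilding the whole object in place with placement new.
    // A call that omits the stream therefore falls back to the default one.
    void seed(std::uint64_t seed_value, std::uint64_t stream = 0) {
        new (this) ToyRng(seed_value, stream);
    }
};
```

Because `seed` reruns the constructor, a `ToyRng` reseeded with `(20240409, 90404202)` is indistinguishable from one constructed with those arguments, and a later `seed(20240409)` puts the increment back to its default.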

For more detail, see Issue #94.