mfasiolo / mvnfast

Fast methods for multivariate normal distributions
http://mfasiolo.github.io/mvnfast/
33 stars 9 forks source link

Seed generation issue? #7

Open mattfidler opened 6 years ago

mattfidler commented 6 years ago

I was looking at your code, and noticed

NumericVector seeds = runif(ncores, 1.0, std::numeric_limits<uint32_t>::max());

Have you considered https://www.johndcook.com/blog/2016/01/29/random-number-generator-seed-mistakes/

Perhaps instead:

NumericVector seeds(ncores);
seeds[ncores-1] = runif(1, 1.0, std::numeric_limits<uint32_t>::max());
for (unsigned int j = ncores-1; j--;){
   seeds[j] = seeds[j+1]-1;
   if (seeds[j] == 0) seeds[j] = std::numeric_limits<uint32_t>::max()-1;
}

While it may not be quite as fast as your solution, it can avoid the birthday problem

mattfidler commented 6 years ago

This would reduce the probability of having similar seeds for each core, though with the size of the uint32 and the number of cores people typically have, this still would be low.

mfasiolo commented 6 years ago

Hi Matt, thanks for pointing this out. I wonder whether the best option would be depending directly on the sitmo package which might be setting the seed in a more sophisticated way (I haven't checked). If we believe the first approximation to the birthday problem given in Wikipedia, than the probability of a collision when using a 1000 cores is around

1 - exp(-1000^2/(2*(2^32)))
[1] 0.0001164085

with a more realistic 64 cores we have O(10^-7) probability of collision. I'll think about it.

mattfidler commented 6 years ago

I looked at the examples in the sitmo package and they have all the seeds specified by the user for each core. I don't think they actually do anything for this problem. I could be wrong.

mattfidler commented 6 years ago

As I said, the probability is low. However, the probability with 1000 cores using the sequential solution is 0.

mattfidler commented 6 years ago

Wouldn't the probabily increase as you make multiple calls to the mvnfast? It generates a seed from the uniform distribution every time it starts the mvnfast number generation. Still be low, but it isn't only based on the number of threads.

mfasiolo commented 6 years ago

Hi Matt,

ok, hopefully this 5094d5a25f00f505e0f472b6d2990a1f218db099 addresses your concerns.

mattfidler commented 6 years ago

Indeed it is better. Now it only depends on how many times you call mvnfast.

The only way to completely overcome it is to create your own seed setting routine or reading the R seed's state and using the R seed (I think that would work...?)

According to https://github.com/wch/r-source/blob/3105967d568ce6e411c63a0cbecabaec695b5cc0/src/library/base/man/Random-user.Rd

You could read the seed state by:

user_unif_nseed

and using the R's seed as sitmo's seed. You have to advance this seed somehow probably by a throw-away uniform random number.

mattfidler commented 6 years ago

If you are interested in this approach, it wouldn't be too hard to adapt...?

mattfidler commented 6 years ago

Also, I have heard that the construct

for (unsigned int j = ncores-1; j--;){
   seeds[j] = seeds[j+1]-1;
   if (seeds[j] == 0) seeds[j] = std::numeric_limits<uint32_t>::max()-1;
}

Takes slightly less operations than a full for loop, and therefore should be slightly faster. On modern machines, it shouldn't matter too much, but that is why I chose this loop style.

mattfidler commented 6 years ago

Now the reference:

https://www.thegeekstuff.com/2015/01/c-cpp-code-optimization/

Item 7 talks about the for-loop optimization above.

rstub commented 1 year ago

Interesting problem. One alternative could be to use the same seed value on all cores but pre-set the internal counter. For example, in dqrng I use

template<>
inline void random_64bit_wrapper<sitmo::threefry_20_64>::seed(result_type seed, result_type stream) {
    gen.seed(seed);
    gen.set_counter(0, 0, 0, stream);
    cache = false;
}

This way I get 2^64 streams with 3 64bit integers per-stream for counting.

mattfidler commented 1 year ago

I do something similar for parallel solving in rxode2