Closed by AmosEgel 1 month ago
Update
As a workaround, I am trying the following: each thread creates its own XorShift64Star random number generator, using the i-th element of the seeds array as the seed, where i is the thread index. The kernel reads:
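    static void HistogramKernel(Index1D index, ArrayView1D<uint, Stride1D.Dense> seeds, int numNumbers, ArrayView1D<int, Stride1D.Dense> view)
    {
        // Each thread owns a private generator seeded from its own slot in the seeds array.
        XorShift64Star rng = new XorShift64Star(seeds[index]);
        int numBins = (int)view.Length;
        for (int i = 0; i < numNumbers; i++)
        {
            float rand = rng.NextFloat();
            int histoIdx = (int)(numBins * rand);
            // Guard against the product rounding up to numBins.
            if (histoIdx == numBins) histoIdx -= 1;
            // Several threads may hit the same bin, so the increment must be atomic.
            Atomic.Add(ref view[histoIdx], 1);
        }
    }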
This strategy seems to do what I need. I still don't understand what I was doing wrong in the original attempt, though - so any explanation or correction would still be highly welcome.
By the way, I can imagine that the above workaround is not ideal regarding run time performance. However, that may not be a big problem, because the actual program that we want to run on the GPU does heavy computations, so the overhead of creating more RNG instances than necessary might not be significant in the end.
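For anyone who wants to experiment with the per-thread seeding idea without a GPU, here is a minimal CPU-only sketch. It does not use ILGPU at all: the class `PerThreadRngSketch` and its methods are hypothetical names made up for illustration, the generator is a hand-written xorshift64* step (not ILGPU's XorShift64Star implementation), and the replacement of a zero seed is my own assumption, since an xorshift state must not be zero.

```csharp
using System;

// CPU-only sketch: every simulated "thread" owns a private xorshift64* state.
// All names here are hypothetical; this is not ILGPU code.
public static class PerThreadRngSketch
{
    // One xorshift64* step (shifts 12, 25, 27, followed by a multiplication).
    public static ulong Next(ref ulong state)
    {
        state ^= state >> 12;
        state ^= state << 25;
        state ^= state >> 27;
        return state * 0x2545F4914F6CDD1DUL;
    }

    // Maps the top 24 bits of the output to a float in [0, 1).
    public static float NextFloat(ref ulong state) =>
        (Next(ref state) >> 40) * (1.0f / (1 << 24));

    // Mirrors the kernel above: one generator per seed, numNumbers draws each.
    public static int[] Histogram(uint[] seeds, int numNumbers, int numBins)
    {
        var bins = new int[numBins];
        for (int thread = 0; thread < seeds.Length; thread++)
        {
            // Assumption: a zero seed is replaced, because the state must be non-zero.
            ulong state = seeds[thread] == 0 ? 1UL : seeds[thread];
            for (int i = 0; i < numNumbers; i++)
            {
                int idx = (int)(numBins * NextFloat(ref state));
                if (idx == numBins) idx -= 1; // same guard as in the kernel
                bins[idx]++;
            }
        }
        return bins;
    }
}
```

Calling `PerThreadRngSketch.Histogram(seeds, numNumbers, numBins)` with a seeds array filled from `System.Random` gives a histogram whose bin counts sum to `seeds.Length * numNumbers`. On the CPU the loop is sequential, so a plain increment stands in for `Atomic.Add`.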
Hi @AmosEgel, welcome to the ILGPU community! I apologize for the delayed response - most of our team was off during the past weeks. Indeed, the RNG<T> was never designed to be used from AutoGrouped kernels. To address your use case, you may want to take a look at ThreadWiseRNG, available in 2.0beta1. This should solve your problems.
Great, thanks a lot for the explanation and the hint to ThreadWiseRNG.
Question
First of all, thanks for providing and maintaining this beautiful and extra-helpful package!
I am trying to generate a histogram of random numbers. It works well as long as the total number of threads does not exceed a certain limit. For large numbers of threads, the resulting histogram no longer shows an even distribution. I suspect that I am doing something wrong in the way I construct and use the random number generator. I would appreciate any hint as to what I am doing wrong.
Environment
Additional context
With the following kernel I try to write random numbers into a histogram:
The kernel is called with this method:
For numThreads = 10000 and numBins = 100, I get the following result (pretty much as expected):
However, when increasing the number of threads to e.g. numThreads = 100000, the result looks like this:
The distribution is no longer even.
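To put a number on "no longer even" once the histogram has been copied back to the host, one simple option is the largest relative deviation of any bin from the mean bin count. The sketch below is only one possible measure. `HistogramCheck` and `MaxRelativeDeviation` are hypothetical names, not part of ILGPU, and the code assumes a non-empty histogram with at least one sample.

```csharp
using System;

// Hypothetical helper for judging how even a histogram is.
public static class HistogramCheck
{
    // Returns the largest |count - mean| / mean over all bins.
    // Assumes bins is non-empty and contains at least one sample.
    public static double MaxRelativeDeviation(int[] bins)
    {
        long total = 0;
        foreach (int count in bins) total += count;

        double expected = (double)total / bins.Length;
        double worst = 0.0;
        foreach (int count in bins)
            worst = Math.Max(worst, Math.Abs(count - expected) / expected);
        return worst;
    }
}
```

A perfectly flat histogram gives 0. For uniform random input the value should shrink as the number of samples per bin grows, so a value that stays large when numThreads increases points to a problem in the generator setup rather than to statistical noise.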