As part of initializing QUDA's hypercubic RNG, it needs to compute the global lattice index corresponding to each local site. This is currently accumulated into an int32: https://github.com/lattice/quda/blob/develop/include/kernels/random_init.cuh#L49

For sufficiently large global volumes (such as MILC's 192^3x384 configurations, which have about 2.7 billion sites against an int32 maximum of about 2.1 billion), this can overflow, leading to identical streams across different global lattice sites. The high-level solution is to promote this value to a uint64, but that will require further changes under the hood.
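To make the size argument concrete, here is a minimal sketch of a global site index accumulated in 64-bit arithmetic. It is not QUDA's actual code: the helper names `global_index` and `fits_in_int32` are hypothetical, and the lexicographic ordering with x running fastest is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not QUDA's implementation): lexicographic global site
// index for coordinates x[] on a lattice with global extents X[], with x[0]
// running fastest. Every step of the accumulation is done in uint64_t.
constexpr std::uint64_t global_index(const int x[4], const int X[4])
{
  std::uint64_t idx = 0;
  for (int d = 3; d >= 0; d--) {
    idx = idx * static_cast<std::uint64_t>(X[d]) + static_cast<std::uint64_t>(x[d]);
  }
  return idx;
}

// Hypothetical helper: true if the index is representable as a signed 32-bit
// integer, i.e. it would have survived an int32 accumulator.
constexpr bool fits_in_int32(std::uint64_t idx)
{
  return idx <= static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}
```

For global extents {192, 192, 192, 384}, the last site {191, 191, 191, 383} gets index 192^3 * 384 - 1 = 2717908991, for which `fits_in_int32` returns false. Whatever consumes the index downstream (for example, the RNG sequence argument) would also need to accept a 64-bit value, which is presumably where the further changes come in.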