Open jxy opened 2 years ago
Actually, if you cast the intermediate multiplications, you can avoid UB, and get back the RNG sequence you want. Perhaps this is the right fix.
diff --git a/generic/ranstuff.c b/generic/ranstuff.c
index dfd3816a..2fea6697 100644
--- a/generic/ranstuff.c
+++ b/generic/ranstuff.c
@@ -72,21 +72,21 @@ void initialize_prn(double_prn *prn_pt, int seed, int index) {
node0_printf("Type long long less than 8 bytes on this machine. Rand problems?\n");
exit(1);
}
- seed = (69607+8*index)*seed+12345;
+ seed = (long long)(69607+8*index)*seed+12345;
prn_pt->r0 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (long long)(69607+8*index)*seed+12345;
prn_pt->r1 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (long long)(69607+8*index)*seed+12345;
prn_pt->r2 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (long long)(69607+8*index)*seed+12345;
prn_pt->r3 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (long long)(69607+8*index)*seed+12345;
prn_pt->r4 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (long long)(69607+8*index)*seed+12345;
prn_pt->r5 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (long long)(69607+8*index)*seed+12345;
prn_pt->r6 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (long long)(69607+8*index)*seed+12345;
prn_pt->ic_state = seed;
prn_pt->multiplier = 100000005 + 8*index;
prn_pt->addend = 12345;
These look to be some of the same issues I found with address sanitizer on MILC: https://github.com/milc-qcd/milc_qcd/issues/37
Actually my last patch is still wrong, because long long still may overflow in that mult-add. It's much better just cast it into unsigned int.
diff --git a/generic/ranstuff.c b/generic/ranstuff.c
index dfd3816a..8ead7feb 100644
--- a/generic/ranstuff.c
+++ b/generic/ranstuff.c
@@ -72,21 +72,21 @@ void initialize_prn(double_prn *prn_pt, int seed, int index) {
node0_printf("Type long long less than 8 bytes on this machine. Rand problems?\n");
exit(1);
}
- seed = (69607+8*index)*seed+12345;
+ seed = (unsigned int)(69607+8*index)*seed+12345;
prn_pt->r0 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (unsigned int)(69607+8*index)*seed+12345;
prn_pt->r1 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (unsigned int)(69607+8*index)*seed+12345;
prn_pt->r2 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (unsigned int)(69607+8*index)*seed+12345;
prn_pt->r3 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (unsigned int)(69607+8*index)*seed+12345;
prn_pt->r4 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (unsigned int)(69607+8*index)*seed+12345;
prn_pt->r5 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (unsigned int)(69607+8*index)*seed+12345;
prn_pt->r6 = (seed>>8) & 0xffffff;
- seed = (69607+8*index)*seed+12345;
+ seed = (unsigned int)(69607+8*index)*seed+12345;
prn_pt->ic_state = seed;
prn_pt->multiplier = 100000005 + 8*index;
prn_pt->addend = 12345;
Does the simple cast to unsigned fix the underlying issue you're seeing @jxy ?
Yes. The unsigned ops preserve the original math modulo 2^32 and give the same random number sequences as gcc without the cast.
Optimization in clang will give you different sequence of random numbers on x86_64. I have it open at https://bugs.llvm.org/show_bug.cgi?id=52100 We'll see what the compiler developers say.
At the mean time, I have a brute force workaround for clang and icx.