milc-qcd / milc_qcd

MILC collaboration code for lattice QCD calculations
Other
34 stars 33 forks source link

signed integer overflow UB and integer conversion with clang #48

Open jxy opened 2 years ago

jxy commented 2 years ago

Optimization in clang will give you different sequence of random numbers on x86_64. I have it open at https://bugs.llvm.org/show_bug.cgi?id=52100 We'll see what the compiler developers say.

At the mean time, I have a brute force workaround for clang and icx.

diff --git a/generic/ranstuff.c b/generic/ranstuff.c
index dfd3816a..dc1e4e93 100644
--- a/generic/ranstuff.c
+++ b/generic/ranstuff.c
@@ -87,7 +87,7 @@ void initialize_prn(double_prn *prn_pt, int seed, int index) {
     seed = (69607+8*index)*seed+12345;
     prn_pt->r6 = (seed>>8) & 0xffffff;
     seed = (69607+8*index)*seed+12345;
-    prn_pt->ic_state = seed;
+    prn_pt->ic_state = seed & 0x80000000ULL ? 0xffffffff00000000ULL | seed : 0x00000000ffffffffULL & seed;
     prn_pt->multiplier = 100000005 + 8*index;
     prn_pt->addend = 12345;
     prn_pt->scale = 1.0/((Real)0x1000000);
jxy commented 2 years ago

Actually, if you cast the intermediate multiplications, you can avoid UB, and get back the RNG sequence you want. Perhaps this is the right fix.

diff --git a/generic/ranstuff.c b/generic/ranstuff.c
index dfd3816a..2fea6697 100644
--- a/generic/ranstuff.c
+++ b/generic/ranstuff.c
@@ -72,21 +72,21 @@ void initialize_prn(double_prn *prn_pt, int seed, int index) {
       node0_printf("Type long long less than 8 bytes on this machine. Rand problems?\n");
       exit(1);
     }
-    seed = (69607+8*index)*seed+12345;
+    seed = (long long)(69607+8*index)*seed+12345;
     prn_pt->r0 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (long long)(69607+8*index)*seed+12345;
     prn_pt->r1 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (long long)(69607+8*index)*seed+12345;
     prn_pt->r2 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (long long)(69607+8*index)*seed+12345;
     prn_pt->r3 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (long long)(69607+8*index)*seed+12345;
     prn_pt->r4 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (long long)(69607+8*index)*seed+12345;
     prn_pt->r5 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (long long)(69607+8*index)*seed+12345;
     prn_pt->r6 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (long long)(69607+8*index)*seed+12345;
     prn_pt->ic_state = seed;
     prn_pt->multiplier = 100000005 + 8*index;
     prn_pt->addend = 12345;
maddyscientist commented 2 years ago

These look to be some of the same issues I found with address sanitizer on MILC: https://github.com/milc-qcd/milc_qcd/issues/37

jxy commented 2 years ago

Actually my last patch is still wrong, because long long still may overflow in that mult-add. It's much better just cast it into unsigned int.

diff --git a/generic/ranstuff.c b/generic/ranstuff.c
index dfd3816a..8ead7feb 100644
--- a/generic/ranstuff.c
+++ b/generic/ranstuff.c
@@ -72,21 +72,21 @@ void initialize_prn(double_prn *prn_pt, int seed, int index) {
       node0_printf("Type long long less than 8 bytes on this machine. Rand problems?\n");
       exit(1);
     }
-    seed = (69607+8*index)*seed+12345;
+    seed = (unsigned int)(69607+8*index)*seed+12345;
     prn_pt->r0 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (unsigned int)(69607+8*index)*seed+12345;
     prn_pt->r1 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (unsigned int)(69607+8*index)*seed+12345;
     prn_pt->r2 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (unsigned int)(69607+8*index)*seed+12345;
     prn_pt->r3 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (unsigned int)(69607+8*index)*seed+12345;
     prn_pt->r4 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (unsigned int)(69607+8*index)*seed+12345;
     prn_pt->r5 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (unsigned int)(69607+8*index)*seed+12345;
     prn_pt->r6 = (seed>>8) & 0xffffff;
-    seed = (69607+8*index)*seed+12345;
+    seed = (unsigned int)(69607+8*index)*seed+12345;
     prn_pt->ic_state = seed;
     prn_pt->multiplier = 100000005 + 8*index;
     prn_pt->addend = 12345;
maddyscientist commented 2 years ago

Does the simple cast to unsigned fix the underlying issue you're seeing @jxy ?

jxy commented 2 years ago

Yes. The unsigned ops preserve the original math modulo 2^32 and give the same random number sequences as gcc without the cast.