jbush001 / NyuziToolchain

Port of LLVM/Clang C compiler to Nyuzi parallel processor architecture

__sync_lock_release generates incorrect code #97

Closed jbush001 closed 6 years ago

jbush001 commented 6 years ago

It generates the following loop, which stores the value it just loaded back to memory and therefore never modifies the lock:

.LBB0_1: 
    load_sync s1, (s0)
    move s2, s1
    store_sync s1, (s0)  
    bz s1, .LBB0_1

jbush001 commented 6 years ago

The LLVM IR is:

  store atomic i32 0, i32* %lk release, align 4
  ret i32 undef
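
For context, here is a minimal C sketch of the kind of source that leads to a release store like the one above. The function names (`acquire`, `release`, `demo`) and the `lock_word` variable are hypothetical and do not come from the original test case; only the `__sync_lock_test_and_set` and `__sync_lock_release` builtins are taken as given.

```c
/* Hypothetical spinlock word: 0 means unlocked, 1 means locked. */
static volatile int lock_word;

/* Spin until the previous value was 0, i.e. until we took the lock. */
static void acquire(volatile int *lk) {
    while (__sync_lock_test_and_set(lk, 1))
        ;
}

/* Expected to write 0 to *lk with release semantics. With the bug
 * described in this issue, the generated code leaves *lk unchanged. */
static void release(volatile int *lk) {
    __sync_lock_release(lk);
}

/* Returns 1 if the lock word is 1 while held and 0 after release. */
static int demo(void) {
    acquire(&lock_word);
    int held = lock_word;
    release(&lock_word);
    return held == 1 && lock_word == 0;
}
```

On a correct toolchain `demo()` returns 1. With the miscompiled release, the lock word would stay at 1 and a second `acquire` would spin forever.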

jbush001 commented 6 years ago

Because the action for ISD::ATOMIC_STORE is expand, it gets turned into an AtomicSwap:

Legalizing: t4: ch = AtomicStore<Volatile ST4[%lk]> t0, t2, Constant:i32<0>
Trying to expand node
Succesfully expanded node
 ... replacing: t4: ch = AtomicStore<Volatile ST4[%lk]> t0, t2, Constant:i32<0>
     with:      t9: i32,ch = AtomicSwap<Volatile ST4[%lk]> t0, t2, Constant:i32<0>

This has a custom emitter, which calls the common EmitAtomicBinary.

  case Nyuzi::ATOMIC_SWAP:
    return EmitAtomicBinary(MI, BB, 0);

When the third parameter (the opcode) is zero, EmitAtomicBinary doesn't emit the swap, but erroneously just copies the old value:

  if (Opcode != 0) {
    ....
  } else
    NewValue = OldValue; // This is just swap: use old value
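
The effect can be modeled in plain C. This is only a single-threaded sketch, not the backend code: `load_sync` and `store_sync` here are hypothetical stand-ins that always succeed, and `swap_buggy` / `swap_expected` are illustrative names. The first mirrors `NewValue = OldValue`, the second stores the incoming operand instead.

```c
#include <stdbool.h>

/* Toy stand-ins for the synchronized load/store pair. In this model
 * the store always succeeds, so each loop body runs exactly once. */
static int load_sync(const int *addr) { return *addr; }
static bool store_sync(int *addr, int value) { *addr = value; return true; }

/* Mirrors the erroneous path: the value written back is the old value,
 * so memory is never changed and the incoming operand is ignored. */
static int swap_buggy(int *addr, int incoming) {
    int old_value, new_value;
    (void)incoming;
    do {
        old_value = load_sync(addr);
        new_value = old_value;
    } while (!store_sync(addr, new_value));
    return old_value;
}

/* What a swap is expected to do: write the incoming operand and
 * return the value that was there before. */
static int swap_expected(int *addr, int incoming) {
    int old_value, new_value;
    do {
        old_value = load_sync(addr);
        new_value = incoming;
    } while (!store_sync(addr, new_value));
    return old_value;
}
```

Calling `swap_buggy(&x, 0)` with `x == 1` leaves `x` at 1, which matches the lock never being released; `swap_expected(&x, 0)` sets it to 0.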

jbush001 commented 6 years ago

The issue can be reproduced by fixing the test, which currently checks for the incorrect code:

--- a/test/CodeGen/Nyuzi/atomics.ll
+++ b/test/CodeGen/Nyuzi/atomics.ll
@@ -127,8 +127,9 @@ define i32 @atomic_xchg(i32* %ptr, i32 %value) { ; CHECK-LABEL: atomic_xchg:

   ; CHECK: load_sync [[OLDVAL:s[0-9]+]], (s0)
   ; CHECK: move s{{[0-9]+}}, [[OLDVAL]]
-  ; CHECK: store_sync [[OLDVAL]], (s0)
-  ; CHECK: bz [[OLDVAL]],
+  ; CHECK: move [[NEWVAL:s[0-9]+]], s1
+  ; CHECK: store_sync [[NEWVAL]], (s0)
+  ; CHECK: bz [[NEWVAL]],

   ret i32 %tmp
 }

jbush001 commented 6 years ago

Also, using load_sync/store_sync is overkill for this. A store/membar would do the job more efficiently.
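
As a rough source-level illustration of the store/membar idea (not the backend change itself), a release can be written as a fence followed by an ordinary store, with no retry loop. The function name `release_with_fence` is hypothetical; `__atomic_thread_fence` and `__atomic_store_n` are the GCC/Clang atomic builtins.

```c
/* Release a lock with a barrier plus a plain store.
 * The fence orders earlier accesses before the store of 0;
 * no load/store retry loop is involved. */
static void release_with_fence(volatile int *lk) {
    __atomic_thread_fence(__ATOMIC_RELEASE);
    __atomic_store_n(lk, 0, __ATOMIC_RELAXED);
}
```

How this maps onto Nyuzi instructions depends on the backend lowering, which is not shown here.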

jbush001 commented 6 years ago

I didn't do the work to make this just be a store. I'll file another bug for that.