RV does not support LLVM atomicrmw yet (https://llvm.org/docs/LangRef.html#atomicrmw-instruction). Currently, RV's lane threads aren't considered concurrent threads in terms of the LLVM execution model and so atomicrmw remains scalar.
What needs to change: The result of a vectorized atomicrmw must always be treated as a varying value. Beyond that, this is mostly an RV codegen issue (NatBuilder.cpp).
When the backend vectorizes an atomic instruction, it should apply the atomic's own operator (add, umin, umax, xor, ...) to reduce the value vector to a scalar and emit just one atomicrmw with the reduced value.
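The reduce-then-update idea can be modeled in plain C++. This is only a scalar sketch of the semantics, not RV or NatBuilder code: `vectorAtomicAdd` is a hypothetical name, `std::atomic` stands in for the memory location, and a `std::vector` stands in for the operands of the active lanes.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Scalar model of a vectorized `atomicrmw add` (hypothetical helper, not RV API).
// `lanes` holds the operand of each active lane.
inline std::uint32_t vectorAtomicAdd(std::atomic<std::uint32_t> &mem,
                                     const std::vector<std::uint32_t> &lanes) {
  // Assumption: with no active lanes, no atomic should be emitted at all.
  assert(!lanes.empty() && "expects at least one active lane");

  // Reduce the value vector with the atomic's own operator (here: add).
  const std::uint32_t reduced =
      std::accumulate(lanes.begin(), lanes.end(), std::uint32_t{0});

  // Emit exactly one atomic update with the reduced value.
  // The returned old value is the same for all lanes (uniform).
  return mem.fetch_add(reduced, std::memory_order_seq_cst);
}
```

For umin, umax, or xor the same shape applies, with the reduction operator and the atomic update swapped accordingly.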
Two things are tricky about atomicrmw:
Fairness - who "wins" in a vector xchg? RV gives no fairness or even liveness guarantees for lane threads.
The result (vector) value - what should the returned vector contain? The backend will need to emit a prefix-sum-like operation over the value vector to simulate the incrementally updated value that each lane would have observed.
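A possible answer to both questions is to pretend the lanes executed one after another in lane order. That serialization is an assumption made for this sketch, not something RV guarantees. Under it, the per-lane result of an add is the old memory value plus an exclusive prefix sum of the lane operands, and in an xchg the last lane "wins". The function names below are hypothetical and the code is a scalar model, not NatBuilder output.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-lane results of a vectorized `atomicrmw add`, assuming lane-order
// serialization: lane i observes old + lanes[0] + ... + lanes[i-1].
inline std::vector<std::uint32_t>
vectorAtomicAddResults(std::atomic<std::uint32_t> &mem,
                       const std::vector<std::uint32_t> &lanes) {
  assert(!lanes.empty() && "expects at least one active lane");
  std::vector<std::uint32_t> result(lanes.size());
  std::uint32_t running = 0;
  for (std::size_t i = 0; i < lanes.size(); ++i) {
    result[i] = running;  // exclusive prefix sum of the operands
    running += lanes[i];  // after the loop: the fully reduced value
  }
  // One atomic update with the reduced value, then offset every prefix.
  const std::uint32_t old = mem.fetch_add(running);
  for (auto &r : result) r += old;
  return result;
}

// Per-lane results of a vectorized `atomicrmw xchg` under the same assumed
// order: the last lane's operand ends up in memory, lane 0 observes the old
// memory value, and lane i observes the operand of lane i-1.
inline std::vector<std::uint32_t>
vectorAtomicXchgResults(std::atomic<std::uint32_t> &mem,
                        const std::vector<std::uint32_t> &lanes) {
  assert(!lanes.empty() && "expects at least one active lane");
  std::vector<std::uint32_t> result(lanes.size());
  result[0] = mem.exchange(lanes.back());
  for (std::size_t i = 1; i < lanes.size(); ++i) result[i] = lanes[i - 1];
  return result;
}
```

Any other fixed lane order would be equally valid as far as the atomic itself is concerned; lane order is chosen here only because it maps directly onto a prefix scan.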