Open jacobly0 opened 1 week ago
Looks like this is related to some incorrectly cached analyses: it only reproduces with `-passes='default<O3>'`, but not with `-passes=dse` and the module just before DSE.
OK, it just requires `globals-aa`: https://llvm.godbolt.org/z/eehYaTKcc
As `@sync` doesn't access `@non_atomic`, I think the transformation is correct, but maybe @efriedma-quic or @nikic have additional thoughts?
I'll just point out this comment, which suggests to me that the atomic store in `@sync` should act as a barrier to the optimization.
https://github.com/llvm/llvm-project/blob/c46a95c147d8ba86980908353d377f9e2f9f5641/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp#L22-L23
Hm, right: with the atomic store put in place of the call to `@sync`, the optimization gets blocked. I guess `globals-aa` may need to be more conservative, or we need to check `nosync` for calls.
Can this be fixed by adding `@sync` to DSEBarrier?
A store is deleted even though it happens-before a read of that value in another thread, which in turn happens-before the killer store, through the following happens-before chain:

1. `store i8 1, ptr @non_atomic` in `@thread_a` (deleted store)
2. `store atomic i8 1, ptr @atomic seq_cst` in `@thread_a` → `@sync`
3. `%2 = load atomic i8, ptr @atomic seq_cst` that loads `1` in `@thread_b`
4. `%5 = load i8, ptr @non_atomic` in `@thread_b` (misoptimization loads `0` instead of `1`)
5. `store atomic i8 0, ptr @atomic seq_cst` in `@thread_b`
6. `%2 = load atomic i8, ptr @atomic seq_cst` that loads `0` in `@thread_a` → `@sync`
7. `store i8 2, ptr @non_atomic` in `@thread_a` (killer store)
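For readers who find the chain easier to follow in source form, here is a rough C++ analogue of it. This is only a sketch: the IR module from the godbolt link is not reproduced here, so the spin loops, the function names (`sync_with_b`, `run_handshake`) and the variable names are assumptions chosen to mirror the seven steps above, not the original reproducer. The globals deliberately have external linkage, so this is not expected to trigger the miscompile itself; it only shows why the program is race-free and why `thread_b` must observe `1`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-ins for the IR globals @non_atomic and @atomic.
std::uint8_t non_atomic = 0;
std::atomic<std::uint8_t> atomic_byte{0};

// Assumed shape of @sync: publish 1, then wait until the other thread
// resets the byte to 0. It never touches non_atomic.
void sync_with_b() {
    atomic_byte.store(1, std::memory_order_seq_cst);            // step 2
    while (atomic_byte.load(std::memory_order_seq_cst) != 0) {  // step 6
        std::this_thread::yield();
    }
}

void thread_a() {
    non_atomic = 1;  // step 1: the store that DSE deleted
    sync_with_b();
    non_atomic = 2;  // step 7: the killer store
}

std::uint8_t thread_b() {
    while (atomic_byte.load(std::memory_order_seq_cst) != 1) {  // step 3
        std::this_thread::yield();
    }
    std::uint8_t seen = non_atomic;                    // step 4
    atomic_byte.store(0, std::memory_order_seq_cst);   // step 5
    return seen;
}

// Runs both threads once and returns what thread_b read from non_atomic.
int run_handshake() {
    non_atomic = 0;
    atomic_byte.store(0, std::memory_order_seq_cst);
    std::uint8_t seen = 0;
    std::thread b([&seen] { seen = thread_b(); });
    std::thread a(thread_a);
    a.join();
    b.join();
    // Sanity check: thread_b reset the byte before thread_a could finish.
    assert(atomic_byte.load(std::memory_order_seq_cst) == 0);
    return seen;
}
```

Under the C++ memory model, the plain read in step 4 is ordered after the store in step 1 and before the store in step 7, so `run_handshake()` should return `1`; removing the first store would make it return `0`, which is the behavior reported above.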