Closed Quuxplusone closed 9 years ago
Attached loadrace.zip
(32790 bytes, application/x-zip-compressed): source and IR files
Both functions have unsynchronized access to a and are thus racy.
In the source program
x = 1 => x.store(1,std::memory_order_seq_cst) and
x == 1 => x.load(std::memory_order_seq_cst) == 1
The R_sc(X,1) in readA() reads-from W_sc(X,1) in writeA() which results in
synchronization and W_na(a,42) happens-before R_na(a,42), so the two accesses
to a are ordered.
If the (x==1) test fails, then W(x) and R(x) are not synchronized, and in that
case R_na(a) does not take place.
Thus the source program is race-free.
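For readers without the attachment, here is a minimal sketch of the pattern described above. It is not the code from loadrace.zip: the initial values, the -1 sentinel, and the run_once() helper are assumptions made for illustration; only the names a, x, writeA, and readA come from this thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int a = 0;                // non-atomic payload
std::atomic<int> x{0};    // atomic flag, accessed with seq_cst

void writeA() {
    a = 42;                                    // W_na(a,42)
    x.store(1, std::memory_order_seq_cst);     // W_sc(x,1)
}

int readA() {
    int r1 = -1;                               // sentinel: flag not yet seen
    if (x.load(std::memory_order_seq_cst) == 1) {  // R_sc(x,1)
        r1 = a;                                // R_na(a), only reached after reading x == 1
    }
    return r1;
}

// Hypothetical driver: runs one writer and one reader concurrently.
int run_once() {
    a = 0;
    x.store(0, std::memory_order_seq_cst);
    int result = 0;
    std::thread writer(writeA);
    std::thread reader([&result] { result = readA(); });
    writer.join();
    reader.join();
    return result;
}
```

Under the argument above, run_once() can only return -1 (the flag was not yet set, so a was never read) or 42 (the flag was set, so the write to a is visible).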
(In reply to comment #2)
> The R_sc(X,1) in readA() reads-from W_sc(X,1) in writeA() which results in
> synchronization
That is not correct; no synchronization happens here.
> and W_na(a,42) happens-before R_na(a,42)
No, there is no happens-before relation here.
If you reopen this again, please provide an explanation of why you think there
is a happens-before relation here, referencing the rules in [intro.multithread]
in the C++ standard to justify the steps in your explanation.
Reference
----------
ISO/IEC 14882:2011 Programming Language C++
Link: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3242.pdf
> The R_sc(X,1) in readA() reads-from W_sc(X,1) in writeA() which results in
> synchronization
>> That is not correct; no synchronization happens here.
29.3 [atomics.order], page 1116
From paragraph 1:
"memory_order_release, memory_order_acq_rel, and memory_order_seq_cst: a store
operation performs a release operation on the affected memory location."
From paragraph 2:
"An atomic operation A that performs a release operation on an atomic object M
synchronizes with an atomic operation B that performs an acquire operation on M
and takes its value from any side effect in the release sequence headed by A."
Given these, the R_sc(x,1) in readA() reads-from W_sc(x,1) in writeA(), which
results in synchronization.
> and W_na(a,42) happens-before R_na(a,42)
>> No, there is no happens-before relation here.
1.10 [intro.multithread], pages 13-14
"
11 An evaluation A inter-thread happens before an evaluation B if
— A synchronizes with B, or
— A is dependency-ordered before B, or
— for some evaluation X
    — A synchronizes with X and X is sequenced before B, or
    — A is sequenced before X and X inter-thread happens before B, or
    — A inter-thread happens before X and X inter-thread happens before B.
"
"
12 An evaluation A happens before an evaluation B if:
— A is sequenced before B, or
— A inter-thread happens before B
"
In our example, x=1 synchronizes with (sw) x==1; hence x=1 happens-before (hb)
x==1.
Also, a=42 is sequenced before (sb) x=1 in writeA(), and in readA() x==1 is
sequenced before r1=a.
Hence a=42 ->(sb) x=1 ->(sw) x==1 ->(sb) r1=a, which means a=42 ->(hb) r1=a, and
the program is not racy.
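The chain above can also be checked mechanically. The following is a rough sketch that encodes the four evaluations and applies the quoted rules for inter-thread happens before as a fixed-point computation. The event names and helper functions are made up for this illustration, and the dependency-ordered-before rule is deliberately left out because this example does not need it.

```cpp
#include <array>
#include <cassert>

// Hypothetical labels for the four evaluations discussed above.
enum Event { W_a = 0, W_x = 1, R_x = 2, R_a = 3, kNumEvents = 4 };
using Relation = std::array<std::array<bool, kNumEvents>, kNumEvents>;

// Computes "inter-thread happens before" from sequenced-before (sb) and
// synchronizes-with (sw), iterating the quoted rules until nothing changes.
Relation inter_thread_happens_before(const Relation& sb, const Relation& sw) {
    Relation ithb = sw;  // rule: A synchronizes with B
    bool changed = true;
    while (changed) {
        changed = false;
        for (int a = 0; a < kNumEvents; ++a) {
            for (int b = 0; b < kNumEvents; ++b) {
                if (ithb[a][b]) continue;
                for (int x = 0; x < kNumEvents; ++x) {
                    bool derived = (sw[a][x] && sb[x][b]) ||    // A sw X, X sb B
                                   (sb[a][x] && ithb[x][b]) ||  // A sb X, X ithb B
                                   (ithb[a][x] && ithb[x][b]);  // A ithb X, X ithb B
                    if (derived) {
                        ithb[a][b] = true;
                        changed = true;
                        break;
                    }
                }
            }
        }
    }
    return ithb;
}

// happens before = sequenced before, or inter-thread happens before.
bool happens_before(Event a, Event b) {
    Relation sb{};            // all false
    Relation sw{};            // all false
    sb[W_a][W_x] = true;      // a=42 is sequenced before x=1 in writeA()
    sb[R_x][R_a] = true;      // x==1 is sequenced before r1=a in readA()
    sw[W_x][R_x] = true;      // the store to x synchronizes with the load that reads 1
    Relation ithb = inter_thread_happens_before(sb, sw);
    return sb[a][b] || ithb[a][b];
}
```

With these edges, happens_before(W_a, R_a) holds and happens_before(R_a, W_a) does not, which matches the sb/sw/sb chain given above.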
> If you reopen this again, please provide an explanation of why you think there
> is a happens-before relation here, referencing the rules in [intro.multithread]
> in the C++ standard to justify the steps in your explanation.
Apologies for misreading / misunderstanding, and thanks for your persistence. Yes, assuming writeA is only called once and readA is called with flag == false, this code is race-free, and LLVM's transformation is incorrect.
I agree with comment #0 that it is the load-combining within Early CSE that broke the code: load-combining should notionally reorder the loads so they are adjacent before removing them, and doing so would move the second load of 'a' to before the load-acquire of x. Reordering a load before an acquire is not a permissible transformation.
(I don't think there is an important semantic gap between LLVM and the C++11 memory model here; there's just a bug in this pass.)
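To make the reordering problem concrete, here is a hedged, single-threaded sketch of what goes wrong when the load of a is moved before the acquire. It is not the attached source or IR and does not reproduce what Early CSE emits. The function names and the `between` callback, which stands in for the writer thread running at one particular point, are assumptions made for illustration. In a real two-threaded run the hoisted form would be a data race and therefore undefined behaviour; the simulation only shows one bad outcome that the reordering allows.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

int a = 0;                // non-atomic payload
std::atomic<int> x{0};    // atomic flag

void reset() {
    a = 0;
    x.store(0, std::memory_order_seq_cst);
}

void writeA() {
    a = 42;
    x.store(1, std::memory_order_seq_cst);
}

// Source order: load the flag first, read a only if the flag was 1.
// `between` simulates the writer running after the flag load.
int read_in_order(const std::function<void()>& between) {
    int flag = x.load(std::memory_order_seq_cst);
    between();
    return flag == 1 ? a : -1;
}

// Hoisted order: a is read before the flag load, which is the reordering
// described above as impermissible.
// `between` simulates the writer running after the early read of a.
int read_hoisted(const std::function<void()>& between) {
    int early = a;        // may observe the value from before the writer ran
    between();
    return x.load(std::memory_order_seq_cst) == 1 ? early : -1;
}
```

If writeA is passed as `between` after a reset(), read_in_order returns -1 because it saw the flag unset, while read_hoisted returns the stale 0 even though it saw the flag set. The second result is impossible for the source program.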