For instance, say we convince the compiler to emit this logic:
initial state: x = 0, y = 1

THREAD 1        THREAD 2
y = 3;          if x == 1 {
x = 1;              y *= 2;
                }
Ideally this program has 2 possible final states:
y = 3: (thread 2 did the check before thread 1 completed)
y = 6: (thread 2 did the check after thread 1 completed)
However there's a third potential state that the hardware enables:
y = 2: (thread 2 saw x = 1, but not y = 3, and then overwrote y = 3)
I think the case of y = 2 is impossible.
Assume core 1 executes thread 1 and core 2 executes thread 2.
In the y = 2 case, if thread 2 saw x = 1, we can conclude that thread 1 must already have executed y = 3. I think this means it wrote y = 3 to the cache line that contains y and invalidated the copies of that cache line held by other cores. Then, when thread 2 reads y, its copy of the cache line has been invalidated, so it has to request the cache line from core 1, which already holds y = 3. The result should therefore be y = 6 rather than 2.
I am not sure whether my reasoning is correct, since I know only a little about cache coherence protocols.
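For concreteness, here is roughly how the quoted program could be written in Rust. This is only a sketch under assumptions: x and y are modelled as AtomicUsize, the function name run_once is made up for illustration, and the Release/Acquire pair on x is my own choice, not something the quoted text uses. As far as I understand, with that pairing the language guarantees that a thread which observes x == 1 also observes y == 3, so only 3 or 6 can come out. With Relaxed on x the language would give no such guarantee, whatever a particular cache protocol happens to do.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Runs the two-thread program once and returns the final value of y.
/// (Hypothetical helper, not taken from the quoted chapter.)
fn run_once() -> usize {
    // initial state: x = 0, y = 1
    let x = Arc::new(AtomicUsize::new(0));
    let y = Arc::new(AtomicUsize::new(1));

    // THREAD 1: y = 3; x = 1;
    let (x1, y1) = (Arc::clone(&x), Arc::clone(&y));
    let t1 = thread::spawn(move || {
        y1.store(3, Ordering::Relaxed);
        // Release: the store to y above may not be reordered after this store.
        x1.store(1, Ordering::Release);
    });

    // THREAD 2: if x == 1 { y *= 2; }
    let (x2, y2) = (Arc::clone(&x), Arc::clone(&y));
    let t2 = thread::spawn(move || {
        // Acquire: if this load sees x == 1, the y = 3 store is visible as well.
        if x2.load(Ordering::Acquire) == 1 {
            let old = y2.load(Ordering::Relaxed);
            y2.store(old * 2, Ordering::Relaxed);
        }
    });

    t1.join().unwrap();
    t2.join().unwrap();

    // Both threads have been joined, so this load sees their final writes.
    y.load(Ordering::Relaxed)
}
```

Calling run_once() in a loop should only ever return 3 or 6. Note that a test like this can show that an outcome occurs, but it cannot prove that y = 2 is impossible when weaker orderings are used.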
The passage quoted above is from the chapter on atomics.