Closed henry-hsieh closed 1 year ago
The definition of fence.i in the Unpriv spec addresses your question (specifically the last sentence below):
The FENCE.I instruction is used to synchronize the instruction and data streams. RISC-V does not guarantee that stores to instruction memory will be made visible to instruction fetches on a RISC-V hart until that hart executes a FENCE.I instruction. A FENCE.I instruction ensures that a subsequent instruction fetch on a RISC-V hart will see any previous data stores already visible to the same RISC-V hart.
In a RISC-V hart compliant with this RISC-V instruction (i.e. that implements the instruction as specified), the CBO is never necessary and the FENCE.I is always sufficient.
Conversely, what you are positing is a non-compliant RISC-V hart (wrt its fence.i implementation) - in which case all bets are off.
Note that the new (still in development) I/D Consistency architecture extension will provide instructions to do what you want in a system that lacks coherency between I and D (and hence presumably doesn't implement fence.i properly).
Thanks! I missed that the second fence.i of my example should also flush the D-cache in a system without coherency between I and D. The cases should be updated as follows.
1. cbo.inval before fence.i: The instruction fetch will see MEM[0x1000] = 0, because the value 1 exists only in the data cache and is invalidated.
2. cbo.inval after fence.i: The instruction fetch will see MEM[0x1000] = 1, because the value 1 is flushed out of the data cache by fence.i.
3. cbo.clean before fence.i: The instruction fetch will see MEM[0x1000] = 1, because the value 1 is written to shared memory by cbo.clean.
4. cbo.clean after fence.i: The instruction fetch will see MEM[0x1000] = 1, because the value 1 is flushed out of the data cache by fence.i.
5. cbo.flush before fence.i: The instruction fetch will see MEM[0x1000] = 1, because the value 1 is written to shared memory by cbo.flush.
6. cbo.flush after fence.i: The instruction fetch will see MEM[0x1000] = 1, because the value 1 is flushed out of the data cache by fence.i.

The order between cbo.clean or cbo.flush and fence.i is irrelevant. However, the order between cbo.inval and fence.i affects what the instruction fetch sees. I'm aware that the initial purpose of cbo.inval is to observe data written by non-coherent agents. Intentionally writing a value and then invalidating it is a little suspicious in software. Or it's simply that the CMO task group doesn't want to support CMO management instructions for instruction fetch (fence.i) and table entry updates (sfence.vma), even though the data may be altered by the CMO management instructions.
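The six updated cases can be checked against a toy model. The sketch below is only an illustration under stated assumptions, not the RISC-V specification: it models a single hart, one memory word at 0x1000, one write-back D-cache line and one I-cache line. It assumes that fence.i writes back dirty data before invalidating the I-cache, and that cbo.inval discards dirty data. The class name ToyHart and the helper run are hypothetical.

```python
# Toy model, not the RISC-V specification: one memory word, one write-back
# D-cache line and one I-cache line, all for the same address.
class ToyHart:
    def __init__(self):
        self.mem = 0          # shared memory: MEM[0x1000] starts at 0
        self.dcache = None    # (value, dirty), or None when the line is absent
        self.icache = None    # cached instruction word, or None

    def store(self, value):
        self.dcache = (value, True)      # write-back: memory is not updated yet

    def _writeback(self):
        if self.dcache is not None and self.dcache[1]:
            self.mem = self.dcache[0]
            self.dcache = (self.dcache[0], False)

    def cbo_clean(self):
        self._writeback()                # write back, keep the line

    def cbo_flush(self):
        self._writeback()                # write back, then drop the line
        self.dcache = None

    def cbo_inval(self):
        self.dcache = None               # assumption: dirty data is discarded

    def fence_i(self):
        self._writeback()                # assumption: fence.i cleans the D-cache
        self.icache = None               # and drops the stale instruction line

    def fetch(self):
        if self.icache is None:
            self.icache = self.mem       # I-cache refills from shared memory only
        return self.icache


def run(cbo, cbo_first):
    """Store 1, then issue the CBO and fence.i in the given order, then fetch."""
    hart = ToyHart()
    hart.fetch()                         # warm the I-cache with the old value 0
    hart.store(1)
    steps = [getattr(hart, cbo), hart.fence_i]
    if not cbo_first:
        steps.reverse()
    for step in steps:
        step()
    return hart.fetch()


results = {
    (cbo, order): run(cbo, order == "before")
    for cbo in ("cbo_inval", "cbo_clean", "cbo_flush")
    for order in ("before", "after")
}
for (cbo, order), value in results.items():
    print(f"{cbo} {order} fence.i: fetch sees {value}")
```

In this model only the cbo.inval-before-fence.i ordering makes the fetch return 0, which matches case 1 of the list; every other ordering returns 1.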
There is a separate arch extension (under the J TG) working to address I/D consistency in both I/D coherent and non-coherent systems - which is why the CMO TG focused on the three extensions it developed.
As for your other questions, note again that a compatible implementation of fence.i ensures that a subsequent instruction fetch on a RISC-V hart will see any previous data stores already visible to the same RISC-V hart. This implies cache operations on any data and instruction caches, as necessary, to satisfy this required property.
And as noted above, fence.i is always sufficient and no form of CBO is ever needed.
But if you are asking what the ordering is, if any, between fence.i and maintenance CBOs, then the answer is as spec'ed, i.e. no ordering. That is not a problem for managing I/D consistency given all the above, since fence.i already takes care of whatever is needed to ensure that subsequent ifetches see all preceding writes.
Thank you, I get your point! There is no need to use CBOs for I/D consistency. The fence.i can take care of everything.
Let's say that there is a system without coherency between I and D. Moreover, the D-cache uses a write-back policy. Consider the following code example:
If fence.i can't order the cbo.* instruction, cbo.* may be executed after fence.i. This may create the following cases (assume no other harts access the address):

1. cbo.inval before fence.i: The instruction fetch will see MEM[0x1000] = 0, because the value 1 exists only in the data cache and is invalidated.
2. cbo.inval after fence.i: The instruction fetch will see MEM[0x1000] = 0, because the value 1 exists only in the data cache.
3. cbo.clean before fence.i: The instruction fetch will see MEM[0x1000] = 1, because the value 1 is written to shared memory.
4. cbo.clean after fence.i: The instruction fetch will see MEM[0x1000] = 0 or 1, depending on whether cbo.clean is performed before the instruction fetch.
5. cbo.flush before fence.i: The instruction fetch will see MEM[0x1000] = 1, because the value 1 is written to shared memory.
6. cbo.flush after fence.i: The instruction fetch will see MEM[0x1000] = 0 or 1, depending on whether cbo.flush is performed before the instruction fetch.

In case 4 or case 6, the instruction fetch may not observe the value that the hart itself wrote.
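The dependence on where the CBO lands can be made concrete with a small simulation. This is a sketch of the scenario as posed in the question, not of the RISC-V specification: it assumes that fence.i only invalidates the I-cache and leaves dirty D-cache data alone, that the D-cache is write-back, and that the I-cache refills from shared memory only. The names apply_cbo, fetched_value and POSITIONS are hypothetical.

```python
# Toy model of the scenario in the question, not the RISC-V specification.
# Assumptions: fence.i only invalidates the I-cache, the D-cache is
# write-back, and the I-cache refills from shared memory only.
POSITIONS = ["before fence.i", "after fence.i, before fetch", "after fetch"]


def apply_cbo(cbo, mem, pending):
    """Apply a CBO; `pending` is the dirty value not yet written to memory."""
    if cbo in ("clean", "flush") and pending is not None:
        mem = pending            # dirty data reaches shared memory
    return mem, None             # inval discards it; otherwise nothing is pending


def fetched_value(cbo, position):
    """Value the instruction fetch sees when the CBO lands at `position`."""
    mem, icache = 0, 0           # MEM[0x1000] = 0 and the I-cache holds that value
    pending = 1                  # the store of 1 stays in the D-cache
    events = ["fence.i", "fetch"]
    events.insert(POSITIONS.index(position), "cbo")
    seen = None
    for event in events:
        if event == "cbo":
            mem, pending = apply_cbo(cbo, mem, pending)
        elif event == "fence.i":
            icache = None        # stale instruction line is dropped
        else:
            if icache is None:
                icache = mem     # refill from shared memory
            seen = icache
    return seen


outcomes = {
    (cbo, pos): fetched_value(cbo, pos)
    for cbo in ("inval", "clean", "flush")
    for pos in POSITIONS
}
for (cbo, pos), value in outcomes.items():
    print(f"cbo.{cbo} {pos}: fetch sees {value}")
```

Under these assumptions cbo.inval always leaves the fetch with 0, while cbo.clean and cbo.flush yield 1 unless they land after the fetch, which is the "0 or 1" outcome of cases 4 and 6.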