is the ptwalk assumed to be cached?

riscv / riscv-test-env

https://jira.riscv.org/browse/RVG-141

Other

42 stars 107 forks source link

is the ptwalk assumed to be cached? #11

Closed neelgala closed 5 years ago

neelgala commented 5 years ago

@aswaterman I don't think that the RISC-V spec mandates that hardware-page table walks should be cached. Am I right? If so, then any updates to the page-table in the vm.c should be followed by a fence to make sure that the subsequent page-walks see the updated entries in the page-table.

While I don't completely understand the working of the vm.c. But doing the following changes got the required result. I am sure there is a better solution to this than what is below.

Change-1:

sret in the following should be preceded by a fence https://github.com/riscv/riscv-test-env/blob/6f31769d091bbd2ef504308633fbe8a870641894/v/entry.S#L77

Change-2 sfence in the following should be followed by a fence

https://github.com/riscv/riscv-test-env/blob/6f31769d091bbd2ef504308633fbe8a870641894/v/vm.c#L25

aswaterman commented 5 years ago

Hi Neel! Since these are effectively* uniprocessor tests, no FENCE instructions should be necessary at all.

The raison d'etre of the SFENCE.VMA instruction is to order page-table updates with respect to subsequent virtual-memory accesses. After the page table is modified, it suffices to execute an SFENCE.VMA. In particular, the following sentence in the spec implies that no additional FENCE instruction is necessary on a uniprocessor: https://github.com/riscv/riscv-isa-manual/blob/master/src/supervisor.tex#L924

The coherence_torture routine does have other threads hammer on the memory system in innocuous ways, but it doesn't interact with the page tables, so is irrelevant here.

neelgala commented 5 years ago

Let me know if I got this right:

Let's consider a uniprocessor system and say I have a TLB and a PTWALK state-machine in my HW design. Now a TLB-miss would initiate a PTWalk. Since the PTWalk works with physical addresses it can directly fetch/load from the bus and can thus bypass the data-cache completely.

From the text you have pointed it would seem, for a such system an SFENCE.VMA would also require flushing the data-cache as well. Am I right in understanding this?

aswaterman commented 5 years ago

Yep - that is exactly the implication of the clause I linked to in the spec.

neelgala commented 5 years ago

Thanks.

aswaterman commented 5 years ago

BTW, the microarchitectural assumption is that the PTW talks to an L2$ that can invalidate from the L1$, or the PTW talks to the L1$. If they are truly incoherent, then yes, the SFENCE.VMA needs to flush the L1$.

It sounds like your implementation of FENCE is already doing this, so you already have all the pieces together, it's just some more control.