riscv / riscv-CMOs

https://jira.riscv.org/browse/RVG-59
Creative Commons Attribution 4.0 International

Invalidate Clean / do nothing to Dirty ?? (C-->I, D-->no change) #7

Closed AndyGlew closed 2 years ago

AndyGlew commented 3 years ago

BRIEF:

Should we provide an operation "Invalidate Clean // do nothing to Dirty "?

I have never seen anybody else do this, but as I was writing the proposal I realized it has performance and security benefits.

DETAIL:

The standard CMOs are:

Write back dirty data, leaving the line invalid; invalidate clean data [Intel WBINVD, sometimes called FLUSH; in #6 JH suggests EVICT] -- Ri5 POR EVICT?

Write back dirty data, leaving the line clean; leave clean data unaffected [IBM CLEAN, sometimes called FLUSH] -- Ri5 POR CLEAN?

Invalidate both dirty and clean data, without bothering to write back the dirty data [IBM DISCARD, Intel INVD, FORGET by somebody else, INVAL by some previous poster in issue #6] -- Ri5 POR DISCARD?

Invalidating dirty data without writing it back is, of course, a security hole, but some folks argue strongly for it, so I will defer that to another issue.

When I was putting together the proposal I looked at all combinations and wondered about the combination "Invalidate Clean data // do nothing to Dirty data".
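To make the combinations concrete, here is a minimal sketch that models each operation as a transition on a single cache line's state. EVICT, CLEAN and DISCARD follow the descriptions above; `INVAL_CLEAN` is a made-up name for the proposed fourth operation, and the whole table is an illustration, not a specification.

```python
from enum import Enum


class State(Enum):
    INVALID = "I"
    CLEAN = "C"
    DIRTY = "D"


# Each operation maps a valid line state to (next state, written back?).
# INVAL_CLEAN is a hypothetical name for "Invalidate Clean // do nothing to Dirty".
CMO_TABLE = {
    "EVICT":       {State.CLEAN: (State.INVALID, False), State.DIRTY: (State.INVALID, True)},
    "CLEAN":       {State.CLEAN: (State.CLEAN, False),   State.DIRTY: (State.CLEAN, True)},
    "DISCARD":     {State.CLEAN: (State.INVALID, False), State.DIRTY: (State.INVALID, False)},
    "INVAL_CLEAN": {State.CLEAN: (State.INVALID, False), State.DIRTY: (State.DIRTY, False)},
}


def apply_cmo(op, state):
    """Return (next_state, wrote_back) for one cache line."""
    if state is State.INVALID:
        return State.INVALID, False  # nothing to do for an invalid line
    return CMO_TABLE[op][state]


def loses_dirty_data(op):
    """True if the operation drops a dirty line without writing it back."""
    next_state, wrote_back = apply_cmo(op, State.DIRTY)
    return next_state is not State.DIRTY and not wrote_back
```

In this model only DISCARD loses dirty data, which is the property that makes it the security-sensitive one of the four.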

At first that seems strange and useless, but then I realized that it can be used in the following case of non-coherent I/O:

The usual pattern for non-coherent I/O in the presence of speculation is:

  1. Clean out any dirty data from the cache, so that it does not override any of the DMA writes that come next
  2. Signal/start the DMA
  3. I/O device does the DMA that writes to memory non-coherently
  4. Ensure that any stale clean data in the cache is invalidated, so that you see the DMA writes

(Step 4 is not necessary for processors that do not do speculation or prefetching. Suffice it to say that skipping step 4 is the cause of many bugs when such code is brought to a system that does speculation and prefetching.)
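The four steps can be sketched with a toy write-back cache sitting in front of a dict that stands in for memory. Everything here (`ToyCache`, `dma_receive`, the two-entry buffer) is hypothetical and only meant to show why skipping step 4 returns stale data once something fills the cache between steps 1 and 4.

```python
class ToyCache:
    """Write-back cache in front of a dict used as memory; not coherent with DMA."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # addr -> [value, dirty]

    def read(self, addr):
        if addr not in self.lines:  # miss: fill a clean line from memory
            self.lines[addr] = [self.memory[addr], False]
        return self.lines[addr][0]

    def write(self, addr, value):
        self.lines[addr] = [value, True]  # allocate or update, mark dirty

    def clean(self, addrs):
        """Step 1: write back dirty lines and leave them clean."""
        for a in addrs:
            line = self.lines.get(a)
            if line and line[1]:
                self.memory[a] = line[0]
                line[1] = False

    def invalidate_clean(self, addrs):
        """Step 4: drop clean lines, leave dirty lines alone."""
        for a in addrs:
            line = self.lines.get(a)
            if line and not line[1]:
                del self.lines[a]


def dma_receive(do_step4):
    memory = {0: "old", 1: "old"}
    cache = ToyCache(memory)
    buf = [0, 1]
    cache.write(0, "cpu")   # CPU dirtied the buffer before the transfer
    cache.clean(buf)        # step 1
    # step 2: start the DMA (nothing to model here)
    cache.read(1)           # stands in for a speculative or prefetch fill
    for a in buf:           # step 3: device writes memory, bypassing the cache
        memory[a] = "dma"
    if do_step4:
        cache.invalidate_clean(buf)  # step 4
    return [cache.read(a) for a in buf]
```

In this model `dma_receive(True)` sees the device's data in both locations, while `dma_receive(False)` reads back the stale clean copies left in the cache.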

Step 1 - flush dirty data - can be done by Ri5 CLEAN or EVICT. It's probably a good idea to leave the lines invalid after this operation, but it's not necessary because step 4 does that.

Step 4 - invalidate stale clean data - can be done by Ri5 EVICT, or even by DISCARD/INVAL if your security model allows it. There should not be any need to evict/write back dirty data, since there should be no dirty data in the I/O DMA region to be evicted. If there were any dirty data in the region that the I/O device wrote to, that would be an error.

However... sometimes it is cheaper to do a whole-cache operation rather than an address-range operation (which might iterate per-cache-line operations over an address range larger than the cache). In this case, it is wasteful to write back dirty data encountered in step 4, since if the hardware is correct and the software is correct such dirty data can only be for addresses outside of the DMA address range.
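A rough sketch of that waste, assuming a whole-cache step 4 and made-up cache contents. `INVAL_CLEAN` is again a hypothetical name for the proposed operation, and the addresses are arbitrary.

```python
def whole_cache_step4(lines, op):
    """Apply op to every line; return (surviving lines, number of writebacks).

    lines maps address -> "C" (clean) or "D" (dirty).
    """
    if op not in ("EVICT", "INVAL_CLEAN"):
        raise ValueError(f"unknown operation: {op}")
    survivors, writebacks = {}, 0
    for addr, state in lines.items():
        if state == "D":
            if op == "EVICT":
                writebacks += 1       # dirty line costs a bus write, then is dropped
            else:
                survivors[addr] = "D"  # INVAL_CLEAN skips dirty lines
        # clean lines are invalidated by both operations, with no bus traffic
    return survivors, writebacks


# Hypothetical contents: the DMA buffer lives at 0x1000-0x10ff, and the
# dirty lines belong to unrelated data outside that range.
lines = {0x1000: "C", 0x1040: "C", 0x8000: "D", 0x8040: "D", 0x9000: "D"}
evict_result = whole_cache_step4(lines, "EVICT")
inval_clean_result = whole_cache_step4(lines, "INVAL_CLEAN")
```

With these contents EVICT performs three writebacks for data that has nothing to do with the DMA buffer and leaves the cache empty, whereas the proposed operation performs none and keeps the three dirty lines cached.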

Furthermore... invalidating clean data can be significantly cheaper in the hardware than evicting dirty data. Flushing/evicting dirty data pretty much requires a scan to find the dirty data, and then writing it out on the bus one or a few cache lines at a time. Invalidating clean data obviously doesn't require the bus cycles, and it can also be implemented by one of several bulk invalidation techniques, ranging from pulling down a line to clear all valid bits of a certain class together (e.g. separate clean-valid and dirty-valid bits so that you can do a bulk clean invalidate without affecting dirty data) to switching amongst a set of valid bits.
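The separate clean-valid / dirty-valid idea can be sketched in software with two bit vectors, using Python ints as bitmasks. This is only an analogy for the hardware technique; the `ValidBits` class and its method names are invented for illustration.

```python
class ValidBits:
    """Per-line valid state held as two bit vectors (ints used as bitmasks)."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.clean_valid = 0
        self.dirty_valid = 0

    def fill_clean(self, idx):
        self.clean_valid |= 1 << idx
        self.dirty_valid &= ~(1 << idx)

    def mark_dirty(self, idx):
        self.dirty_valid |= 1 << idx
        self.clean_valid &= ~(1 << idx)

    def bulk_invalidate_clean(self):
        # One step regardless of cache size: the analogue of pulling down
        # a single line that clears every clean-valid bit at once.
        self.clean_valid = 0

    def state(self, idx):
        if (self.dirty_valid >> idx) & 1:
            return "D"
        if (self.clean_valid >> idx) & 1:
            return "C"
        return "I"
```

Because dirty lines are tracked by a different vector, clearing `clean_valid` cannot touch them, and no scan over the lines is needed.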

Therefore, step 4 above could benefit from the operation "Invalidate-Clean//skip Dirty": better performance.

(Using an unconditional INVAL/DISCARD/FORGET is unsatisfactory: the above code sequence could be used by somewhat unprivileged code (if you allow those guys to access incoherent I/O or other processors), whereas discarding dirty data has much more stringent security requirements.)

ISSUE: should we reserve encoding space (whether in an instruction or indirectly in a CSR) to specify this "Invalidate-Clean//skip Dirty" operation in addition to the standard 3?

... I know, RISC-V is not supposed to innovate. But this looks like low-hanging fruit and could be a competitive advantage.

allenjbaum commented 3 years ago

This isn't making sense to me. What are your assumptions about accesses to the DMA address range after step 1? Can any CPU read or write it?

ingallsj commented 3 years ago

Going a step further: I would suggest doing this "Invalidate Clean & do nothing to Dirty" instead of INVALIDATE/DISCARD. My justification is to avoid software bugs in which dirty data is unintentionally invalidated.

Focusing on the non-coherent DMA use case:

Are there use cases where the proper code sequence does invalidate dirty data? If so, are they similar to the use cases for DCBA? If yes, then I'll make the same suggestion that I did there at https://github.com/riscv/riscv-CMOs/issues/5#issuecomment-692484234: If a specific vendor really wants Invalidate/Discard, then I suggest they implement that as a custom instruction, which the RISC-V ISA is happy to allow, but this is not something we should provide to all.

ingallsj commented 3 years ago

Now that I typed all that: should we include coherence-destroying INVALIDATE/DISCARD at all in the general-purpose ISA base extension, or should we tell programmers to use EVICT instead, as described elsewhere at https://github.com/riscv/riscv-CMOs/issues/11#issuecomment-686097010?

billhuffman commented 3 years ago

We can make some instructions part of a sub-option. That's a lot like letting them be a custom instruction in the sense that portable software won't use them. But it still provides a template for how to do it in a common way.

  Bill

dkruckemyer-ventana commented 2 years ago

Closing due to lack of discussion.