au-ts / sddf

A collection of interfaces, libraries and tools for writing device drivers for seL4 that allow accessing devices securely and with low overhead.

Remove cache clean #128

Closed Courtney3141 closed 4 months ago

Courtney3141 commented 4 months ago

This pull request removes the additional cache_clean operation that was added in PR 73. Instead, the receive DMA region must now always be mapped read-only in every address space that maps it.

The Arp component has since been removed, so the Copy components and receive virtualisers are the only protection domains that map the receive DMA region at all. All of them already mapped the region read-only before this PR, so no mappings change. I have only added a few comments to indicate that the read-only mapping is now a requirement.

The cache_clean removed by this PR was only needed to guard against cached writes being written back to the region after the DMA write had occurred, which would corrupt the newly received data. To prevent reads of stale cached data, as well as erroneous prefetching, the receive virtualiser still performs a cache_clean_and_invalidate before packets are read. This is a clean and invalidate rather than a plain invalidate because a clean and invalidate can be performed at user level, which makes it more performant than an invalidate issued through a syscall.
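
As a rough illustration of the remaining maintenance step, the sketch below walks every cache line that overlaps a packet buffer and applies a per-line operation to it. It is not the sDDF implementation: `for_each_cache_line`, `line_op_fn`, `count_line` and the 64-byte `CACHE_LINE_SIZE` are all hypothetical names and values chosen for this example. On AArch64 the per-line operation would typically be a `dc civac` instruction followed by a barrier after the loop; a counting stand-in is used here so that the range arithmetic can be run on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 64-byte cache lines. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical per-line operation, e.g. a clean and invalidate by address. */
typedef void (*line_op_fn)(uintptr_t line_addr, void *ctx);

/*
 * Apply `op` to every cache line overlapping [start, start + len).
 * Returns the number of lines visited. The start address is rounded
 * down to a line boundary so a partially covered first line is included.
 */
size_t for_each_cache_line(uintptr_t start, size_t len, line_op_fn op, void *ctx)
{
    assert(op != NULL);
    if (len == 0) {
        return 0;
    }

    uintptr_t line = start & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    uintptr_t end = start + len;
    size_t visited = 0;

    for (; line < end; line += CACHE_LINE_SIZE) {
        op(line, ctx);
        visited++;
    }
    return visited;
}

/* Stand-in operation that only counts how many lines were touched. */
void count_line(uintptr_t line_addr, void *ctx)
{
    (void)line_addr;
    (*(size_t *)ctx)++;
}
```

A buffer that starts mid-line touches one more line than its length alone suggests, which is why the start address is aligned down before the loop.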

Removing the additional clean prior to DMA means that the DMA region must only ever be mapped read-only. For clients using a copier (the typical case), this places no additional restrictions, since the copied data can be mapped read-write. Clients without a copier (which must therefore be trusted) that wish to modify received data must now copy it into a writeable region themselves, which is likely to be more performant than the additional cache clean anyway.
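
For a client without a copier, the copy step could look something like the sketch below. The function name `copy_rx_packet` and its parameters are hypothetical and not part of sDDF; the read-only mapping of the receive DMA region is modelled here with a `const` pointer, so any modification has to happen in the client's own writeable buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy `len` bytes of a received packet, found at `offset` in the
 * read-only receive DMA region, into a client-owned writeable buffer.
 * Returns the number of bytes copied, or 0 if the packet is empty or
 * does not fit in the destination buffer.
 */
size_t copy_rx_packet(const uint8_t *rx_dma, size_t offset, size_t len,
                      uint8_t *dst, size_t dst_size)
{
    assert(rx_dma != NULL && dst != NULL);
    if (len == 0 || len > dst_size) {
        return 0;
    }
    memcpy(dst, rx_dma + offset, len);
    return len;
}
```

After the copy, the client edits only `dst`, so no dirty cache lines are ever created for the DMA region itself.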

This PR improves performance notably, with graphed results for the IMX8MM, Maaxboard and Odroidc4 pictured below:

[Figure: CPU Util vs Requested Throughput]

[Figure: Odroidc4 CPU Util vs Requested Throughput]

Further numbers can be found in the Recent-Single-Core tab of this spreadsheet: https://docs.google.com/spreadsheets/d/1d1hKhZVVbEvxm7ehs7sXc1KvGjfdJ0RHR4YiMPzR8O8/edit?usp=sharing