Should C.ADD(I) be C.CADD(I) in capability mode?

riscv / riscv-cheri

This repository contains the CHERI extension specification, adding hardware capabilities to the RISC-V ISA to enable fine-grained memory protection and scalable compartmentalization.

https://jira.riscv.org/browse/RVG-148

Creative Commons Attribution 4.0 International


Should C.ADD(I) be C.CADD(I) in capability mode? #307

Open arichardson opened 6 days ago

arichardson commented 6 days ago

There is no compressed pointer add other than the SP-relative ones in capability mode. This means e.g. pointer arithmetic loops have to use uncompressed instructions. Should we remap these instructions to the capability versions in capability mode?

In most cases it should be possible to use the capability op for an integer operation as long as the compiler knows the input is already guaranteed to be untagged. Unclear what percentage of instructions would still need to explicitly use the integer version.

The downside would be that we do need a representability check and have to use the CHERI ALU rather than the integer one for these instructions. Should not matter for simple cores but might be an issue for larger ones.

We would need to estimate how much code size this change could save. A simple approximation could be to always use c.cadd(i) for both integer and capability adds and compare that to the current code generation.
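A rough first pass at such an estimate could be made by scanning a capability-mode disassembly for adds that already satisfy the operand constraints of C.ADD / C.ADDI (rd equal to rs1, rd not the zero register, a non-zero 6-bit signed immediate for the immediate form, rs2 not the zero register for the register form). The sketch below is only an illustration: the one-instruction-per-line text format, the register names and the helper names `fits_compressed` and `estimate` are all assumptions, not the output of any particular tool.

```python
import re

# Assumed line format: "<mnemonic> <rd>, <rs1>, <rs2-or-imm>" (hypothetical).
LINE = re.compile(r"^\s*(cadd|caddi|add|addi)\s+(\w+),\s*(\w+),\s*(-?\w+)\s*$")

# Names we treat as the zero register (assumption about the disassembler).
ZERO_REGS = {"zero", "x0", "c0", "cnull"}


def fits_compressed(mnemonic, rd, rs1, last):
    """True if the operands meet the C.ADD / C.ADDI encoding constraints."""
    if rd != rs1 or rd in ZERO_REGS:
        return False
    if mnemonic.endswith("i"):
        try:
            imm = int(last, 0)
        except ValueError:
            return False
        # C.ADDI takes a non-zero 6-bit signed immediate.
        return imm != 0 and -32 <= imm <= 31
    # C.ADD requires rs2 to be a real register, not zero.
    return last not in ZERO_REGS


def estimate(disassembly):
    """Count compressible capability adds and compressible integer adds."""
    counts = {"cap": 0, "int": 0}
    for line in disassembly.splitlines():
        m = LINE.match(line)
        if m and fits_compressed(*m.groups()):
            kind = "cap" if m.group(1).startswith("c") else "int"
            counts[kind] += 1
    return counts


SAMPLE = """\
caddi a0, a0, 8
cadd a1, a1, a2
caddi a0, a1, 8
caddi sp, sp, 64
addi a3, a3, -1
add a4, a4, zero
lw a5, 0(a0)
"""

if __name__ == "__main__":
    counts = estimate(SAMPLE)
    # Each compressible capability add would shrink from 4 to 2 bytes.
    print(counts, "potential bytes saved:", 2 * counts["cap"])
```

The `int` count is the other side of the trade-off: it is an upper bound on the integer adds that are compressed today and would grow by 2 bytes each if the compiler could not prove their input untagged.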

davidchisnall commented 6 days ago

> The downside would be that we do need a representability check and have to use the CHERI ALU rather than the integer one for these instructions. Should not matter for simple cores but might be an issue for larger ones.

I would expect it to be the other way around. The power cost of a full adder and a representability check in a simple pipeline is fairly high relative to just zeroing the top half and doing the same add. In comparison to the power overhead of register rename on a larger pipeline, it's largely noise: even the add is basically free in comparison to all of the bookkeeping. This is why rich addressing modes are so important for scaling an ISA to high-performance cores: sticking in an extra add or shift pipeline stage is almost free in comparison to scheduling two instructions. You may see a performance difference if you have an integer-add pipeline and a capability-add pipeline that has one extra stage for the representability check / bounds update. Sending things down that pipeline would add one cycle of latency (though you could still dispatch one per cycle).
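The latency-versus-throughput point can be illustrated with a toy cycle count. This is a minimal sketch under stated assumptions: a fully pipelined unit that accepts one operation per cycle, no early forwarding, a 1-stage integer add and a 2-stage capability add. The function name `cycles` and the stage counts are illustrative only.

```python
def cycles(n_ops, depth, dependent):
    """Cycles to finish n_ops adds on a pipeline with `depth` stages.

    Assumes one operation can be issued per cycle and that a dependent
    operation cannot start until its producer has left the last stage.
    """
    if n_ops == 0:
        return 0
    if dependent:
        # Each add waits for the full latency of the previous one.
        return n_ops * depth
    # Independent adds overlap: fill the pipeline once, then one per cycle.
    return depth + n_ops - 1


if __name__ == "__main__":
    for dependent in (False, True):
        int_cycles = cycles(100, depth=1, dependent=dependent)
        cap_cycles = cycles(100, depth=2, dependent=dependent)
        print("dependent" if dependent else "independent", int_cycles, cap_cycles)
```

In this model the extra stage costs a single cycle in total for independent adds, and only shows up as a per-operation cost when each add depends on the previous result.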

tariqkurd-repo commented 5 days ago

This comes back to the original question about whether CADD[I] is distinct from ADD[I] in general. Because of the 12-bit immediate required for linker relaxation for CADDI, I suspect this is something we'll end up merging when we finally get official encodings, as encoding space is so tight.

I also agree with David's point: the cost of renaming, dispatching, issuing and then accessing the enormous physical register file is very high in high-performance implementations.

If we do merge them, then an optimised implementation can opt to spend a second cycle on the check only when the source is a capability, and not when it's an integer, for example by tracking the types of registers. So in the hardware I don't really see merging them as a high cost. It may be another matter in the compiler, though.
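The register-type-tracking idea can be sketched as a small behavioural model. Everything here is an assumption made for illustration: the `Reg` record, the `merged_add` helper, the 1-cycle and 2-cycle latencies, and especially the check itself, which is a simple in-bounds test standing in for the real representability check defined by the specification.

```python
from dataclasses import dataclass

MASK = (1 << 64) - 1  # assume 64-bit addresses


@dataclass
class Reg:
    value: int
    tagged: bool = False  # True when the register holds a valid capability
    base: int = 0
    top: int = 0


def merged_add(regs, rd, rs1, imm):
    """Model a merged ADD/CADD with an immediate; return its latency in cycles."""
    src = regs[rs1]
    new_value = (src.value + imm) & MASK
    if not src.tagged:
        # Integer source: plain add, no check needed.
        regs[rd] = Reg(new_value)
        return 1
    # Capability source: take the extra cycle for the check.
    # Placeholder only: NOT the real CHERI representability check.
    still_ok = src.base <= new_value <= src.top
    regs[rd] = Reg(new_value, tagged=still_ok, base=src.base, top=src.top)
    return 2


if __name__ == "__main__":
    regs = {"a0": Reg(5), "a1": Reg(0x1000, True, 0x1000, 0x2000)}
    print(merged_add(regs, "a0", "a0", 3), regs["a0"])
    print(merged_add(regs, "a1", "a1", 16), regs["a1"])
```

The model only pays the second cycle when the tracked tag says the source is a capability, which is the behaviour described above; how a real core would track that tag ahead of execution is left open.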

tariqkurd-repo commented 5 days ago

I think any change like this should wait for ARC review.