riscv / riscv-cheri

This repository contains the CHERI extension specification, adding hardware capabilities to RISC-V ISA to enable fine-grained memory protection and scalable compartmentalization.
https://jira.riscv.org/browse/RVG-148
Creative Commons Attribution 4.0 International

SCBNDSI redundancy? #402

Open nwf opened 1 month ago

nwf commented 1 month ago

SCBNDSI has a 5-bit immediate, uimm, and a shift-by-4 operation gated by s, in https://github.com/riscv/riscv-cheri/blob/1c43ce2fe5688057d6108a5f901574f2dac0acd0/src/insns/scbnds_32bit.adoc?plain=1#L43 which means that there are two ways to request a bounds length of 0 (unshifted 0 or shifted 0) and two ways to request 16 (unshifted 16 or shifted 1). Perhaps "shifted 0" and "shifted 1" could be put to better use (as requests for 512 and 528, say)?
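To make the overlap concrete, here is a minimal sketch that enumerates all 64 (s, uimm) pairs. It assumes the requested length is simply `uimm << 4` when s is set and `uimm` otherwise, as described above; the function name `scbndsi_length` is hypothetical and not taken from the spec.

```python
from collections import defaultdict


def scbndsi_length(s: int, uimm: int) -> int:
    """Requested bounds length for one encoding.

    Assumption (from the description above, not checked against the spec):
    length = uimm << 4 when s is set, otherwise uimm.
    """
    if s not in (0, 1):
        raise ValueError("s must be 0 or 1")
    if not 0 <= uimm < 32:
        raise ValueError("uimm must fit in 5 bits")
    return uimm << 4 if s else uimm


# Group every (s, uimm) pair by the length it requests.
encodings_by_length = defaultdict(list)
for s in (0, 1):
    for uimm in range(32):
        encodings_by_length[scbndsi_length(s, uimm)].append((s, uimm))

# Lengths reachable through more than one encoding.
redundant = {
    length: encs for length, encs in encodings_by_length.items() if len(encs) > 1
}

for length, encs in sorted(redundant.items()):
    print(f"length {length}: encodings {encs}")
print(f"{len(encodings_by_length)} distinct lengths, max {max(encodings_by_length)}")
```

Under that assumption, the largest shifted request is 31 << 4 = 496, so 512 and 528 would be the next two values in the shifted sequence.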

arichardson commented 4 weeks ago

Avoiding redundant encodings here would be nice, and I agree that having shifted zero mean 512 would make a lot of sense. I'm not sure how much that increases decoder complexity, but if we are already special-casing, maybe shifted 1 could also be something more useful, like 1024 or 4096?

davidchisnall commented 4 weeks ago

I'd like to see some distribution of object sizes before thinking about adding that. And it will probably be different on 32- and 64-bit systems. It might be worth marking those as explicitly invalid encodings that are reserved for future expansion.

arichardson commented 4 weeks ago

> I'd like to see some distribution of object sizes before thinking about adding that. And it will probably be different on 32- and 64-bit systems. It might be worth marking those as explicitly invalid encodings that are reserved for future expansion.

That is a good argument - I believe @jonwoodruff did some analysis here. For now I've opened a PR to reserve these encodings.
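As a rough illustration of what reserving the two encodings would mean for a decoder, here is a sketch under assumptions: the reserved set is taken to be exactly "shifted 0" and "shifted 1" as discussed in this thread, the length rule is `uimm << 4` when s is set, and the names `RESERVED` and `decode_scbndsi_length` are hypothetical. It does not reflect the actual wording of the PR.

```python
from typing import Optional

# Assumption: the reserved encodings are s=1 with uimm 0 or 1,
# i.e. the two redundant pairs identified in this issue.
RESERVED = {(1, 0), (1, 1)}


def decode_scbndsi_length(s: int, uimm: int) -> Optional[int]:
    """Return the requested length, or None for a reserved encoding."""
    if s not in (0, 1) or not 0 <= uimm < 32:
        raise ValueError("s must be 0 or 1 and uimm must fit in 5 bits")
    if (s, uimm) in RESERVED:
        return None  # a real decoder would treat this as an illegal instruction
    return uimm << 4 if s else uimm


# With the two pairs reserved, every remaining encoding maps to a unique length.
valid_lengths = [
    length
    for s in (0, 1)
    for uimm in range(32)
    if (length := decode_scbndsi_length(s, uimm)) is not None
]
print(len(valid_lengths), len(set(valid_lengths)))
```

The two freed slots stay available for later assignment once size-distribution data is in hand.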

jonwoodruff commented 4 weeks ago

These are data samples from Spec2006, averaging distributions from each of 10 benchmarks.

This first set is for flat "bits of precision". It starts at 1 bit because we assume signed values. 12 reaches 100% because we're just measuring the used sizes in RISC-V code generation from CHERI ISA-v9, which only has a 12-bit immediate.

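| Bits of precision | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| incoffseti | 0.14% | 41.46% | 47.54% | 53.55% | 62.63% | 69.90% | 72.67% | 76.13% | 87.26% | 97.36% | 98.02% | 100.00% |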

And here are results with a "scale bit" that shifts by 4, which was the point that hit the highest number of cases. Presumably this is because it facilitates pointer-aligned arithmetic. Unsurprisingly, this matches choices in the Morello ISA.

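| Bits of precision | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| incoffseti | 0.14% | 40.06% | 49.83% | 66.03% | 72.97% | 85.50% | 96.57% | 97.22% | 98.09% | 98.29% | 98.53% | 100.00% |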

I don't know if this tells us how effective adding an additional value, e.g. 512, to the spectrum would be. I can dig up the old scripts and add that condition in. There is at most 10% on the table (the jump from 9 bits to 10 bits; we would get a single value from the 10-bit space, but probably the most common one).
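For reference, the per-bit gains can be read off the two rows above with a few lines of Python. This is only a sketch over the percentages quoted in this comment; the variable and function names are made up here, and the "10%" figure is assumed to refer to the flat (unscaled) row.

```python
# Cumulative coverage (%) for 1..12 bits, copied from the two rows above.
flat = [0.14, 41.46, 47.54, 53.55, 62.63, 69.90,
        72.67, 76.13, 87.26, 97.36, 98.02, 100.00]
scaled = [0.14, 40.06, 49.83, 66.03, 72.97, 85.50,
          96.57, 97.22, 98.09, 98.29, 98.53, 100.00]


def marginal_gains(coverage):
    """Extra coverage (percentage points) gained by each additional bit."""
    return {
        bits: round(coverage[bits - 1] - coverage[bits - 2], 2)
        for bits in range(2, len(coverage) + 1)
    }


flat_gains = marginal_gains(flat)
scaled_gains = marginal_gains(scaled)

for bits in range(2, 13):
    print(f"{bits:2d} bits: flat +{flat_gains[bits]:5.2f}  scaled +{scaled_gains[bits]:5.2f}")
```

In the flat row the step from 9 to 10 bits is the 10-point jump mentioned above; in the scaled row the same step is much smaller, since most of that coverage is already reached by 7 bits.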

arichardson commented 3 weeks ago

Just to clarify, it looks like this is the data for incoffset? I would imagine the values used for setbounds are somewhat different.

tariqkurd-repo commented 3 weeks ago

https://github.com/riscv/riscv-cheri/pull/418 is merged, so can we close this now?

arichardson commented 3 weeks ago

I agree the immediate issue is fixed, but maybe we should keep this open as a future enhancement to allow for larger values?