CTSRD-CHERI / cheri-specification

CHERI ISA Specification

Expand c.mv to CMove #63

Open tariqkurd-repo opened 1 year ago

tariqkurd-repo commented 1 year ago

The instruction that compilers generate most frequently for RISC-V is c.mv. In CHERI code, it can still be used for integer data types, but not for capability types, because it strips the metadata.

c.mv is a very expensive encoding: it has two full 5-bit register specifiers, so it uses 2^10 = 1024 code points. Allocating enough encoding space to add a separate 16-bit capability move is therefore unlikely.
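To put the 1024 figure in context, here is a rough back-of-the-envelope sketch. It assumes the standard RVC layout, in which the three low-bit quadrants 00, 01 and 10 hold 16-bit instructions, and it ignores the special cases where a register field of zero selects a different instruction or a hint. The names `code_points` and `COMPRESSED_SPACE` are made up for this illustration.

```python
# Rough estimate of how much 16-bit encoding space a c.mv-style
# instruction consumes. Names here are illustrative, not from any spec.

REG_SPECIFIER_BITS = 5  # a full register specifier addresses x0..x31


def code_points(num_register_fields, bits_per_field=REG_SPECIFIER_BITS):
    """Number of encodings consumed by the register fields alone."""
    return 2 ** (num_register_fields * bits_per_field)


# 16-bit instructions have low bits 00, 01 or 10 (11 marks longer encodings),
# leaving 14 free bits in each of the three quadrants.
COMPRESSED_SPACE = 3 * 2 ** 14

c_mv_points = code_points(2)              # rd and rs2 are both full 5-bit fields
share = c_mv_points / COMPRESSED_SPACE

print(c_mv_points)       # 1024
print(f"{share:.2%}")    # 2.08%
```

Under these assumptions, a second instruction of the same shape would cost another 1/48 of the whole compressed space, which is the pressure the proposal is trying to avoid.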

Therefore the proposal is to expand c.mv to CMove.

Clearly, this gives c.mv different semantics from mv in CHERI code. As a result, when a capability is converted to an integer value, a different instruction must be used, e.g. the 32-bit encoding of mv, which strips the metadata because it executes as addi rd, rs1, 0.

The compiler will need to know the datatypes and do the necessary type conversion.
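The difference between the two moves, and the choice the compiler would have to make, can be sketched with a toy model. This is not the ISA definition: `CapReg`, `cmove`, `integer_mv` and `select_move` are hypothetical names, and all capability metadata (bounds, permissions and so on) is collapsed into one opaque field plus a tag bit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapReg:
    """Toy capability register: an address plus opaque metadata and a tag."""
    address: int
    metadata: int = 0
    tag: bool = False


def cmove(src: CapReg) -> CapReg:
    """Capability move: the whole register, metadata and tag included, is copied."""
    return CapReg(src.address, src.metadata, src.tag)


def integer_mv(src: CapReg) -> CapReg:
    """Integer mv, i.e. addi rd, rs1, 0: only the integer value survives.

    Assumption of this sketch: stripped metadata is modelled as zero and
    the tag as cleared.
    """
    return CapReg(address=src.address + 0)


def select_move(src_is_capability: bool, dst_is_capability: bool) -> str:
    """Hypothetical compiler choice under the proposal (c.mv expands to CMove)."""
    if src_is_capability and not dst_is_capability:
        return "mv (32-bit addi rd, rs1, 0)"  # conversion must strip metadata
    return "c.mv"                              # plain copy, 16-bit encoding


cap = CapReg(address=0x1000, metadata=0xABCD, tag=True)
print(cmove(cap) == cap)                 # True: nothing is lost
print(integer_mv(cap))                   # address kept, metadata and tag dropped
print(select_move(True, False))          # capability-to-integer conversion
```

The point of the sketch is only that the instruction choice depends on the source and destination types, which is why the compiler needs type information at the point where it emits the move.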

bsdjhb commented 1 year ago

I have been working on updating the section on compressed instructions in the ISA spec to reflect @jrtc27's earlier changes in this area in the toolchain, QEMU, etc. While writing that, I looked at whether C.MV should map to MV or to CMove in capability mode. Some simple numbers I have suggest that CMove is indeed more prevalent than MV in pure capability code. For the CheriBSD pure capability kernel:

llvm-objdump -d -M no-aliases --no-show-raw-insn --no-leading-addr ~/work/cheri/output/rootfs-riscv64-purecap/boot/kernel.CHERI-PURECAP-QEMU/kernel | awk '/^ / { print $1 }' | sort | uniq -c | egrep 'mv|move'
 38039 c.mv
145890 cmove

And for the CheriBSD libc.so.7 (C runtime library):

> ~/work/cheri/output/sdk/bin/llvm-objdump -d -M no-aliases --no-show-raw-insn --no-leading-addr ~/work/cheri/output/rootfs-riscv64-purecap/lib/libc.so.7 | awk '/^ / { print $1 }' | sort | uniq -c | egrep 'mv|move'
 8030 c.mv
24690 cmove
   29 fmv.d.x
    3 fmv.w.x
   26 fmv.x.d
    4 fmv.x.w
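To make the comparison explicit, the counts above can be turned into shares with a few lines of Python. This is just a sketch that parses the `uniq -c` style output pasted in this comment; the helper names `parse_uniq_counts` and `cmove_share` are made up here, and only c.mv and cmove are compared (the fmv.* lines are ignored).

```python
KERNEL_OUTPUT = """\
 38039 c.mv
145890 cmove
"""

LIBC_OUTPUT = """\
 8030 c.mv
24690 cmove
   29 fmv.d.x
    3 fmv.w.x
   26 fmv.x.d
    4 fmv.x.w
"""


def parse_uniq_counts(text):
    """Parse 'COUNT MNEMONIC' lines, as printed by `uniq -c`, into a dict."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            counts[fields[1]] = int(fields[0])
    return counts


def cmove_share(counts):
    """Fraction of (c.mv + cmove) instructions that are cmove."""
    return counts["cmove"] / (counts["cmove"] + counts["c.mv"])


for name, text in [("kernel", KERNEL_OUTPUT), ("libc.so.7", LIBC_OUTPUT)]:
    counts = parse_uniq_counts(text)
    ratio = counts["cmove"] / counts["c.mv"]
    print(f"{name}: cmove share {cmove_share(counts):.1%}, {ratio:.2f}x c.mv")
# kernel: cmove share 79.3%, 3.84x c.mv
# libc.so.7: cmove share 75.5%, 3.07x c.mv
```

So in both binaries cmove outnumbers c.mv by a factor of three or more.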