DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform

i#6760 AArch64: Use smaller data types for SVE P and FFR registers #6774

Closed jackgallagher-arm closed 2 months ago

jackgallagher-arm commented 2 months ago

PR #6757 fixed the way we read/write SVE register slots, but unfortunately reading and writing the predicate register slots is now broken on systems with a 128-bit vector length.

Both SVE vector and predicate registers use dr_simd_t slots. dr_simd_t is a 64-byte type meant to store vector registers of up to 512 bits. SVE predicate registers are always 1/8 the size of the vector registers, so on systems with a 512-bit vector length we only need 64 / 8 = 8 bytes to store each predicate register.

The ldr/str instructions we use to read and write the predicate register slots have a base+offset memory operand where the offset is a value in the range [-256, 255] scaled by the predicate register length. We read and write the registers by setting the base address to the address of the first slot, and setting the offset to n * sizeof(dr_simd_t) for each register Pn. For systems with a 128-bit vector length, the predicate registers are 16 / 8 = 2 bytes, so the maximum offset we can reach is 2 * 255 = 510 bytes. This means on 128-bit VL systems we can only reach the first 8 predicate registers (P8 would need an offset of 8 * sizeof(dr_simd_t) = 512 bytes).
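The reach calculation above can be sketched in C. This is only an illustration of the arithmetic, not DynamoRIO code: the constants are taken from the description, and the helper names (`pred_bytes`, `slot_reachable`, `reachable_slots`) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed values from the description above, not from DynamoRIO headers. */
enum {
    SIMD_SLOT_BYTES = 64, /* sizeof(dr_simd_t) as described */
    NUM_PRED_REGS = 16,   /* P0..P15 */
    MAX_SCALED_IMM = 255  /* upper bound of the [-256, 255] immediate */
};

/* Predicate register size in bytes: 1/8 of the vector length in bytes. */
static size_t pred_bytes(size_t vl_bits)
{
    return vl_bits / 8 / 8;
}

/* True if slot n can be addressed from the first slot with one scaled
 * immediate. The byte offset must be a multiple of the predicate size
 * and the scaled value must not exceed MAX_SCALED_IMM. */
static bool slot_reachable(size_t vl_bits, size_t slot_bytes, unsigned n)
{
    size_t offset = (size_t)n * slot_bytes;
    size_t scale = pred_bytes(vl_bits);
    return offset % scale == 0 && offset / scale <= MAX_SCALED_IMM;
}

/* Number of consecutive predicate slots, starting at P0, that are reachable. */
static unsigned reachable_slots(size_t vl_bits, size_t slot_bytes)
{
    unsigned n = 0;
    while (n < NUM_PRED_REGS && slot_reachable(vl_bits, slot_bytes, n))
        n++;
    return n;
}
```

Under these assumptions, `reachable_slots(128, SIMD_SLOT_BYTES)` returns 8, matching the failure described, while the same call with a 256-bit or 512-bit vector length returns 16.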

By changing the predicate register and FFR slots to use a new type, dr_svep_t, which is 1/8 the size of dr_simd_t, we can fix this bug and save space.

dr_svep_t is currently 8 bytes to correspond to 64-byte vectors, but even if we extend DynamoRIO to support the maximum SVE vector length of 2048 bits (256 bytes), dr_svep_t will only need to be increased to 256 / 8 = 32 bytes, so the maximum offset (15 * 32 = 480 bytes) will always be in range, even with the 510-byte reach of 128-bit VL systems.
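As a rough check of this claim, the sketch below compares the offset of the last predicate slot with the reach of the scaled immediate at each vector length. It is not DynamoRIO code: the helper names are hypothetical, the slot size is assumed to be 1/8 of the largest supported vector length in bytes, and only power-of-two vector lengths are considered.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_PRED_REGS 16   /* P0..P15 */
#define MAX_SCALED_IMM 255 /* upper bound of the [-256, 255] immediate */

/* Farthest byte offset one scaled immediate can reach at this vector length. */
static size_t max_reach_bytes(size_t vl_bits)
{
    size_t pred_bytes = vl_bits / 8 / 8; /* predicate is 1/8 of the vector */
    return pred_bytes * MAX_SCALED_IMM;
}

/* Byte offset of the P15 slot for a given slot size. */
static size_t last_pred_offset(size_t slot_bytes)
{
    return (NUM_PRED_REGS - 1) * slot_bytes;
}

/* With slots sized for max_vl_bits, check that P15 stays in range at every
 * power-of-two vector length from 128 bits up to max_vl_bits. */
static bool preds_in_range(size_t max_vl_bits)
{
    size_t slot_bytes = max_vl_bits / 8 / 8; /* assumed sizeof(dr_svep_t) */
    for (size_t vl = 128; vl <= max_vl_bits; vl *= 2) {
        if (last_pred_offset(slot_bytes) > max_reach_bytes(vl))
            return false;
    }
    return true;
}
```

With these assumptions, `preds_in_range(512)` and `preds_in_range(2048)` both hold. The tightest case is a 128-bit vector length, where the reach is 510 bytes and the last slot sits at 480 bytes for a 32-byte slot.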

Because this changes the size of the predicate register and FFR slots, it also changes the size of the dr_mcontext_t structure and breaks backwards compatibility with earlier versions of DynamoRIO, so the version number is increased to 10.90.

Issues: #6760, #5365
Fixes: #6760

jackgallagher-arm commented 2 months ago

vs2019-32 failure looks like #6764