
SVE vector length default is invalid #6646

Open derekbruening opened 7 months ago

derekbruening commented 7 months ago

Filed from the discussion here: https://github.com/DynamoRIO/dynamorio/pull/6603#pullrequestreview-1873015418

The SVE dynamic vector length affects the decoding of SVE SIMD register and memory operands. For standalone decoder use, the question is how to set the vector length, since the decode target does not always match the underlying machine: xref #6585, where aarch64 drmemtrace traces are routinely decoded on x86 machines, plus the many other cases where drdecode is used to decode raw bits obtained from other sources on other machines.
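
To make the standalone scenario concrete, here is a minimal sketch using drdecode's dr_standalone_init()/decode() entry points; the raw bytes below are placeholders rather than a real SVE encoding, and the point is that sizing an SVE operand depends on the decoder's configured vector length, not on anything in the bytes themselves:

```c
#include "dr_api.h"

int
main(void)
{
    void *dc = dr_standalone_init();
    /* Placeholder bytes standing in for raw AArch64 encodings obtained
     * elsewhere, e.g., from a drmemtrace trace decoded on an x86 host. */
    byte raw[4] = { 0x00, 0x00, 0x00, 0x00 };
    instr_t instr;
    instr_init(dc, &instr);
    /* The size of an SVE scalable-register or memory operand depends on
     * the vector length configured in the decoder, not on these bytes. */
    byte *next_pc = decode(dc, raw, &instr);
    if (next_pc == NULL) {
        /* Invalid encoding for the current decoder settings. */
    }
    instr_free(dc, &instr);
    dr_standalone_exit();
    return 0;
}
```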

Several other attributes affect decoding in the same way. E.g., on x86, the vendor (AMD vs Intel) affects the decoding of several opcodes. There we used to have a VENDOR_UNKNOWN default, which led to contradictory decoding results: #5725. We changed the default to VENDOR_INTEL to have a sane one. Other examples include AArch32 Arm vs Thumb modes, cache line sizes for certain opcodes, default processor modes for operand size overrides on x86, etc.
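
For comparison, the vendor attribute already has a per-attribute setter; a minimal sketch, assuming the existing proc_set_vendor() routine from the proc API:

```c
#include "dr_api.h"

void
configure_for_amd_target(void)
{
    /* Several x86 opcodes decode differently by vendor; after #5725 the
     * default is VENDOR_INTEL rather than VENDOR_UNKNOWN, so decoding
     * still succeeds if this call is omitted. */
    proc_set_vendor(VENDOR_AMD);
}
```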

Currently drdecode tries to support usage without any explicit initialization call, which is why the defaults have to be reasonable. However, we have run into problems with no explicit initialization: #2499 and #6002.

Today, the SVE vector length defaults to 0, which is invalid. If drdecode continues to try to be usable with no initialization, we cannot leave this invalid value as the default: we have to provide a reasonable default (say, 128 bits, the SVE minimum).

I do not think requiring a call to something like dr_set_sve_vector_length from every single user before decoding any AArch64 code is reasonable, or would be considered user-friendly at all. If something is required, IMHO it can only be a general initialization call which takes in flags setting all such attributes (maybe a struct that is easy to extend, as sketched below), which would also help with #2499 and #6002.
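
A purely hypothetical sketch of what that could look like (none of these names exist in DynamoRIO today):

```c
/* Hypothetical: a struct collecting every attribute that affects
 * decoding, where zero means "use the sane default". */
typedef struct {
    int size;               /* Struct size, for forward compatibility. */
    int sve_vector_length;  /* In bits; 0 => default (e.g., 128). */
    unsigned int vendor;    /* 0 => default (VENDOR_INTEL on x86). */
    int cache_line_size;    /* In bytes; 0 => default. */
} dr_decode_attrs_t;

/* Hypothetical: one call sets everything; a zeroed struct yields the
 * same sane defaults as performing no initialization at all. */
void dr_decode_init_ex(const dr_decode_attrs_t *attrs);
```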

derekbruening commented 7 months ago

Expanding on the usability argument: the decoder should not fail when the defaults are used. No user is going to want to call 5 different routines to set various things like SVE vector length, cache line size, vendor type, processor mode, etc. before decoding anything. The user likely will not know whether the target instruction needs any of those pieces of info. It is not a usable library model. A general init routine with all needed info collected into a struct which has defaults for everything is a lot more reasonable, as is trying to work without any init.

It is better to have sane defaults so that the decode works and provides a legal decoding, even if it is not exactly what fits the user's target environment. The library can try to read values from the environment and the current processor (but only when the host matches the target, which is only sometimes the case: look at our frequent AArch64-on-x86 use case; see the sketch below), and fill in the rest from reasonable defaults. To get perfect decoding, the user will have to fill in the info, but that is the rarer use case.
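
A sketch of the host-probing half of that, assuming a Linux host that matches the decode target: PR_SVE_GET_VL is the real kernel interface, while the helper name and the 128-bit fallback are illustrative.

```c
#include <sys/prctl.h>

#ifndef PR_SVE_GET_VL
#    define PR_SVE_GET_VL 51          /* Linux >= 4.15. */
#    define PR_SVE_VL_LEN_MASK 0xffff
#endif

/* Hypothetical helper: probe the host's current SVE vector length in
 * bits, falling back to a reasonable default when the host has no SVE
 * (or is not the target architecture at all). */
static int
host_sve_vector_length_bits(void)
{
    int res = prctl(PR_SVE_GET_VL);
    if (res < 0)
        return 128; /* No SVE on this host: use the sane default. */
    return (res & PR_SVE_VL_LEN_MASK) * 8; /* Kernel reports bytes. */
}
```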

Consider other decoding tools: what if objdump -d just failed if you didn't pass 6 different flags setting everything? These tools all have defaults.