DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform
Other
2.57k stars 552 forks source link

i#6662 public traces, part 1: synthetic ISA #6691

Closed edeiana closed 3 months ago

edeiana commented 4 months ago

A synthetic ISA that has the purpose of preserving register dependencies and giving hints on the type of operation an instruction performs. This PR implements the encoding/decoding functionalities for this new ISA, which we call #DR_ISA_REGDEPS.

Note that being a synthetic ISA, some routines that work on instructions coming from an actual ISA (such as #DR_ISA_AMD64) are not supported (e.g., decode_sizeof()).

Currently we support:

A #DR_ISA_REGDEPS #instr_t contains the following information:

Note that all routines that operate on #instr_t and #opnd_t are also supported for

DR_ISA_REGDEPS instructions. However, querying information outside of those

described above (e.g., the instruction opcode with instr_get_opcode()) will return the zeroed value set by instr_create() or instr_init() when the #instr_t was created (e.g., instr_get_opcode() would return OP_INVALID).

edeiana commented 3 months ago

The implementation is somewhat complete, and I believe I addressed all the previous feedback. What's still missing is: 1) What we want to do with https://github.com/DynamoRIO/dynamorio/pull/6691#discussion_r1531686478 (document or change the implementation?) 2) Giant enum for virtual registers. 3) More instructions to test in suite/test/api/ir_x86.c (and perhaps, moving those tests into a new suite/test/api/ir_synthetic.c?). 4) Figure out why suite/test/api/ir_x86.c fails only on windows 32 bit. (<--- figured it out, RAX is a 64-bit register, duh)

edeiana commented 3 months ago

"code_api|tool.drcachesim.invariants" is failing in "ci-x86/x86-64". From the wall of text I get it's hard to understand if it's timing out or actually failing. Will investigate either way using tmate (since I can't reproduce it locally). However, fixing it shouldn't change my work drastically.

Most of the additional work is on collapsing sub-registers into their containing register and adding the register size. This changed the encoding, specifically increasing the operand size from 1 byte to 2 bytes. Also, decoding (and encoding) to/from synthetic ISA is now enabled on all arches. Decoding gets its ISA from dcontext_t, while encoding from instr_t like other ISAs in DR. I added a new suite/test/api/ir_synthetic.c test as well (and removed testing of synthetic ISA from suite/test/api/ir_x86.c) Comments and description have been updated accordingly.

edeiana commented 3 months ago

Ready for review. The main "concern" I have is on the signature of instr_convert_to_isa_regdeps(). Should it be: 1) instr_t* instr_convert_to_isa_regdeps(void *dcontext, instr_t *instr_real_ISA), which calls instr_create() (hence making the user of this function responsible to free the returned instr_regdeps_ISA instruction) (<--- current version) or 2) void instr_convert_to_isa_regdeps(void *dcontext, instr_t *instr_real_ISA, instr_t *instr_regdeps_ISA), where instr_regdeps_ISA is allocated outside. ?