Closed RyanGlScott closed 2 years ago
Still to come is a comparison of profiling reports for large SAW proofs before and after this patch to compare the memory usage. In the meantime, this patch is in a suitable state for review.
Here is a heap profiling report for the same SAW proof as in https://github.com/GaloisInc/flexdis86/issues/40#issue-1241995531, but with the patch in #43 applied:
mkX64Disassembler
doesn't even show up in this report's top offenders, which is exciting. I think this is as good of a sign as any that this patch does in fact reduce the memory usage in practice.
Previously, the opcode lookup table would encode every possible permutation of allowable prefixes for each instruction as a separate path. This is expensive in both space and time, as observed in #40. The new approach taken in this patch, as described in
Note [x86_64 disassembly]
inFlexdis86.Disassembler
, is to only encode the VEX prefixe and opcode bytes in the lookup table, leaving out all other forms of prefix bytes entirely. Instead, disassembly will start by eagerly parsing as many prefix bytes as possible, proceeding to parse opcode bytes after the first non-prefix byte is encountered. After identifying the possible instructions from the opcode, we will then narrow down exactly which instruction it is by validating them against the set of parsed prefixes.As noted in
Note [x86_64 disassembly]
, we had to add some special cases fornop
-like instructions—namely,endbr32
,endbr64
,pause
, andxchg
—to avoid some prefix byte–related ambiguity. The new handling forxchg
is more accurate than it was before, so this patch fixes #42 as a side effect. This patch also addresses part (1) of #40 in that it should reduce the amount of memory that the lookup table uses, although there is potentially more work to be done (see part (2) of #40).