DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform

On newer processors, use BMI2 eflags-less mask and shift to avoid eflags saving in IBL #4686

Open derekbruening opened 3 years ago

derekbruening commented 3 years ago

This is a proposal to implement an eflags-less hashtable lookup, thus avoiding spilling and restoring eflags, by using BMI2 instructions on processors that support them: SHLX for the shift (though a series of LEAs could also be used); SHLX plus SHRX (or PEXT or PDEP) for the mask (unfortunately BZHI, which is tailor-made for this type of masking, modifies eflags); and LEA plus JECXZ for the cmp. (Actually, the SHRX or PDEP can be merged with the scaling SHLX.) We would then load the table base into a register for a LEA, which may need an extra scratch register compared to the add-memory we have now. The mask/shift also needs a register (the same one should work), since none of those instructions takes an immediate. So we would only be saving the flags operations themselves, not the xax spill/restore.
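To make the arithmetic concrete, here is a rough C model of the mask-plus-scale and the compare. It only models the values the instructions would compute. A C compiler is free to pick other instructions, so this is not how the emitted IBL code would look. The function names and the hash_bits/scale parameters are made up for illustration, and computing the difference as NOT followed by LEA is my assumption about how the "LEA plus JECXZ" compare could be arranged.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: mask then scale, the way AND + SHL would compute it
 * (both of those instructions write eflags). */
static uint64_t
entry_offset_and_shl(uint64_t tag, unsigned hash_bits, unsigned scale)
{
    uint64_t mask = (UINT64_C(1) << hash_bits) - 1;
    return (tag & mask) << scale;
}

/* Shift-only variant, modeling SHLX followed by SHRX. The left shift
 * discards everything above the low hash_bits bits; the right shift brings
 * them back down but stops `scale` bits short, so the entry-size scaling is
 * merged into the second shift. Both counts live in registers, as SHLX and
 * SHRX require. */
static uint64_t
entry_offset_shift_only(uint64_t tag, unsigned hash_bits, unsigned scale)
{
    assert(hash_bits >= 1 && hash_bits + scale < 64);
    uint64_t top = tag << (64 - hash_bits);  /* SHLX */
    return top >> (64 - hash_bits - scale);  /* SHRX, scale folded in */
}

/* Compare without a cmp: form tag - entry_tag as tag + ~entry_tag + 1
 * (NOT does not write eflags, and LEA can do base + index + 1), then test
 * the result for zero, which is what JECXZ/JRCXZ branches on. */
static int
tags_match(uint64_t tag, uint64_t entry_tag)
{
    uint64_t diff = tag + ~entry_tag + 1;
    return diff == 0;
}
```

With hash_bits = 10 and scale = 4 (hypothetical values), both offset functions return the same result for any tag, which is the property the merged SHLX/SHRX sequence relies on.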

We can use -unsafe_ignore_eflags to help estimate some of the potential savings, but it should not be hard to implement the final scheme and try to measure the impact directly.

derekbruening commented 3 years ago

While the IBL is never persisted, there are complications with persisted code ib prefixes if the code is re-run on an older processor after being generated on a newer one. The simplest thing is to disable any new scheme if persistence is requested.