NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

x86: Zero upper bits of 64-bit registers on repeat/loop/string instructions #6600

Open Sleigh-InSPECtor opened 4 months ago

Sleigh-InSPECtor commented 4 months ago

Several instructions with loop semantics are missing check_Reg operations when 32-bit registers are modified in long mode. This PR adds the appropriate check_Reg in every place where a 32-bit register is modified. (Several commits have been merged into a single PR because the fix is essentially the same for each issue.)

For string operations, the upper 32 bits of RDI and RSI need to be zeroed when EDI and ESI are updated under the address size prefix (67):
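
As a rough illustration of the expected behavior (a Python model, not the Sleigh change itself), the sketch below steps one `MOVSB` with the 67 prefix. The helper name `movsb_addr32`, the register dictionary, and the flat byte-addressed `mem` dictionary are all assumptions made for this example; segment bases are ignored.

```python
MASK32 = 0xFFFF_FFFF

def movsb_addr32(regs, mem, df=0):
    """Hypothetical model of one MOVSB with the 67 prefix in long mode."""
    # With the address size prefix, only the low 32 bits form the addresses.
    esi = regs["RSI"] & MASK32
    edi = regs["RDI"] & MASK32
    mem[edi] = mem.get(esi, 0)  # copy one byte
    step = -1 if df else 1
    # Writing ESI/EDI zero-extends the result into RSI/RDI.
    regs["RSI"] = (esi + step) & MASK32
    regs["RDI"] = (edi + step) & MASK32
    return regs

regs = {"RSI": 0xDEADBEEF_00001000, "RDI": 0xCAFEBABE_00002000}
mem = {0x1000: 0x41}
movsb_addr32(regs, mem)
```

In this model the stale upper halves (`0xDEADBEEF`, `0xCAFEBABE`) are gone after the instruction, which is the effect the added check_Reg is meant to produce.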

For REP instructions, ECX is used as the counter when the address size prefix (67) is set, so the upper 32 bits of RCX need to be zeroed:
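
A minimal sketch of the counter handling, again as a Python model rather than Sleigh. `rep_stosb_addr32` and the register/memory dictionaries are hypothetical names for this example, DF is assumed to be 0, and the ECX=0 case follows the SDM pseudocode (no register is written when the count is already zero).

```python
MASK32 = 0xFFFF_FFFF

def rep_stosb_addr32(regs, mem):
    """Hypothetical model of REP STOSB with the 67 prefix in long mode (DF=0)."""
    ecx = regs["RCX"] & MASK32
    edi = regs["RDI"] & MASK32
    al = regs["RAX"] & 0xFF
    while ecx != 0:
        mem[edi] = al
        edi = (edi + 1) & MASK32
        ecx -= 1
        # Each 32-bit write zero-extends into the full 64-bit register.
        regs["RDI"] = edi
        regs["RCX"] = ecx
    return regs

regs = {"RAX": 0x5A, "RCX": 0xFFFFFFFF_00000003, "RDI": 0x11111111_00004000}
mem = {}
rep_stosb_addr32(regs, mem)
```

Only the low 32 bits of RCX decide the iteration count here, so three bytes are stored even though the full 64-bit value of RCX is very large.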

Essentially identical differences apply to all string instructions with repeat ops, e.g.:

A similar change is needed when the 67 prefix is present on LOOP instructions:

The PR also covers the conditional LOOP instructions:
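
To make the LOOP case concrete, here is a small Python model of `LOOP`, `LOOPE` and `LOOPNE` with the 67 prefix. The function `loop_addr32` and its `kind`/`zf` parameters are assumptions for illustration only; the function returns whether the branch would be taken.

```python
MASK32 = 0xFFFF_FFFF

def loop_addr32(regs, kind="LOOP", zf=0):
    """Hypothetical model of LOOP/LOOPE/LOOPNE with the 67 prefix in long mode."""
    # ECX is always decremented, so RCX is always written (zero-extended).
    ecx = ((regs["RCX"] & MASK32) - 1) & MASK32
    regs["RCX"] = ecx
    if kind == "LOOPE":
        return ecx != 0 and zf == 1
    if kind == "LOOPNE":
        return ecx != 0 and zf == 0
    return ecx != 0

regs = {"RCX": 0x12345678_00000002}
taken_first = loop_addr32(regs)                        # ECX goes 2 -> 1
rcx_after_first = regs["RCX"]
taken_second = loop_addr32(regs, kind="LOOPE", zf=1)   # ECX goes 1 -> 0
```

Because the decrement happens unconditionally, the upper half of RCX is cleared on the first execution regardless of whether the branch is taken.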

The LODSD instruction (no prefix required) writes to EAX and also requires a fix:
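
A hedged sketch of the LODSD case in Python. `lodsd` and the dictionaries are hypothetical names for this example; no address size prefix is modeled, so the full RSI is used as the address, and memory is a flat byte map with missing bytes read as zero.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def lodsd(regs, mem, df=0):
    """Hypothetical model of LODSD in long mode without a 67 prefix."""
    rsi = regs["RSI"]
    # Read four little-endian bytes starting at RSI.
    value = int.from_bytes(bytes(mem.get(rsi + i, 0) for i in range(4)), "little")
    # The load targets EAX, so the upper 32 bits of RAX are zeroed.
    regs["RAX"] = value & MASK32
    regs["RSI"] = (rsi + (-4 if df else 4)) & MASK64
    return regs

regs = {"RAX": 0xFFFFFFFF_FFFFFFFF, "RSI": 0x7FFF_0000_1000}
mem = {0x7FFF_0000_1000 + i: b for i, b in enumerate(b"\x78\x56\x34\x12")}
lodsd(regs, mem)
```

Here RSI keeps its full 64-bit value, since only the EAX destination is a 32-bit write.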


After the PR, there is still a minor discrepancy with Intel CPUs. Even when ECX=0 (no loop iterations), the test Intel CPU zeroes the upper bits of RAX. AMD's behavior makes more logical sense: for other instructions in long mode, the upper bits are only zeroed when the register is used as the destination of an operation, and according to the pseudocode listed for REP instructions† RCX is not written to if it is zero. However, this is what is observed when running the instruction on real hardware:

†Intel SDM Vol.2B "REP/REPE/REPZ/REPNE/REPNZ—Repeat String Operation Prefix"