DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform
Other
2.61k stars 554 forks source link

64-bit Windows only preserves xmm0-5 but does not document this clearly #6332

Open yangyixiaof opened 11 months ago

yangyixiaof commented 11 months ago

Describe the bug For latestest version 10.0.0 and older version 9.0.0.1, I find the same bugs. Inside a clean call, when fetching values of SIMD or avx512 registers greater than xmm8 (e.g. xmm10), the value is obviously wrong, but for xmm1-xmm8, the value is right.

Here is the evidence. When I insert clean call before and after paddd instruction, I print values of each register before and after the execution of instruction. I do set the save_fp_state flag when insert that clean call.

Here is the wrong case for a trace. paddd xmm0,xmm11 before execution: xmm0:00005c0000805c0000005d0000805d00 xmm11:010000000000000048c139a4b9010000 after execution: xmm0:ff7fdc00ffffdc00ff7fdd00ffffdd00 the value of xmm11 is obviously wrong.

Here is the right case for same trace. paddd xmm2,xmm3 before execution: xmm2:bc000000bc000000bc000000bc000000 xmm3:00000000010000000200000003000000 after execution: xmm2:bc000000bd000000be000000bf000000 this execution is right!

Note that the two cases happen in same trace! After detailed check, the wrong reason is the dynamoRIO dr_mc_context cannot get the values of registers with id greater than xmm8!

The OS is latest Win10 at 2023-9-27, kernel version is 19045.3448, CPU is i5-8700. Please fix this bug. This seems urgent as it will cause wrong results for taint analysis or concolic execution tools based on DynamoRIO.

We provide the tested file and its input (click here to download) you can just write a simple client to monitor any instruction which uses xmm10 or xmm11 registers such as paddd or pslld. In that link, test_jpeg.exe depends on jpeg62.dll in the same directory, jpeg62.dll is also put in the link. you should put jpeg62.dll in the same directory as test_jpeg.exe and run command like 'drrun.exe -c your_client.dll -- test_jpeg.exe input.jpg'.

yangyixiaof commented 11 months ago

I answer my own question, it seems the implementation of dr_mc_context has limitations. Before inserting a clean call, DynamoRIO uses fxsave instruction to store mm, xmm registers, on some platforms, fxsave only stores xmm0-xmm7. It seems it is invalid to fetch ymm or zmm in a clean call through dr_mc_context, as fxsave instruction does not store ymm or zmm at all. Am I right? Correct me if I am wrong. I think I should insert movdqu instruction to fetch values of xmm, ymm or zmm registers. Besides, I want to know, how dynamoRIO stores common registers (eax, rax) onto stack? Where (which source file) can I find the related code? I think the documentation should specifically point out that the save_fp_state can only fetch xmm0-xmm7 registers and cannot fetch ymm or zmm registers if what I guess is right.

derekbruening commented 11 months ago

Generally, if you are not sure it's a bug, please start by asking on the users list https://groups.google.com/forum/#!forum/DynamoRIO-Users for questions rather than filing an issue, as it will reach a wider audience of people who might have an answer, and it will reach other users who may find the information beneficial.

It looks like DR has always preserved only xmm0-5 for 64-bit Windows (but all xmm registers for other platforms): https://github.com/DynamoRIO/dynamorio/blob/master/core/lib/globals_api.h#L734 No, this is has nothing to do with fxsave: probably it came from the ABI caller-saved register conventions (the app is not trusted, but compiled DR and client code would be trusted to obey the ABI). This is surprising: I would have thought that all platforms would preserve all registers at this point. It also does not seem to be very well documented. Let's leave this issue open to cover considering changing what's preserved or at the least better documented the current behavior.