DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform

tracing tools should not include loads/stores skipped due to x86 conditional execution or synch primitives #2577

Open derekbruening opened 7 years ago

derekbruening commented 7 years ago

Today, drmemtrace (inside drcachesim) predicates its data-trace instrumentation for app loads and stores that are predicated on ARM, but it does not do so on x86 for instructions like OP_cmovcc and OP_bs{r,f}. This issue covers addressing that in drmemtrace, as well as in the sample tracing clients.

Other x86 "predicated" loads/stores will be more complex to handle: OP_getsec, OP_xend, OP_vpmaskmov{d,q}, OP_vmaskmovp{s,d}, OP_{,v}maskmovdqu.

We can ignore OP_fcmovcc as it does not touch memory.
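
As a rough sketch of what such a check could look like in a client (a hypothetical helper, not drmemtrace's actual code; it assumes the decode table attaches a DR_PRED_* predicate to opcodes such as OP_cmovcc on x86, as it does for predicated ARM instructions):

#include "dr_api.h"

/* Hypothetical helper (not drmemtrace's code): decide whether an
 * instruction's memory operands need predicated instrumentation. */
static bool
memref_needs_predication(instr_t *instr)
{
    /* Only instructions that touch memory matter for the data trace. */
    if (!instr_reads_memory(instr) && !instr_writes_memory(instr))
        return false;
    /* Assumes the table marks conditionally executed opcodes (e.g.,
     * OP_cmovcc on x86) with a DR_PRED_* predicate. */
    return instr_is_predicated(instr);
}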

derekbruening commented 7 years ago

More instructions to handle (and these need to have the pred flag added to the table): OP_cmpxchg{,8b}

derekbruening commented 7 years ago

My notes on cmpxchg:

*** TODO what about OP_cmpxchg*?!

Either loads or stores.  Looks like the table models this as *both* a load
and a store: so we did not consider it when we removed the other
conditional stuff.  To handle: want to check whether xax changed afterward,
so need post-instru.

But for offline, how does raw2trace know whether it was the load or the
store?  There will just be one address.  Have to include extra info?

ARM exclusive monitor operations have similar issues and we'll consider them under this issue as well.
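
A minimal sketch of the "post-instru" idea from the notes, written as a clean call rather than drmemtrace's real inlined instrumentation: x86 cmpxchg sets ZF exactly when it stored the source into the destination, so reading the flags right after the instruction reveals which path ran (checking whether xax changed, as the notes suggest, works too but requires saving its prior value). The helper name is hypothetical:

#include "dr_api.h"

/* Hypothetical post-cmpxchg clean call: records whether the exchange
 * happened by inspecting ZF, which cmpxchg sets iff it wrote the source
 * operand into the destination. */
static void
record_cmpxchg_outcome(void)
{
    dr_mcontext_t mc = { sizeof(mc), DR_MC_CONTROL };
    dr_get_mcontext(dr_get_current_drcontext(), &mc);
    bool exchanged = (mc.xflags & 0x40 /* EFLAGS.ZF */) != 0;
    dr_fprintf(STDERR, "cmpxchg %s the new value\n",
               exchanged ? "stored" : "did not store");
}

The call would be inserted with dr_insert_clean_call() at the instruction following the cmpxchg; for offline traces, the equivalent information would have to be carried as extra data so that raw2trace can tell the load and the store apart.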

derekbruening commented 7 years ago

Xref #2584

derekbruening commented 7 years ago

We should consider how to automate whatever solution we come up with for the complex predicates -- and for the simple ones as well since the #1723 auto-predication feature is ARM-only.
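
For reference, the ARM-side automation is presumably the auto-predication API; under that assumption, this is roughly the idiom (hypothetical helper name), where instrumentation inserted between the two calls inherits the app instruction's predicate and is skipped whenever the app instruction is:

#include "dr_api.h"

/* Sketch of the ARM-only auto-predication idiom; x86 has no equivalent,
 * which is what would need to be automated for the opcodes above. */
static void
insert_instru_under_app_predicate(instrlist_t *bb, instr_t *where)
{
    instrlist_set_auto_predicate(bb, instr_get_predicate(where));
    /* ... insert address-gathering instrumentation before 'where';
     * it carries the same predicate as the app instruction ... */
    instrlist_set_auto_predicate(bb, DR_PRED_NONE);
}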

abhinav92003 commented 4 years ago

It's relevant to note that some memory accesses issued by cmovcc and cmpxchg are unconditional. The following Intel ISA references and code snippets demonstrate this behaviour.

For cmovcc: the pseudocode at [1] indicates that the source is unconditionally loaded from memory. This is demonstrated by the following snippet, where cmovne faults on an invalid source address even though ZF is set, i.e., even though the condition is false and no register write should occur.

.text
_start:
    mov      $0, %rax
    cmp      $0, %rax        # sets ZF, so the cmovne condition is false
    cmovne   (%rax), %rbx    # faults anyway: the load from (%rax) is unconditional

Program received signal SIGSEGV, Segmentation fault.
_start () at as.s:7
7       cmovne    (%rax), %rbx
(gdb) i r
...
eflags         0x10246             [ PF ZF IF RF ]

For cmpxchg: the pseudocode at [2] shows that the destination is written with different values depending on the comparison, but it is written in either case. This is demonstrated by the following snippet, where cmpxchg faults on a read-only destination even though %eax != (dest), i.e., even though no exchange should take place.

.text
_start: nop
    lea      ., %ebx         # %ebx points into the read-only .text section
    mov      $1, %eax        # %eax != (%ebx), so the exchange should not happen
    cmpxchg  %ecx, (%ebx)    # faults anyway: the destination is written either way

Program received signal SIGSEGV, Segmentation fault.
_start () at ad.s:10
10      cmpxchg  %ecx,(%ebx)
(gdb) i r
rax            0x1                 1
rbx            0x401001            4198401
...
(gdb) x/1dw 0x401001 
0x401001 <_start+1>:    19209357

In both of these cases, it is more accurate for the trace to keep these memory references, since they are issued unconditionally.

[1]: Intel Instruction Set Reference, cmovcc, Vol. 2A 3-151, https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf
[2]: Intel Instruction Set Reference, cmpxchg, Vol. 2A 3-181, https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf

abhinav92003 commented 4 years ago

About OP_bs{r,f}: the source operand (which may be a register or a memory location) is unconditionally read, and the destination is written only if the source is non-zero. However, the destination is always a register [1]. So in this case as well, there are no conditional memory accesses.

[1]: Intel Instruction Set Reference, Vol. 2A 3-108 and 3-110, https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf

abhinav92003 commented 4 years ago

About OP_vpmaskmov: it does have conditional loads and stores [1]. When the mask is all zeroes, no memory operation is performed and there is no crash, even though %ebx holds an invalid address in the snippets below. So this will need to be handled appropriately.

.text
_start:
    xorps      %xmm1, %xmm1          # mask is all zeroes
    mov        $0, %ebx              # invalid address
    vpmaskmovq (%ebx), %xmm1, %xmm0  # load: zero mask, so no memory access and no fault

.text
_start:
    xorps      %xmm1, %xmm1          # mask is all zeroes
    mov        $0, %ebx              # invalid address
    vpmaskmovq %xmm0, %xmm1, (%ebx)  # store: zero mask, so no memory access and no fault
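
A client that wants to special-case this family could start with a simple opcode filter; this sketch assumes DynamoRIO's usual OP_<mnemonic> enumerators, and the helper name is hypothetical:

#include "dr_api.h"

/* Hypothetical filter for the masked-move opcodes discussed in this issue. */
static bool
is_masked_mov(instr_t *instr)
{
    switch (instr_get_opcode(instr)) {
    case OP_vpmaskmovd:
    case OP_vpmaskmovq:
    case OP_vmaskmovps:
    case OP_vmaskmovpd:
    case OP_maskmovdqu:
    case OP_vmaskmovdqu: return true;
    default: return false;
    }
}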

[1]: Intel Instruction Set Reference, Vol. 2C 5-399 https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf

derekbruening commented 2 years ago

We have a separate issue on masked moves: #5197. Since vpmaskmov is the only thing left here, let's close this and let the more targeted #5197 cover it and the other masked-move opcodes.

derekbruening commented 2 years ago

Actually there are still OP_getsec and OP_xend here.