DynamoRIO / drmemory

Memory Debugger for Windows, Linux, Mac, and Android

shrink jmp-to-slowpath sequence #166

Open derekbruening opened 9 years ago

derekbruening commented 9 years ago

From derek.br...@gmail.com on December 10, 2010 17:58:11

PR 494769

To shrink the jmp-to-slowpath sequence:

Don't store the return pc; derive it from the app pc instead. Share a single jmp-to-slowpath for the whole bb that stores the start pc of the fragment, and have the slowpath decode forward until it finds the mov-immed that matches the app pc. The return pc is then the address right after the jmp that follows that mov-immed. However, individual sites can't use custom regs to hold the app pc, since the shared jmp-to-slowpath needs to use just one reg, so don't do this opt until we also have a whole-bb stolen reg (which we have now: PR 489221).
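A minimal sketch of the decode-forward lookup described above, written as a Python simulation rather than real DynamoRIO code. It assumes a toy fragment containing only four instruction forms (mov $imm32 -> reg, jmp rel32, jmp rel8, nop); the names `insn_length` and `find_return_pc` are hypothetical, and a real slowpath would use a full decoder instead.

```python
import struct

MOV_IMM32 = 0xC7  # mov $imm32 -> reg, followed by ModRM 0xC0 | reg
JMP_REL32 = 0xE9  # 5-byte near jmp
JMP_REL8 = 0xEB   # 2-byte short jmp
NOP = 0x90        # stand-in for other instrumentation in this toy model


def insn_length(code, off):
    """Return the length of the instruction at off (toy decoder, 4 forms only)."""
    op = code[off]
    if op == MOV_IMM32 and off + 1 < len(code) and (code[off + 1] & 0xF8) == 0xC0:
        return 6
    if op == JMP_REL32:
        return 5
    if op == JMP_REL8:
        return 2
    if op == NOP:
        return 1
    raise ValueError("unsupported opcode 0x%02x at offset %d" % (op, off))


def find_return_pc(code, start_pc, app_pc):
    """Decode forward from the fragment start until a mov-immed equals app_pc.

    Returns the address just past the jmp that follows that mov-immed,
    or None if no mov-immed in the fragment matches.
    """
    off = 0
    while off < len(code):
        length = insn_length(code, off)
        if code[off] == MOV_IMM32:
            (imm,) = struct.unpack_from("<I", code, off + 2)
            if imm == app_pc:
                jmp_off = off + length
                if jmp_off >= len(code) or code[jmp_off] not in (JMP_REL32, JMP_REL8):
                    raise ValueError("expected a jmp after the matching mov-immed")
                return start_pc + jmp_off + insn_length(code, jmp_off)
        off += length
    return None
```

In this model the only per-site state is the app pc immediate, so the 6-byte mov of the return pc disappears from every jmp-to-slowpath site, at the cost of a linear decode on each slowpath entry.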

Try to use a jmp-short to the shared code. If it can't reach, use a relay spot or duplicate the shared code. This should save space if there are > 3 jmp-slowpaths, which should happen if there are > 128 bytes of code.

Additionally: don't store the app pc; instead store its offset from the tag (=> 2-byte mov_imm). To find the tag, either have the 1st slowpath entry store the full app pc (== tag) (but then how do we tell it apart from an offset, since the offset must go into an 8-bit sub-reg?), or have the shared per-bb slowpath entry store it into TLS.
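As a rough illustration of the offset idea, the sketch below models the second option (the shared per-bb entry stores the tag in TLS). It assumes the offset is loaded with the 2-byte x86 form `mov $imm8 -> %cl` (opcode 0xb1) and that the TLS slot is a plain dictionary entry named `bb_tag`; both function names and the slot name are hypothetical.

```python
MOV_IMM8_TO_CL = 0xB1  # x86 encoding: mov $imm8 -> %cl (2 bytes total)


def encode_app_pc_offset(tag, app_pc):
    """Emit the 2-byte mov that records app_pc as an offset from the bb tag."""
    offs = app_pc - tag
    if not 0 <= offs <= 0xFF:
        # The offset must fit the 8-bit sub-register; larger bbs need a fallback.
        raise ValueError("offset 0x%x does not fit in 8 bits" % offs)
    return bytes([MOV_IMM8_TO_CL, offs])


def recover_app_pc(tls, cl_value):
    """Slowpath side: rebuild the app pc from the TLS-stored tag plus %cl."""
    return tls["bb_tag"] + (cl_value & 0xFF)
```

Compared with the 6-byte `mov $imm32` in the current sequence, each site would spend 2 bytes on the app pc, provided every instrumented instruction lies within 255 bytes of the tag.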

Alternative: store the tag instead of the cache start pc, to support traces. We would then need to get the start pc from the tag via DR (we'd have to add an API routine, which would disrupt our philosophy of hiding the cache).

alternative ideas:

PR 494769: shrink jmp-to-slowpath sequence, part 1

Server: perforce-panda.eng.vmware.com:1985

PR 494769: shrink jmp-to-slowpath sequence, Part 2

Not yet on by default but just about all there.

Goal is to shrink this:

  0x1f845076  c7 c1 b0 f4 a4 00  mov    $0x00a4f4b0 -> %ecx
  0x1f84507c  c7 c2 87 50 84 1f  mov    $0x1f845087 -> %edx
  0x1f845082  e9 72 50 12 00     jmp    $0x1f96a0f9

Optimization #1 (simple):

Optimization #2 (complex):

Approach:

List of changes:

Not enabled by default because of this hole in the implementation: to handle selfmod and other situations where I can't predict for sure whether an app instr will remain unmangled, I need to add DR support. My plan is to implement issue #156/PR 306163 and add post-mangling events for both bbs and traces.

Original issue: http://code.google.com/p/drmemory/issues/detail?id=166

derekbruening commented 9 years ago

From bruen...@google.com on March 31, 2011 10:25:57

Other alternative ideas from my notes: