Open hgreving2304 opened 5 years ago
The current translation for a PC in a mangling region revolves around re-executing the app instr by undoing all the state changes from the mangling code up to the PC. This breaks down for post-app-instr mangling such as this rip-rel restore, where in general we can't undo the app instr.
IMHO a clean solution would be either: A) identify post-app-instr mangling ("epilogue") and fail (this can only be an asynch xl8); or B) identify the epilogue, ignore all the pre-app-instr state, and do a new walk through the epilogue state, not to undo it like we do w/ pre-state but to emulate it.
PR #3318 does neither A nor B. Instead it undoes using the existing pre-app-instr spill state, limits its use to rip-rel that is not otherwise mangled (b/c the post-instr restore for something like a rip-rel indirect call should be undone), and checks the assumption that the pre spill state matches the post restore state.
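To make the prologue/epilogue distinction concrete, here is a minimal Python sketch of what option B could look like. It is a toy model, not DynamoRIO code: the `Instr` record, the `translate` function, and the instruction kinds are hypothetical names made up for illustration. The cache addresses and the TLS slot come from the rip-rel region reported in this issue, and the 9-byte app instruction length is an assumption derived from the reporter's expected PC.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Instr:
    cache_pc: int               # address of the instruction in the code cache
    kind: str                   # "spill", "mangle", "app", or "restore" (hypothetical labels)
    reg: Optional[str] = None   # register saved or restored, if any
    slot: Optional[int] = None  # TLS slot offset used by a spill or restore


def translate(region: List[Instr], app_pc: int, app_len: int, fault_pc: int,
              regs: Dict[str, int], tls: Dict[int, int]) -> Tuple[int, Dict[str, int]]:
    """Return (app PC, app register state) for a signal arriving at fault_pc."""
    app_idx = [ins.kind for ins in region].index("app")
    fault_idx = [ins.cache_pc for ins in region].index(fault_pc)
    regs = dict(regs)  # do not modify the caller's state
    if fault_idx <= app_idx:
        # Prologue, or the app instr itself: the app instr has not executed.
        # Undo every spill that already ran, then report the app instr's PC.
        for ins in region[:fault_idx]:
            if ins.kind == "spill":
                regs[ins.reg] = tls[ins.slot]
        return app_pc, regs
    # Epilogue: the app instr already executed and cannot be undone.
    # Emulate the restores that have not run yet and report the next app PC.
    for ins in region[fault_idx:]:
        if ins.kind == "restore":
            regs[ins.reg] = tls[ins.slot]
    return app_pc + app_len, regs


# The rip-rel region from this issue.
REGION = [
    Instr(0x00005633d5d2f4b7, "spill", "rax", 0x00),
    Instr(0x00005633d5d2f4c2, "mangle"),
    Instr(0x00005633d5d2f4cc, "app"),
    Instr(0x00005633d5d2f4d1, "restore", "rax", 0x00),
]
APP_PC = 0x00007fcb6b3b060a
APP_LEN = 9  # assumption: 0x...0613 - 0x...060a, per the reporter's expected PC
```

In this model a signal on the restore at 0x00005633d5d2f4d1 translates to `APP_PC + APP_LEN`, with %rax taken from the spill slot, while a signal on or before the app instr translates to `APP_PC`. Option A would replace the epilogue branch with a failure.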
What a rip-rel indirect call looks like:
interp: start_pc = 0x00007ffd17f57ac3
check_thread_vm_area: pc = 0x00007ffd17f57ac3
prepend_entry_to_fraglist: putting fragment @0x00007ffd17f57ac3 (shared) on vmarea 0x00007ffd17ef1000-0x00007ffd18008000
0x00007ffd17f57ac3 4c 8b c2 mov %rdx -> %r8
0x00007ffd17f57ac6 48 8b d1 mov %rcx -> %rdx
0x00007ffd17f57ac9 33 c9 xor %ecx %ecx -> %ecx
wrote all 6 flags now!
0x00007ffd17f57acb ff 15 2f 45 11 00 call <rel> 0x00007ffd1806c000[8byte] %rsp -> %rsp 0xfffffff8(%rsp)[8byte]
mbr exit target = 0x00007ff5ab2a39c0
end_pc = 0x00007ffd17f57ad1
-----------------
It's a rip-rel call. You can probably repro it by writing one in asm.
-------------------
recreate_app : pc is in F2(0x00007ffd17f57ac3)
ilist for recreation:
TAG 0x00007ffd17f57ac3
+0 L3 4c 8b c2 mov %rdx -> %r8
+3 L3 48 8b d1 mov %rcx -> %rdx
+6 L3 33 c9 xor %ecx %ecx -> %ecx
+8 m4 @0x00007ff5ab325530 65 48 a3 20 16 00 00 mov %rax -> %gs:0x00001620[8byte]
00 00 00 00
+19 m4 @0x00007ff5ab326d18 48 b8 00 c0 06 18 fd mov $0x00007ffd1806c000 -> %rax
7f 00 00
+29 m4 @0x00007ff5ab326898 65 48 89 0c 25 30 16 mov %rcx -> %gs:0x00001630[8byte]
00 00
+38 L3 48 8b 08 mov (%rax)[8byte] -> %rcx
+41 m4 @0x00007ff5ab325d40 65 48 a1 20 16 00 00 mov %gs:0x00001620[8byte] -> %rax
00 00 00 00
+52 m4 @0x00007ff5ab3269d0 68 d1 7a f5 17 push $0x17f57ad1 %rsp -> %rsp 0xfffffff8(%rsp)[8byte]
+57 m4 @0x00007ff5ab325260 c7 44 24 04 fd 7f 00 mov $0x00007ffd -> 0x04(%rsp)[4byte]
00
+65 L4 @0x00007ff5ab326550 e9 9b 57 f8 ff jmp $0x00007ff5ab2a39c0 <shared_bb_ibl_indcall>
END 0x00007ffd17f57ac3
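As a side note on the ilist above, the `push $0x17f57ad1` at +52 followed by the 4-byte store of `$0x00007ffd` to `0x04(%rsp)` at +57 together materialize the 64-bit return address 0x00007ffd17f57ad1 (the `end_pc` in the earlier log). The following Python sketch only checks that arithmetic. The function names are hypothetical; the only hardware fact it relies on is that a 64-bit `push imm32` sign-extends its immediate.

```python
def split_return_address(ret_addr: int):
    """Split a 64-bit return address into (low 32 bits, high 32 bits)."""
    return ret_addr & 0xFFFFFFFF, ret_addr >> 32


def emulate_push_then_store(lo: int, hi: int) -> int:
    """Model 'push imm32' followed by a 4-byte store of hi at 0x04(%rsp)."""
    # push imm32 sign-extends the immediate to 64 bits.
    pushed = lo if lo < 0x80000000 else lo | 0xFFFFFFFF00000000
    # The 4-byte store overwrites the upper half of the pushed slot.
    return (pushed & 0xFFFFFFFF) | (hi << 32)


RET_ADDR = 0x00007ffd17f57ad1  # end_pc from the log above
LO, HI = split_return_address(RET_ADDR)
```

Here `LO` and `HI` match the two immediates in the ilist, and `emulate_push_then_store(LO, HI)` gives back `RET_ADDR`.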
I am seeing a wrong xl8 PC when an asynchronous signal hits in a mangled region but after the app's mangled instruction. In my case, this always happens on a restore that is part of the mangled region.
In my case, this is a rip-rel mangled region:
0x00005633d5d2f4b7 65 48 a3 00 00 00 00 mov %rax -> %gs:0x00[8byte]
00 00 00 00
0x00005633d5d2f4c2 48 b8 d0 d9 41 6b cb mov $0x00007fcb6b41d9d0 -> %rax
7f 00 00
0x00005633d5d2f4cc c4 62 e9 9d 00 vfnmadd132sd %xmm2[8byte] (%rax)[8byte] %xmm8[8byte] -> %xmm8[8byte]
0x00005633d5d2f4d1 65 48 a1 00 00 00 00 mov %gs:0x00[8byte] -> %rax
00 00 00 00
0x00005633d5d2f4dc 65 48 8b 0c 25 b0 00 mov %gs:0xb0[8byte] -> %rcx
The signal hits on 0x00005633d5d2f4d1, which is the restore after the app instr. The vfnmadd132sd in the app is at 0x00007fcb6b3b060a. After translating the app's state, the PC is 0x00007fcb6b3b060a, but I think it should be 0x00007fcb6b3b0613, the PC of the next app instr.
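The expected PC can be cross-checked with a little arithmetic, sketched below in Python. The app's original bytes are not shown in this issue, so the 9-byte length is an assumption: it takes the 5-byte mangled form (`c4 62 e9 9d 00`, with a plain `(%rax)` operand) and adds the 4-byte displacement that a rip-rel operand needs. The helper name `riprel_disp32` is hypothetical.

```python
APP_PC = 0x00007fcb6b3b060a   # app address of the vfnmadd132sd
TARGET = 0x00007fcb6b41d9d0   # absolute address loaded into %rax by the mangling

MANGLED_LEN = 5               # c4 62 e9 9d 00: 3-byte VEX prefix + opcode + ModRM
RIPREL_LEN = MANGLED_LEN + 4  # assumption: the rip-rel form adds a disp32

# PC of the next app instr, which is what the translation should report
# when the signal arrives after the app instr has executed.
NEXT_PC = APP_PC + RIPREL_LEN


def riprel_disp32(next_pc: int, target: int) -> int:
    """Displacement a rip-rel operand would encode: target minus next-instr PC."""
    disp = target - next_pc
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is not reachable with a 32-bit displacement")
    return disp
```

With these values `NEXT_PC` comes out as 0x00007fcb6b3b0613, matching the PC expected above, and the target lies within rip-rel reach of it.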
This is likely the reason for another crash I am observing (xref #2941).