rkgithubs opened this issue 5 years ago
The attached files contain only the global log file for each of 2 separate processes (not sure why 2 were posted, with different naming schemes?); all the thread logfiles are missing.
Both global logs do not show any kind of DR-detected error or DR-aware process exit: did the process exit? If so, what was the exit code? Why did it exit?
The output above just ends in the "get_memory_info mismatch" line -- what happens after that? Is it still running? Please describe the full symptoms as it is not clear what is happening.
Although memory utilization is roughly 11GB, which is quite high for sparse.small, I don't see much memory at all being used before the end of the log file, based on the reduction in free blocks inside DR's vmm:
$ for i in vmcode vmheap; do grep "^vmm_heap_reserve_blocks.*${i}.*free" java.0.59563.txt | (head -n 1; echo "..."; tail -n 1); done
vmm_heap_reserve_blocks vmcode: size=32768 => 32768 in blocks=8 free_blocks=262144
...
vmm_heap_reserve_blocks vmcode: size=65536 => 65536 in blocks=16 free_blocks=259685
vmm_heap_reserve_blocks vmheap: size=32768 => 32768 in blocks=8 free_blocks=131072
...
vmm_heap_reserve_blocks vmheap: size=262144 => 262144 in blocks=64 free_blocks=125399
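As a rough cross-check of the free-block deltas above (assuming one vmm block is 4 KiB, as implied by "size=32768 => 32768 in blocks=8"), the vmm consumption over the whole log works out to only a few tens of MB:

```python
# Rough arithmetic from the first/last free_blocks values above.
# Assumption: one vmm block is 4 KiB (32768 bytes => 8 blocks).
BLOCK = 4096

vmcode_used = (262144 - 259685) * BLOCK  # ~9.6 MiB
vmheap_used = (131072 - 125399) * BLOCK  # ~22.2 MiB

print(vmcode_used, vmheap_used)
```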
Xref #2989, #3586, #2506
The two log files are for two different runs. I can look for the thread log files and send them.
The application stops after 60 sec (I used -s 60). Java PID => 65189. I attached all log files in the zipped folder. I tried to grep for errors but there are no error messages. I stopped the app after 60 sec because it runs for a long time and never starts the app's warm-up phase (mentioned in the previous comment). java.65189.00000000.zip
The output above just ends in the "get_memory_info mismatch" line -- this is the output for the first 60 sec. There is no output on the console for another few minutes.
-loglevel 3 is expected to be slow. I believe your 60-second timeout killed the debug run before it hit any error. The process logs look normal: no errors, just truncated. The debug run needs a longer timeout.
java.65189.00000000.zip contains an empty directory.
If it seems hung, the typical approach is to attach a debugger and get callstacks.
The log directory tarball is ~250MB. Please suggest how to share it.
Hi. We are trying to investigate crashes inside the JVM under DynamoRIO. https://groups.google.com/u/0/g/dynamorio-users/c/hSgenyAM5gM
The crashes reproduce without clients too.
To simplify the reproducer I used the helloworld benchmark with minimal actions and with trace building disabled:
.build/bin64/drrun -disable_traces -- java -jar SPECjvm2008.jar -ikv -i 1 -it 1 -wt 0 -ict helloworld
A fatal error has been detected by the Java Runtime Environment:
Internal Error (sharedRuntime.cpp:553), pid=159762, tid=0x00007fe02331e700
guarantee(cb != NULL && cb->is_nmethod()) failed: safepoint polling: pc must refer to an nmethod
JRE version: OpenJDK Runtime Environment (8.0_292-b10) (build 1.8.0_292-8u292-b10-0ubuntu1~20.04-b10)
Java VM: OpenJDK 64-Bit Server VM (25.292-b10 mixed mode linux-amd64 compressed oops)
Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
An error report file with more information is saved as:
/home/huawei/workspace/SPECjvm2008/build/release/SPECjvm2008/hs_err_pid159762.log
If you would like to submit a bug report, please visit:
http://bugreport.java.com/bugreport/crash.jsp
Aborted (core dumped)
I also tried to reuse the arrays example from #2989. It is more stable but crashes too (1 in 5 runs).
.build/bin64/drrun -disable_traces -- java arrays
A fatal error has been detected by the Java Runtime Environment:
SIGSEGV (0xb) at pc=0x00007fdc6176f062, pid=159861, tid=0x00007fd9f7dfd700
JRE version: OpenJDK Runtime Environment (8.0_292-b10) (build 1.8.0_292-8u292-b10-0ubuntu1~20.04-b10)
Java VM: OpenJDK 64-Bit Server VM (25.292-b10 mixed mode linux-amd64 compressed oops)
Problematic frame:
V [libjvm.so+0x2ed062]
Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
An error report file with more information is saved as:
/home/huawei/workspace/SPECjvm2008/_my_examples/hs_err_pid159861.log
[error occurred during error reporting , id 0xb]
If you would like to submit a bug report, please visit:
http://bugreport.java.com/bugreport/crash.jsp
[thread 140584498194176 also had an error]
Aborted (core dumped)
Could you tell me whether there are techniques to minimize DynamoRIO's optimizations (like disabling tracing; maybe anything else)? How can I collect logs while keeping the instrumentation alive? Could you share some debugging tricks here?
Thanks, Kirill
I would try (prob all at once):
-no_hw_cache_consistency
-no_sandbox_writes
-no_enable_reset
See https://dynamorio.org/page_debugging.html. Use debug build and logging https://dynamorio.org/page_logging.html to help see what is going on. Try to diff a passing and crashing arrays run using logging: if app is deterministic enough can do direct control flow comparison and then machine state comparison at branch divergence points.
Hi, Derek. BTW, what is 'sc' from the debugging page? How can I get it in gdb when debugging a core dump?
(gdb) info symbol **(sc->ebp+4)
get_memory_info + 647 in section .text
(gdb) info symbol **(**(sc->ebp)+4)
check_thread_vm_area + 7012 in section .text
(gdb) info symbol **(**(**(sc->ebp))+4)
check_new_page_start + 79 in section .text
(gdb) info symbol **(**(**(**(sc->ebp)))+4)
build_bb_ilist + 514 in section .text
Thanks, Kirill
BTW, what is 'sc' from the debugging page? How can I get it in gdb when debugging a core dump?
A local variable holding a signal context in DR's signal handler or functions it calls. This would only be relevant if inside the signal handling code and wanting to look at the interrupted context.
Hi. We have got a stable reproducer on the compress benchmark (SPECjvm2008). The process consistently crashes on a load/store instruction referencing memory in the first page (<0x1000). We are looking into the logs generated by a command like this:
../drio_log/DynamoRIO-Linux-8.0.18611/bin64/drrun \
-debug -disable_traces \
-private_trace_ibl_targets_init 18 -private_bb_ibl_targets_init 16 \
-shared_ibt_table_bb_init 18 -shared_ibt_table_trace_init 18 \
-logmask 0x637fe -loglevel 4 -syntax_att -- \
/opt/openjdk8u/jvm/openjdk-1.8.0-internal-debug/bin/java \
-XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation -XX:+PrintAssembly Main > drio_log.txt
after pausing the process just before the crash (we inserted a dr_sleep() call in the DynamoRIO code right after the log string is printed) and attaching to the process with the hsdb debugger, so that JVM-related heap addresses are annotated with JVM-internal information.
Could you share some BKMs or ideas about how to unwind execution back from the faulting instruction to find the register-context divergence point, and then hypothesize about possible reasons for the divergence?
It seems we have gained control over DynamoRIO's interpretation and translation at runtime, so the next step is to learn how to extract the failure reasons from the logs and the memory of the JVM process using hsdb.
Thanks, Alexei
Thanks for the advice. We will check through the list; I appreciate it very much. Could you please also clarify the term "mangling" in the DynamoRIO-specific context? What does it mean?
Could you please also clarify the term "mangling" in the DynamoRIO-specific context?
Code transformations performed by DR itself (as opposed to a tool/client) to maintain control are called "mangling". They occur last just prior to emitting into the code cache.
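A purely illustrative toy of the idea (hypothetical names; not DR's actual code or API, which operates on DR's IR at emit time): a direct call cannot be emitted into the cache verbatim, because control must come back to the runtime's dispatcher, so it gets rewritten while preserving the app-visible stack effect:

```python
# Toy "mangling" of a direct call (illustrative only).
def mangle_call(app_pc, call_len, target):
    next_pc = app_pc + call_len  # the return address the app expects
    return [
        f"push {hex(next_pc)}",          # keep the app-visible stack effect
        f"jmp dispatch_{hex(target)}",   # route control via the dispatcher
    ]

print(mangle_call(0x1000, 5, 0x2000))
```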
ok. thanks!
As mentioned, the best case is to have one successful run to that point to compare against, with no non-determinism. Then the first branch divergence point is found, and then the machine state dumped at -loglevel 4 at each cache entry can be compared backward from that point.
Having the final mangled in-cache instruction list plus the machine state (-loglevel 4), a bad register value can be traced backward manually through the logs with some success, but indirect branches, linked sequences, and values passed through memory all add complications.
If there is no true code modification (i.e., the JIT is always appending and never re-writing memory that held one instance of code with a new instance), I would try the options from above that disable the cache consistency/selfmod handling, to try to isolate which aspect of DR's handling has the problem.
I would look at code marked as "selfmod" (writable), which has a lot of instrumentation added to detect true self-modifying code: there could be bugs lurking in there. But if -no_sandbox_writes made no difference -- I don't remember if that disables all the selfmod instrumentation or just the post-write stuff.
Other than cache consistency, other possible areas where bugs could lurk:
- Segment usage: does the JVM use custom TLS with its own segment?
- Newer system calls: does the JVM use syscalls added to Linux "recently" that DR doesn't handle (e.g., timerfd_create (#1139), sigtimedwait/sigwaitinfo (#1188))?
- Signals: if the JVM has an itimer or other signals, try disabling those; but this seems less likely given the non-Java apps that have no trouble w/ itimers.
- Rip-relative unreachable mangling: though again this seems less likely.
Running on AArch64, if easy to do, could be a way to compare w/ a different cache consistency scheme.
Hi,
Answering some of the questions above:
Meanwhile we managed to reproduce translation failure below on SPECjvm2008 compress benchmark:
<Resetting caches and non-persistent memory @ 720191 fragments in application /opt/openjdk8u/jvm/openjdk-1.8.0-internal-debug/bin/java (2727674).>
- The JVM generates a lot of rip-relative code. Are there any limitations in DynamoRIO's translation related to that?
DR should handle it just fine. It does require a local spilled register for far-away addresses. Maybe there could be fencepost errors on reachability or scratch register issues but rip-rel accesses are quite common so you would think such problems would not be limited to Java.
- The JVM implementation relies on signals, especially SIGSEGV. We are not sure they can be harmlessly disabled;
- We have not checked usage of the latest syscalls yet. We use openjdk8u built from sources on kernel 3.10 with libc 2.17 and gcc 4.8.5;
- Although the JVM does use TLS, it is unlikely to be custom, since the JVM failures under DynamoRIO are sporadic in nature;
Meanwhile we managed to reproduce translation failure below on SPECjvm2008 compress benchmark:
<Resetting caches and non-persistent memory @ 720191 fragments in application /opt/openjdk8u/jvm/openjdk-1.8.0-internal-debug/bin/java (2727674).>
I would run with -no_enable_reset to eliminate the complexity of resets.
Entry into F720469(0x00007f32112e1743).0x00007f3425cefc79 (shared)
master_signal_handler: thread=2727675, sig=11, xsp=0x00007f3224ffb9b8, retaddr=0x00007f3468f395ab
computing memory target for 0x00007f3425cefc88 causing SIGSEGV, kernel claims it is 0x00007f3468c58000
memory operand 0 has address 0x00007f3468c58000 and size 4
For SIGSEGV at cache pc 0x00007f3425cefc88, computed target read 0x00007f3468c58000
faulting instr: test (%r10), %eax
** Received SIGSEGV at cache pc 0x00007f3425cefc88 in thread 2727675
record_pending_signal(11) from cache pc 0x00007f3425cefc88
not certain can delay so handling now
fragment overlaps selfmod area, inserting sandboxing
Translating state inside selfmod-sandboxed code is definitely an area where there could well be bugs.
Any ideas on why the cache pc is 4 bytes away from the faulting instruction, whereas the distance between the instructions in the generated JIT code is 3 bytes?
That is likely the bug causing the problems (or one of them): leading to incorrect register restoration or something.
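One hypothetical way such a 4-vs-3 discrepancy can arise (a toy model with made-up lengths, not DR's actual translation code): if the cache copy of an instruction is one byte longer than the app original (e.g. due to mangling), a cache-pc-to-app-pc walk sees cache offsets drift ahead of app offsets, and any off-by-one in the length bookkeeping restores state for the wrong boundary:

```python
# Toy cache-pc -> app-pc translation by walking per-instruction lengths.
# Hypothetical numbers: the 2nd instruction grew from 3 to 4 bytes in
# the cache, so cache offsets drift ahead of app offsets.
app_lens = [3, 3, 2]
cache_lens = [3, 4, 2]

def cache_to_app(cache_pc_off):
    app_off = cache_off = 0
    for a, c in zip(app_lens, cache_lens):
        if cache_off >= cache_pc_off:
            break
        app_off += a
        cache_off += c
    return app_off

# The 3rd instruction starts at cache offset 7 but at app offset 6.
print(cache_to_app(7))
```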
Below is the example, from the same log, of successful translation of JIT code accessing the same faulting address:
Entry into F529662(0x00007f321127a40a).0x00007f3425bd144d (shared)
So in the case where the same address does successfully translate, it is a regular (non-selfmod) fragment, right? So it increasingly looks like a selfmod translation problem.
I would create a tiny app with this precise block that modifies code on the same page (or, put it on the stack) so it gets marked selfmod and see if you can repro the translation problem and then easily run it repeatedly (b/c it's so small) w/ increasing diagnostics/logging/debugging.
- JVM generates Rip-relative code a lot. Are there any limitations in DynamoRIO translation related to that?
DR should handle it just fine. It does require a local spilled register for far-away addresses. Maybe there could be fencepost errors on reachability or scratch register issues but rip-rel accesses are quite common so you would think such problems would not be limited to Java.
Does "local spilled register" mean: a) preserve a register value on the stack or somewhere in memory, b) put the far address into the register, c) implement equivalent instructions using the value addressed via the register, and d) restore the original value of the register? Are fencepost errors on reachability a kind of race due to lack of barriers caused by machine-code changes from DR's mangling? References to docs clarifying these terms are appreciated as well.
BTW, we spotted one more failure that happens after SIGUSR2 translation, resulting in an "Exit due to proactive reset" message in the DR logs. Could you please elaborate on what a reset means in the DynamoRIO context and what it does? References to docs clarifying this term are appreciated as well.
Entry into F329048(0x00007fdcc1252a62).0x00007fded6d565ed (shared)
master_signal_handler: thread=2655048, sig=12, xsp=0x00007fdcd60d89b8, retaddr=0x00007fdf1a0585ab
siginfo: sig = 12, pid = 2655035, status = 0, errno = 0, si_code = -6
gs=0x0000 fs=0x0000 xdi=0x00007fdcd60b4490 xsi=0x00007fdcd60b44a0 xbp=0x00007fdcd60b44c0 xsp=0x00007fdcd60b4470 xbx=0x0000000000000002 xdx=0x00007fdcd60b4490 ...
handle_suspend_signal: suspended now
translate_from_synchall_to_dispatch: being translated from 0x00007fdf1a0bf28a
handle_suspend_signal: awake now
master_signal_handler 12 returning now to 0x00007fdf1a0bf28a
save_fpstate
thread_set_self_context: pc=0x00007fded5bbfd71 full sigcontext
Exit due to proactive reset
d_r_dispatch: target = 0x00007fdcc1252a62
priv_mcontext_t @0x00007fdcd5c361c0
xax = 0x00000000edf09ca5 xbx = 0x000000000031012b xcx = 0x0000000000000284 xdx = 0x000000000035c000 xsi = 0x000000076db9a838 xdi = 0x000000076f80aef0 ...
Entry into F329048(0x00007fdcc1252a62).0x00007fded6d565ed (shared)
fcache_enter = 0x00007fded5bbed00, target = 0x00007fded6d565ed
master_signal_handler: thread=2655048, sig=11, xsp=0x00007fdcd60d89b8, retaddr=0x00007fdf1a0585ab
siginfo: sig = 11, pid = 659, status = 0, errno = 0, si_code = 1
gs=0x0000 fs=0x0000 xdi=0x000000076f80aef0 xsi=0x0000000000000000 xbp=0x000000076f84e528 xsp=0x00007fdf180a54e0 xbx=0x000000000031012b xdx=0x000000000035c000 xcx=0x0000000000000284 xax=0x00000000edf09ca5 ...
computing memory target for 0x00007fded6d565f2 causing SIGSEGV, kernel claims it is 0x0000000000000293
Below is the example, from the same log, of successful translation of JIT code accessing the same faulting address: Entry into F529662(0x00007f321127a40a).0x00007f3425bd144d (shared)
So in the case where the same address does successfully translate, it is a regular (non-selfmod) fragment, right? So it increasingly looks like a selfmod translation problem.
The failing code address is not the same, but the instructions at the failing and succeeding code addresses are quite similar. Physically they are different pieces of code, in different JIT-compiled methods, but both implement the same logic of checking a value at the same global address located in the JVM (static code implemented in C++, libjvm.so).
I would create a tiny app with this precise block that modifies code on the same page (or, put it on the stack) so it gets marked selfmod and see if you can repro the translation problem and then easily run it repeatedly (b/c it's so small) w/ increasing diagnostics/logging/debugging.
Yep, makes sense. Following that approach already.
Thanks!!!
Also xref using JVM annotations to avoid DR having to worry about true self-modifying code and having to use complex instrumentation to handle code changes on must-remain-writable pages: #3502 has some experimental code from an academic paper that was never merged into the main branch.
Does local spilled register mean a) preserve a register value on stack or somewhere in memory, b) put far address into the register c) implement equal instructions using value addressed via the register d) restore original value of the register back?
Yes, but the app stack is of course unsafe to use: DR uses thread-local storage via a segment reference.
Are fencepost errors on reachability kind of races due lack of barriers because of machine code changes due to DRIO-mangling?
No: fencepost errors on whether it will reach or not (DR has to figure that out ahead of time, before placing the code in the cache at its final location). But I doubt there are bugs relating to rip-rel.
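The reachability question can be sketched as signed-displacement arithmetic (a sketch of our reading of the answer above, not DR's actual code): a rip-relative operand can keep its form only if the original target is still within a signed 32-bit displacement of the instruction's end at its final code-cache address; otherwise a scratch register must be spilled and loaded with the far address.

```python
# Sketch: can a rip-relative memory operand keep its form once the
# instruction is placed at cache_pc? x86-64 allows a signed 32-bit disp
# relative to the address of the *next* instruction.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def riprel_reachable(target, cache_pc, instr_len):
    disp = target - (cache_pc + instr_len)
    return INT32_MIN <= disp <= INT32_MAX

print(riprel_reachable(0x7f0000001000, 0x7f0000000000, 7))  # nearby cache
print(riprel_reachable(0x7f0000001000, 0x500000000000, 7))  # far-away cache
```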
BTW, we spotted one more failure that happens after SIGUSR2 translation resulting in "Exit due to proactive reset" message in DR logs. Could you please elaborate more on what reset means in DRIO context and what it does? References to docs clarifying this term are appreciated as well.
"Reset" is an internal DR feature where it deletes all memory that can be re-created later: mostly code caches and associated heap. Kind of a garbage collection. Used to save memory and throw out cold code. I don't think there are any docs outside of the code itself.
We observed memory corruption in a compiled nmethod (data and code) that a JVM compiler thread (static code) tries to install from the stack into the JVM code cache on the heap. To do that installation, the thread calls libc memcpy() and executes the following code. %rsi points to the stack, %rdi points to the heap.
0x7fcc635737ec: vmovdqu (%rsi),%ymm0
0x7fcc635737f0: vmovdqu 0x20(%rsi),%ymm1
0x7fcc635737f5: vmovdqu 0x40(%rsi),%ymm2
0x7fcc635737fa: vmovdqu 0x60(%rsi),%ymm3
0x7fcc635737ff: vmovdqu -0x20(%rsi,%rdx,1),%ymm4
0x7fcc63573805: vmovdqu -0x40(%rsi,%rdx,1),%ymm5
0x7fcc6357380b: vmovdqu -0x60(%rsi,%rdx,1),%ymm6
0x7fcc63573811: vmovdqu -0x80(%rsi,%rdx,1),%ymm7
0x7fcc63573817: vmovdqu %ymm0,(%rdi)            <= SIGSEGV
0x7fcc6357381b: vmovdqu %ymm1,0x20(%rdi)
0x7fcc63573820: vmovdqu %ymm2,0x40(%rdi)
0x7fcc63573825: vmovdqu %ymm3,0x60(%rdi)
0x7fcc6357382a: vmovdqu %ymm4,-0x20(%rdi,%rdx,1)
0x7fcc63573830: vmovdqu %ymm5,-0x40(%rdi,%rdx,1)
0x7fcc63573836: vmovdqu %ymm6,-0x60(%rdi,%rdx,1)
0x7fcc6357383c: vmovdqu %ymm7,-0x80(%rdi,%rdx,1)
The instruction at 0x7fcc63573817 causes multiple SIGSEGVs under DynamoRIO, and the instruction restarts again and again. At some point that restarting receives two SIGUSR2 signals in a row (the JVM employs SIGUSR2 by default to suspend/resume threads), and that ends up zeroing the vector registers in the thread's hardware context, so the compiled code becomes corrupted while being copied:
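The overlap trick in that memcpy tail can be simulated directly (a pure-Python model of the 8 loads and 8 stores above, with rdx = 0xf0 = 240 as in the log's xdx value): the loads cover [0, 128) and [rdx-128, rdx), which together span the whole buffer for this size, so zeroing the "registers" between the loads and the stores wipes the copied nmethod exactly as observed.

```python
# Model the 8 vmovdqu loads and 8 vmovdqu stores for rdx = 0xf0 (240).
n = 0xf0
src = bytes((i * 7 + 1) % 256 for i in range(n))
# ymm0-ymm3 load the head, ymm4-ymm7 load the (overlapping) tail.
regs = [src[0:32], src[32:64], src[64:96], src[96:128],
        src[n-32:n], src[n-64:n-32], src[n-96:n-64], src[n-128:n-96]]

def store(regs, n):
    dst = bytearray(n)
    dst[0:32], dst[32:64], dst[64:96], dst[96:128] = regs[0:4]
    dst[n-32:n], dst[n-64:n-32], dst[n-96:n-64], dst[n-128:n-96] = regs[4:8]
    return bytes(dst)

assert store(regs, n) == src           # normal copy is complete
corrupted = store([bytes(32)] * 8, n)  # ymm state lost (zeroed) mid-copy
assert corrupted == bytes(n)           # destination ends up all zeros
```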
Before handling chained SIGUSR2 signals:
d_r_dispatch: target = 0x00007fcc63573817 priv_mcontext_t @0x00007fca23795e40 xax = 0x00007fca1c811c60 xbx = 0x00007fca1c811c60 xcx = 0x00007fca1c810360 xdx = 0x00000000000000f0 xsi = 0x00007fca1c810360 xdi = 0x00007fca1c811c60 xbp = 0x00007fc91c584420 xsp = 0x00007fc91c5843f8 r8 = 0x0000000000000000 r9 = 0x0000000000000000 r10 = 0x0000000000000000 r11 = 0x0000000000000293 r12 = 0x000000000000001e r13 = 0x00007fca18029230 r14 = 0x00007fca1819bc20 r15 = 0x00000000ffffffff ymm0= 0x08463b484916850f9090fff6909090900024848955fffea048ec8b484810ec83 ymm1= 0x082444c7badb100dc48b48500fe0834808f883487f840f5848000000d8246489 ymm2= 0x80ec81484800000078244489244c894854894870894868244860245c50246c89 ymm3= 0x247489487c894848894c40244c38244430244c892454894c5c894c28894c2024 ymm4= 0xd1f2684241c223440b41c08bc48348c2ba495d106383b00000007fccc3028541 ymm5= 0x7fcc6274ff410000d12bf4d241da8b44fac1cbfffbc1411fd333411fc4d32341 ymm6= 0xe8bf4824cc62cdeb4800007f81039bbe007fca1c + d48b4800f0e483487e6aba49 ymm7= 0x30244c892454894c5c894c28894c20244c18246410246c892474894c3c894c08 ymm8= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm9= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm10= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm11= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm12= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm13= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm14= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm15= 0x0000000000000000000000000000000000000000000000000000000000000000 mxcsr=0x00001f80 eflags = 0x0000000000000202 pc = 0x00007fcc202661f4 Entry into F171512(0x00007fcc63573817).0x00007fcc202661f4 (shared)
After handling chained SIGUSR2 signals:
_r_dispatch: target = 0x00007fcc63573817 priv_mcontext_t @0x00007fca23795e40 xax = 0x00007fca1c811c60 xbx = 0x00007fca1c811c60 xcx = 0x00007fca1c810360 xdx = 0x00000000000000f0 xsi = 0x00007fca1c810360 xdi = 0x00007fca1c811c60 xbp = 0x00007fc91c584420 xsp = 0x00007fc91c5843f8 r8 = 0x0000000000000000 r9 = 0x0000000000000000 r10 = 0x0000000000000000 r11 = 0x0000000000000293 r12 = 0x000000000000001e r13 = 0x00007fca18029230 r14 = 0x00007fca1819bc20 r15 = 0x00000000ffffffff ymm0= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm1= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm2= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm3= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm4= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm5= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm6= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm7= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm8= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm9= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm10= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm11= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm12= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm13= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm14= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm15= 0x0000000000000000000000000000000000000000000000000000000000000000 mxcsr=0x00001f80 eflags = 0x0000000000000202 pc = 0x00007fcc63573817 Entry into F171512(0x00007fcc63573817).0x00007fcc202661f4 (shared)
Part of log around handling SIGUSR2 signals follows:
d_r_dispatch: target = 0x00007fcc63573817 priv_mcontext_t @0x00007fca23795e40 xax = 0x00007fca1c811c60 xbx = 0x00007fca1c811c60 xcx = 0x00007fca1c810360 xdx = 0x00000000000000f0 xsi = 0x00007fca1c810360 xdi = 0x00007fca1c811c60 xbp = 0x00007fc91c584420 xsp = 0x00007fc91c5843f8 r8 = 0x0000000000000000 r9 = 0x0000000000000000 r10 = 0x0000000000000000 r11 = 0x0000000000000293 r12 = 0x000000000000001e r13 = 0x00007fca18029230 r14 = 0x00007fca1819bc20 r15 = 0x00000000ffffffff ymm0= 0x08463b484916850f9090fff6909090900024848955fffea048ec8b484810ec83 ymm1= 0x082444c7badb100dc48b48500fe0834808f883487f840f5848000000d8246489 ymm2= 0x80ec81484800000078244489244c894854894870894868244860245c50246c89 ymm3= 0x247489487c894848894c40244c38244430244c892454894c5c894c28894c2024 ymm4= 0xd1f2684241c223440b41c08bc48348c2ba495d106383b00000007fccc3028541 ymm5= 0x7fcc6274ff410000d12bf4d241da8b44fac1cbfffbc1411fd333411fc4d32341 ymm6= 0xe8bf4824cc62cdeb4800007f81039bbe007fca1c master_signal_handler: thread=2868191, sig=12, xsp=0x00007fca23b009b8, retaddr=0x00007fcc63b1c5ab siginfo: sig = 12, pid = 2868012, status = 0, errno = 0, si_code = -6 gs=0x0000 fs=0x0000 xdi=0x000000000007ffb2 xsi=0x00007fca23adc410 xbp=0x00007fca23adc3b0 xsp=0x00007fca23adc380 xbx=0x0000000000000003 xdx=0x0000000000000008 xcx=0x00007fcc63b8322d xax=0x0000000000000008 r8=0x0000000000000008 r9=0x00007fca1c810360 r10=0x00007fca23adc410 r11=0x0000000000000246 r12=0x00007fca1c811c60 r13=0x00007fc91c584420 r14=0x00007fc91c5843f8 r15=0x0000000000000000 trapno=0x000000000000000e err=0x0000000000000007 xip=0x00007fcc63b8322d cs=0x0033 eflags=0x0000000000000246 cwd=0x000000000000037f swd=0x0000000000000000 twd=0x0000000000000000 fop=0x0000000000000000 rip=0x0000000000000000 rdp=0x0000000000000000 mxcsr=0x0000000000001f80 mxcsr_mask=0x000000000000ffff st0 = 0x00000000000000000000000000000000 st1 = 0x00000000000000000000000000000000 st2 = 0x00000000000000000000000000000000 st3 = 0x00000000000000000000000000000000 st4 = 
0x00000000000000000000000000000000 st5 = 0x00000000000000000000000000000000 st6 = 0x00000000000000000000000000000000 st7 = 0x00000000000000000000000000000000 xmm0 = 0x08463b484916850f9090fff690909090 xmm1 = 0x082444c7badb100dc48b48500fe08348 xmm2 = 0x80ec81484800000078244489244c8948 xmm3 = 0x247489487c894848894c40244c382444 xmm4 = 0xd1f2684241c223440b41c08bc48348c2 xmm5 = 0x7fcc6274ff410000d12bf4d241da8b44 xmm6 = 0xe8bf4824cc62cdeb4800007f81039bbe xmm7 = 0x30244c892454894c5c894c28894c2024 xmm8 = 0x00000000000000000000000000000000 xmm9 = 0x00000000000000000000000000000000 xmm10 = 0x00000000000000000000000000000000 xmm11 = 0x00000000000000000000000000000000 xmm12 = 0x00000000000000000000000000000000 xmm13 = 0x00000000000000000000000000000000 xmm14 = 0x00000000000000000000000000000000 xmm15 = 0x00000000000000000000000000000000 xstate_bv = 0x7 ymmh0 = 0024848955fffea048ec8b484810ec83 ymmh1 = 08f883487f840f5848000000d8246489 ymmh2 = 54894870894868244860245c50246c89 ymmh3 = 30244c892454894c5c894c28894c2024 ymmh4 = ba495d106383b00000007fccc3028541 ymmh5 = fac1cbfffbc1411fd333411fc4d32341 ymmh6 = 007fca1cd48b4800f0e483487e6aba49 ymmh7 = 4c18246410246c892474894c3c894c08 ymmh8 = 00000000000000000000000000000000 ymmh9 = 00000000000000000000000000000000 ymmh10 = 00000000000000000000000000000000 ymmh11 = 00000000000000000000000000000000 ymmh12 = 00000000000000000000000000000000 ymmh13 = 00000000000000000000000000000000 ymmh14 = 00000000000000000000000000000000 ymmh15 = 00000000000000000000000000000000 oldmask=0x0000000000000000 cr2=0x00007fca1c811c60 handle_suspend_signal: suspended now handle_suspend_signal: awake now master_signal_handler 12 returning now to 0x00007fcc63b8322d
d48b4800f0e483487e6aba49 ymm7= 0x30244c892454894c5c894c28894c20244c18246410246c892474894c3c894c08 ymm8= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm9= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm10= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm11= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm12= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm13= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm14= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm15= 0x0000000000000000000000000000000000000000000000000000000000000000 mxcsr=0x00001f80 eflags = 0x0000000000000202 pc = 0x00007fcc202661f4 Entry into F171512(0x00007fcc63573817).0x00007fcc202661f4 (shared)
master_signal_handler: thread=2868191, sig=12, xsp=0x00007fca23b009b8, retaddr=0x00007fcc63b1c5ab siginfo: sig = 12, pid = 2868012, status = 0, errno = 0, si_code = -6 gs=0x0000 fs=0x0000 xdi=0x00007fca23adc490 xsi=0x00007fca23adc4a0 xbp=0x00007fca23adc4c0 xsp=0x00007fca23adc470 xbx=0x0000000000000002 xdx=0x00007fca23adc490 xcx=0x00007fcc63b8322d xax=0x0000000000000000 r8=0x0000000000000000 r9=0x0000000000000000 r10=0x00007fca23adc4a0 r11=0x0000000000000246 r12=0x000000000000001e r13=0x00007fca18029230 r14=0x00007fca1819bc20 r15=0x00000000ffffffff trapno=0x000000000000000e err=0x0000000000000007 xip=0x00007fcc63b8322d cs=0x0033 eflags=0x0000000000000246 cwd=0x000000000000037f swd=0x0000000000000000 twd=0x0000000000000000 fop=0x0000000000000000 rip=0x0000000000000000 rdp=0x0000000000000000 mxcsr=0x0000000000001f80 mxcsr_mask=0x000000000000ffff st0 = 0x00000000000000000000000000000000 st1 = 0x00000000000000000000000000000000 st2 = 0x00000000000000000000000000000000 st3 = 0x00000000000000000000000000000000 st4 = 0x00000000000000000000000000000000 st5 = 0x00000000000000000000000000000000 st6 = 0x00000000000000000000000000000000 st7 = 0x00000000000000000000000000000000 xmm0 = 0x08463b484916850f9090fff690909090 xmm1 = 0x082444c7badb100dc48b48500fe08348 xmm2 = 0x80ec81484800000078244489244c8948 xmm3 = 0x247489487c894848894c40244c382444 xmm4 = 0xd1f2684241c223440b41c08bc48348c2 xmm5 = 0x7fcc6274ff410000d12bf4d241da8b44 xmm6 = 0xe8bf4824cc62cdeb4800007f81039bbe xmm7 = 0x30244c892454894c5c894c28894c2024 xmm8 = 0x00000000000000000000000000000000 xmm9 = 0x00000000000000000000000000000000 xmm10 = 0x00000000000000000000000000000000 xmm11 = 0x00000000000000000000000000000000 xmm12 = 0x00000000000000000000000000000000 xmm13 = 0x00000000000000000000000000000000 xmm14 = 0x00000000000000000000000000000000 xmm15 = 0x00000000000000000000000000000000 xstate_bv = 0x7 ymmh0 = 0024848955fffea048ec8b484810ec83 ymmh1 = 08f883487f840f5848000000d8246489 ymmh2 = 54894870894868244860245c50246c89 
ymmh3 = 30244c892454894c5c894c28894c2024 ymmh4 = ba495d106383b00000007fccc3028541 ymmh5 = fac1cbfffbc1411fd333411fc4d32341 ymmh6 = 007fca1cd48b4800f0e483487e6aba49 ymmh7 = 4c18246410246c892474894c3c894c08 ymmh8 = 00000000000000000000000000000000 ymmh9 = 00000000000000000000000000000000 ymmh10 = 00000000000000000000000000000000 ymmh11 = 00000000000000000000000000000000 ymmh12 = 00000000000000000000000000000000 ymmh13 = 00000000000000000000000000000000 ymmh14 = 00000000000000000000000000000000 ymmh15 = 00000000000000000000000000000000 oldmask=0x0000000000000000 cr2=0x00007fca1c811c60 handle_suspend_signal: suspended now translate_from_synchall_to_dispatch: being translated from 0x00007fcc63b8322d handle_suspend_signal: awake now master_signal_handler 12 returning now to 0x00007fcc63b8322d
save_fpstate thread_set_self_context: pc=0x00007fcc1f683d71 full sigcontext Exit due to proactive reset
d_r_dispatch: target = 0x00007fcc63573817 priv_mcontext_t @0x00007fca23795e40 xax = 0x00007fca1c811c60 xbx = 0x00007fca1c811c60 xcx = 0x00007fca1c810360 xdx = 0x00000000000000f0 xsi = 0x00007fca1c810360 xdi = 0x00007fca1c811c60 xbp = 0x00007fc91c584420 xsp = 0x00007fc91c5843f8 r8 = 0x0000000000000000 r9 = 0x0000000000000000 r10 = 0x0000000000000000 r11 = 0x0000000000000293 r12 = 0x000000000000001e r13 = 0x00007fca18029230 r14 = 0x00007fca1819bc20 r15 = 0x00000000ffffffff ymm0= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm1= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm2= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm3= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm4= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm5= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm6= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm7= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm8= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm9= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm10= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm11= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm12= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm13= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm14= 0x0000000000000000000000000000000000000000000000000000000000000000 ymm15= 0x0000000000000000000000000000000000000000000000000000000000000000 mxcsr=0x00001f80 eflags = 0x0000000000000202 pc = 0x00007fcc63573817 Entry into F171512(0x00007fcc63573817).0x00007fcc202661f4 (shared)
Log of handling SEGV follows:
Entry into F171512(0x00007fcc63573817).0x00007fcc202661f4 (shared)
fcache_enter = 0x00007fcc1f682d00, target = 0x00007fcc202661f4
master_signal_handler: thread=2868191, sig=11, xsp=0x00007fca23b009b8, retaddr=0x00007fcc63b1c5ab siginfo: sig = 11, pid = 478223456, status = 0, errno = 0, si_code = 2 gs=0x0000 fs=0x0000 xdi=0x00007fca1c811c60 xsi=0x00007fca1c810360 xbp=0x00007fc91c584420 xsp=0x00007fc91c5843f8 xbx=0x00007fca1c811c60 xdx=0x00000000000000f0 xcx=0x00007fca1c810360 xax=0x00007fca1c811c60 r8=0x0000000000000000 r9=0x0000000000000000 r10=0x0000000000000000 r11=0x0000000000000293 r12=0x000000000000001e r13=0x00007fca18029230 r14=0x00007fca1819bc20 r15=0x00000000ffffffff trapno=0x000000000000000e err=0x0000000000000007 xip=0x00007fcc202661f4 cs=0x0033 eflags=0x0000000000010202 cwd=0x000000000000037f swd=0x0000000000000000 twd=0x0000000000000000 fop=0x0000000000000000 rip=0x0000000000000000 rdp=0x0000000000000000 mxcsr=0x0000000000001f80 mxcsr_mask=0x000000000000ffff st0 = 0x00000000000000000000000000000000 st1 = 0x00000000000000000000000000000000 st2 = 0x00000000000000000000000000000000 st3 = 0x00000000000000000000000000000000 st4 = 0x00000000000000000000000000000000 st5 = 0x00000000000000000000000000000000 st6 = 0x00000000000000000000000000000000 st7 = 0x00000000000000000000000000000000 xmm0 = 0x08463b484916850f9090fff690909090 xmm1 = 0x082444c7badb100dc48b48500fe08348 xmm2 = 0x80ec81484800000078244489244c8948 xmm3 = 0x247489487c894848894c40244c382444 xmm4 = 0xd1f2684241c223440b41c08bc48348c2 xmm5 = 0x7fcc6274ff410000d12bf4d241da8b44 xmm6 = 0xe8bf4824cc62cdeb4800007f81039bbe xmm7 = 0x30244c892454894c5c894c28894c2024 xmm8 = 0x00000000000000000000000000000000 xmm9 = 0x00000000000000000000000000000000 xmm10 = 0x00000000000000000000000000000000 xmm11 = 0x00000000000000000000000000000000 xmm12 = 0x00000000000000000000000000000000 xmm13 = 0x00000000000000000000000000000000 xmm14 = 0x00000000000000000000000000000000 xmm15 = 0x00000000000000000000000000000000 xstate_bv = 0x7 ymmh0 = 0024848955fffea048ec8b484810ec83 ymmh1 = 08f883487f840f5848000000d8246489 ymmh2 = 
54894870894868244860245c50246c89 ymmh3 = 30244c892454894c5c894c28894c2024 ymmh4 = ba495d106383b00000007fccc3028541 ymmh5 = fac1cbfffbc1411fd333411fc4d32341 ymmh6 = 007fca1cd48b4800f0e483487e6aba49 ymmh7 = 4c18246410246c892474894c3c894c08 ymmh8 = 00000000000000000000000000000000 ymmh9 = 00000000000000000000000000000000 ymmh10 = 00000000000000000000000000000000 ymmh11 = 00000000000000000000000000000000 ymmh12 = 00000000000000000000000000000000 ymmh13 = 00000000000000000000000000000000 ymmh14 = 00000000000000000000000000000000 ymmh15 = 00000000000000000000000000000000 oldmask=0x0000000000000000 cr2=0x00007fca1c811c60 computing memory target for 0x00007fcc202661f4 causing SIGSEGV, kernel claims it is 0x00007fca1c811c60 opnd_compute_address for: (%rdi) base => 0x00007fca1c811c60 index,scale => 0x00007fca1c811c60 disp => 0x00007fca1c811c60 memory operand 0 has address 0x00007fca1c811c60 and size 32 For SIGSEGV at cache pc 0x00007fcc202661f4, computed target write 0x00007fca1c811c60 faulting instr: vmovdqu %ymm0, (%rdi) recreate_app_pc -- translating from pc=0x00007fcc202661f4
building bb instrlist now *****
interp: start_pc = 0x00007fcc63573817 check_thread_vm_area: pc = 0x00007fcc63573817 prepend_entry_to_fraglist: putting fragment @0x00007fcc63573817 (shared) on vmarea 0x00007fcc63436000-0x00007fcc63583000 check_thread_vm_area: check_stop = 0x00007fcc63583000 0x00007fcc63573817 c5 fe 7f 07 vmovdqu %ymm0, (%rdi) 0x00007fcc6357381b c5 fe 7f 4f 20 vmovdqu %ymm1, 0x20(%rdi) 0x00007fcc63573820 c5 fe 7f 57 40 vmovdqu %ymm2, 0x40(%rdi) 0x00007fcc63573825 c5 fe 7f 5f 60 vmovdqu %ymm3, 0x60(%rdi) 0x00007fcc6357382a c5 fe 7f 64 17 e0 vmovdqu %ymm4, -0x20(%rdi,%rdx) 0x00007fcc63573830 c5 fe 7f 6c 17 c0 vmovdqu %ymm5, -0x40(%rdi,%rdx) 0x00007fcc63573836 c5 fe 7f 74 17 a0 vmovdqu %ymm6, -0x60(%rdi,%rdx) 0x00007fcc6357383c c5 fe 7f 7c 17 80 vmovdqu %ymm7, -0x80(%rdi,%rdx) 0x00007fcc63573842 c5 f8 77 vzeroupper 0x00007fcc63573845 c3 ret mbr exit target = 0x00007fcc1f683540 end_pc = 0x00007fcc63573846
setting cur_pc (for fall-through) to 0x00007fcc63573846
forward_eflags_analysis: vmovdqu %ymm0, (%rdi)
instr 0 => 0
forward_eflags_analysis: vmovdqu %ymm1, 0x20(%rdi)
instr 0 => 0
forward_eflags_analysis: vmovdqu %ymm2, 0x40(%rdi)
instr 0 => 0
forward_eflags_analysis: vmovdqu %ymm3, 0x60(%rdi)
instr 0 => 0
forward_eflags_analysis: vmovdqu %ymm4, -0x20(%rdi,%rdx)
instr 0 => 0
forward_eflags_analysis: vmovdqu %ymm5, -0x40(%rdi,%rdx)
instr 0 => 0
forward_eflags_analysis: vmovdqu %ymm6, -0x60(%rdi,%rdx)
instr 0 => 0
forward_eflags_analysis: vmovdqu %ymm7, -0x80(%rdi,%rdx)
instr 0 => 0
forward_eflags_analysis: vzeroupper
instr 0 => 0
exit_branch_type=0x6 bb->exit_target=0x00007fcc1f683540
bb ilist before mangling:
TAG 0x00007fcc63573817
+0 L3 @0x00007fca23ae8f68 c5 fe 7f 07 vmovdqu %ymm0, (%rdi)
+4 L3 @0x00007fca23aebf50 c5 fe 7f 4f 20 vmovdqu %ymm1, 0x20(%rdi)
+9 L3 @0x00007fca23e5c648 c5 fe 7f 57 40 vmovdqu %ymm2, 0x40(%rdi)
+14 L3 @0x00007fca23ae70e8 c5 fe 7f 5f 60 vmovdqu %ymm3, 0x60(%rdi)
+19 L3 @0x00007fca23e5c0c8 c5 fe 7f 64 17 e0 vmovdqu %ymm4, -0x20(%rdi,%rdx)
+25 L3 @0x00007fca23ae8700 c5 fe 7f 6c 17 c0 vmovdqu %ymm5, -0x40(%rdi,%rdx)
+31 L3 @0x00007fca23ae8ce8 c5 fe 7f 74 17 a0 vmovdqu %ymm6, -0x60(%rdi,%rdx)
+37 L3 @0x00007fca23e5d048 c5 fe 7f 7c 17 80 vmovdqu %ymm7, -0x80(%rdi,%rdx)
+43 L3 @0x00007fca23ae7200 c5 f8 77 vzeroupper
+46 L3 @0x00007fca23e5cc48 c3 ret
+47 L4 @0x00007fca23e5c6c8 e9 8b 50 b8 fb jmp $0x00007fcc1f683540
bb ilist after mangling:
TAG 0x00007fcc63573817
+0 L3 @0x00007fca23ae8f68 c5 fe 7f 07 vmovdqu %ymm0, (%rdi)
+4 L3 @0x00007fca23aebf50 c5 fe 7f 4f 20 vmovdqu %ymm1, 0x20(%rdi)
+9 L3 @0x00007fca23e5c648 c5 fe 7f 57 40 vmovdqu %ymm2, 0x40(%rdi)
+14 L3 @0x00007fca23ae70e8 c5 fe 7f 5f 60 vmovdqu %ymm3, 0x60(%rdi)
+19 L3 @0x00007fca23e5c0c8 c5 fe 7f 64 17 e0 vmovdqu %ymm4, -0x20(%rdi,%rdx)
+25 L3 @0x00007fca23ae8700 c5 fe 7f 6c 17 c0 vmovdqu %ymm5, -0x40(%rdi,%rdx)
+31 L3 @0x00007fca23ae8ce8 c5 fe 7f 74 17 a0 vmovdqu %ymm6, -0x60(%rdi,%rdx)
+37 L3 @0x00007fca23e5d048 c5 fe 7f 7c 17 80 vmovdqu %ymm7, -0x80(%rdi,%rdx)
+43 L3 @0x00007fca23ae7200 c5 f8 77 vzeroupper
+46 m4 @0x00007fca23ae7600 65 48 89 0c 25 10 00 mov %rcx, %gs:0x10
00 00
+55 m4 @0x00007fca23e5c148 59 pop %rcx
+56 L4 @0x00007fca23e5c6c8 e9 8b 50 b8 fb jmp $0x00007fcc1f683540
done building bb instrlist *****
vm_area_remove_fragment: entry 0x00007fca24864668 nop_pad_ilist: F171512 @0x00007fcc2026622c cti shift needed: 0 recreate_app : pc is in F171512(0x00007fcc63573817) recreate_app : looking for 0x00007fcc202661f4 in frag @ 0x00007fcc202661f4 (tag 0x00007fcc63573817) recreate_app -- found valid state pc 0x00007fcc63573817 recreate_app -- found ok pc 0x00007fcc63573817 recreate_app_pc -- translation is 0x00007fcc63573817 WARNING: Exec 0x00007fca1c66f000-0x00007fca1c8df000 WE written @0x00007fca1c811c60 by 0x00007fcc202661f4 == app 0x00007fcc63573817 vm_list_overlaps 0x00007fca2485f438 vs 0x00007fca1c66f000-0x00007fca1c8df000 instr not in region, flushing entire 0x00007fca1c66f000-0x00007fca1c8df000 FLUSH STAGE 1: synch_unlink_priv(thread 2868191 flushtime 4444): 0x00007fca1c66f000-0x00007fca1c8df000 make_writable: pc 0x00007fcc1f682000 -> 0x00007fcc1f682000-0x00007fcc1f684000 0 make_unwritable: pc 0x00007fcc1f682000 -> 0x00007fcc1f682000-0x00007fcc1f684000 considering thread #1/20 = 2868012 thread 2868012 synch not required thread 2868012 has no fragments in region to flush considering thread #2/20 = 2868013 waiting for thread 2868013 thread 2868191 waiting for event 0x00007fca1f6fcbd0 thread 2868191 finished waiting for event 0x00007fca1f6fcbd0 done waiting for thread 2868013 thread 2868013 has no fragments in region to flush considering thread #3/20 = 2868015 thread 2868015 synch not required thread 2868015 has no fragments in region to flush considering thread #4/20 = 2868016 thread 2868016 synch not required thread 2868016 has no fragments in region to flush considering thread #5/20 = 2868017 thread 2868017 synch not required thread 2868017 has no fragments in region to flush considering thread #6/20 = 2868018 thread 2868018 synch not required thread 2868018 has no fragments in region to flush considering thread #7/20 = 2868019 thread 2868019 synch not required thread 2868019 has no fragments in region to flush considering thread #8/20 = 2868020 thread 2868020 
synch not required thread 2868020 has no fragments in region to flush considering thread #9/20 = 2868021 thread 2868021 synch not required thread 2868021 has no fragments in region to flush considering thread #10/20 = 2868022 thread 2868022 synch not required thread 2868022 has no fragments in region to flush considering thread #11/20 = 2868023 thread 2868023 synch not required thread 2868023 has no fragments in region to flush considering thread #12/20 = 2868024 thread 2868024 synch not required thread 2868024 has no fragments in region to flush considering thread #13/20 = 2868159 thread 2868159 synch not required thread 2868159 has no fragments in region to flush considering thread #14/20 = 2868173 thread 2868173 synch not required thread 2868173 has no fragments in region to flush considering thread #15/20 = 2868182 thread 2868182 synch not required thread 2868182 has no fragments in region to flush considering thread #16/20 = 2868189 thread 2868189 synch not required thread 2868189 has no fragments in region to flush considering thread #17/20 = 2868190 thread 2868190 synch not required thread 2868190 has no fragments in region to flush considering thread #18/20 = 2868191 thread 2868191 synch not required thread 2868191 has no fragments in region to flush considering thread #19/20 = 2868192 thread 2868192 synch not required thread 2868192 has no fragments in region to flush considering thread #20/20 = 2868193 thread 2868193 synch not required thread 2868193 has no fragments in region to flush FLUSH STAGE 2: unlink_shared(thread 2868191): flusher is 2868191 flushing shared fragments vm_area_unlink_fragments 0x00007fca1c66f000..0x00007fca1c8df000 marking region 0x00007fca1c66f000..0x00007fca1c8df000 for deletion & unlinking all its frags Before removing vm area: 0x00007fc907de3000-0x00007fc907ffe000 ---- libnet.so 0x00007fc91c072000-0x00007fc91c283000 ---- libnio.so 0x00007fca1c66f000-0x00007fca1c8df000 W--- unexpected vm area 0x00007fcc5f670000-0x00007fcc5f671000 
---- ELF SO lib.so 0x00007fcc60627000-0x00007fcc60978000 ---- hsdis-amd64.so 0x00007fcc60b11000-0x00007fcc60d34000 ---- libzip.so 0x00007fcc6160a000-0x00007fcc6183a000 ---- libjava.so 0x00007fcc6183d000-0x00007fcc61a4e000 ---- libverify.so 0x00007fcc61a54000-0x00007fcc61a58000 ---- librt.so.1 0x00007fcc61b6c000-0x00007fcc61c06000 ---- libm.so.6 0x00007fcc61ca1000-0x00007fcc63287000 ---- libjvm.so 0x00007fcc63436000-0x00007fcc63583000 ---- libc.so.6 0x00007fcc635e1000-0x00007fcc635e3000 ---- libdl.so.2 0x00007fcc635e6000-0x00007fcc637ff000 ---- libjli.so 0x00007fcc63808000-0x00007fcc63817000 ---- libpthread.so.0 0x00007fcc6382f000-0x00007fcc63836000 ---- libnss_sss.so.2 0x00007fcc6383f000-0x00007fcc63864000 ---- ELF SO ld-linux-x86-64.so.2 0x00007ffe89515000-0x00007ffe89517000 ---- Private linux-vdso.so.1 Removing shared vm area 0x00007fca1c66f000-0x00007fca1c8df000 After removing vm area: 0x00007fc907de3000-0x00007fc907ffe000 ---- libnet.so 0x00007fc91c072000-0x00007fc91c283000 ---- libnio.so 0x00007fcc5f670000-0x00007fcc5f671000 ---- ELF SO lib.so 0x00007fcc60627000-0x00007fcc60978000 ---- hsdis-amd64.so 0x00007fcc60b11000-0x00007fcc60d34000 ---- libzip.so 0x00007fcc6160a000-0x00007fcc6183a000 ---- libjava.so 0x00007fcc6183d000-0x00007fcc61a4e000 ---- libverify.so 0x00007fcc61a54000-0x00007fcc61a58000 ---- librt.so.1 0x00007fcc61b6c000-0x00007fcc61c06000 ---- libm.so.6 0x00007fcc61ca1000-0x00007fcc63287000 ---- libjvm.so 0x00007fcc63436000-0x00007fcc63583000 ---- libc.so.6 0x00007fcc635e1000-0x00007fcc635e3000 ---- libdl.so.2 0x00007fcc635e6000-0x00007fcc637ff000 ---- libjli.so 0x00007fcc63808000-0x00007fcc63817000 ---- libpthread.so.0 0x00007fcc6382f000-0x00007fcc63836000 ---- libnss_sss.so.2 0x00007fcc6383f000-0x00007fcc63864000 ---- ELF SO ld-linux-x86-64.so.2 0x00007ffe89515000-0x00007ffe89517000 ---- Private linux-vdso.so.1 Unlinked 1 frags make_writable: pc 0x00007fcc1f682000 -> 0x00007fcc1f682000-0x00007fcc1f684000 0 make_unwritable: pc 0x00007fcc1f682000 
-> 0x00007fcc1f682000-0x00007fcc1f684000 Flushed 1 fragments from 0x00007fca1c66f000-0x00007fca1c8df000 make_writable: pc 0x00007fca1c66f000 -> 0x00007fca1c66f000-0x00007fca1c8df000 0 Removed 0x00007fca1c66f000-0x00007fca1c8df000 from exec list, continuing @ write
executable areas: 0x00007fc907de3000-0x00007fc907ffe000 ---- libnet.so 0x00007fc91c072000-0x00007fc91c283000 ---- libnio.so 0x00007fcc5f670000-0x00007fcc5f671000 ---- ELF SO lib.so 0x00007fcc60627000-0x00007fcc60978000 ---- hsdis-amd64.so 0x00007fcc60b11000-0x00007fcc60d34000 ---- libzip.so 0x00007fcc6160a000-0x00007fcc6183a000 ---- libjava.so 0x00007fcc6183d000-0x00007fcc61a4e000 ---- libverify.so 0x00007fcc61a54000-0x00007fcc61a58000 ---- librt.so.1 0x00007fcc61b6c000-0x00007fcc61c06000 ---- libm.so.6 0x00007fcc61ca1000-0x00007fcc63287000 ---- libjvm.so 0x00007fcc63436000-0x00007fcc63583000 ---- libc.so.6 0x00007fcc635e1000-0x00007fcc635e3000 ---- libdl.so.2 0x00007fcc635e6000-0x00007fcc637ff000 ---- libjli.so 0x00007fcc63808000-0x00007fcc63817000 ---- libpthread.so.0 0x00007fcc6382f000-0x00007fcc63836000 ---- libnss_sss.so.2 0x00007fcc6383f000-0x00007fcc63864000 ---- ELF SO ld-linux-x86-64.so.2 0x00007fcc638ba000-0x00007fcc63b86000 ---- ELF SO libdynamorio.so 0x00007ffe89515000-0x00007ffe89517000 ---- Private linux-vdso.so.1 0xffffffffff600000-0xffffffffff601000 ---- Private
thread areas: 0x00007fcc63436000-0x00007fcc63583000 ---- libc.so.6 0x00007fcc63808000-0x00007fcc63817000 ---- libpthread.so.0 FLUSH STAGE 3: end_synch(thread 2868191): flusher is 2868191 thread 2868191 signalling event 0x00007fca237a0dd0 thread 2868191 signalling event 0x00007fca1f6fcc68 saved xax 0x00007fca1c811c60 set next_tag to 0x00007fcc63573817, resuming in fcache_return transfer_from_sig_handler_to_fcache_return sigcontext @0x00007fca23b009e8: gs=0x0000 fs=0x0000 xdi=0x00007fca1c811c60 xsi=0x00007fca1c810360 xbp=0x00007fc91c584420 xsp=0x00007fc91c5843f8 xbx=0x00007fca1c811c60 xdx=0x00000000000000f0 xcx=0x00007fca1c810360 xax=0x00007fcc63bac84c r8=0x0000000000000000 r9=0x0000000000000000 r10=0x0000000000000000 r11=0x0000000000000293 r12=0x000000000000001e r13=0x00007fca18029230 r14=0x00007fca1819bc20 r15=0x00000000ffffffff trapno=0x000000000000000e err=0x0000000000000007 xip=0x00007fcc1f682e00 cs=0x0033 eflags=0x0000000000010202 cwd=0x000000000000037f swd=0x0000000000000000 twd=0x0000000000000000 fop=0x0000000000000000 rip=0x0000000000000000 rdp=0x0000000000000000 mxcsr=0x0000000000001f80 mxcsr_mask=0x000000000000ffff st0 = 0x00000000000000000000000000000000 st1 = 0x00000000000000000000000000000000 st2 = 0x00000000000000000000000000000000 st3 = 0x00000000000000000000000000000000 st4 = 0x00000000000000000000000000000000 st5 = 0x00000000000000000000000000000000 st6 = 0x00000000000000000000000000000000 st7 = 0x00000000000000000000000000000000 xmm0 = 0x08463b484916850f9090fff690909090 xmm1 = 0x082444c7badb100dc48b48500fe08348 xmm2 = 0x80ec81484800000078244489244c8948 xmm3 = 0x247489487c894848894c40244c382444 xmm4 = 0xd1f2684241c223440b41c08bc48348c2 xmm5 = 0x7fcc6274ff410000d12bf4d241da8b44 xmm6 = 0xe8bf4824cc62cdeb4800007f81039bbe xmm7 = 0x30244c892454894c5c894c28894c2024 xmm8 = 0x00000000000000000000000000000000 xmm9 = 0x00000000000000000000000000000000 xmm10 = 0x00000000000000000000000000000000 xmm11 = 0x00000000000000000000000000000000 xmm12 = 
0x00000000000000000000000000000000 xmm13 = 0x00000000000000000000000000000000 xmm14 = 0x00000000000000000000000000000000 xmm15 = 0x00000000000000000000000000000000 xstate_bv = 0x7 ymmh0 = 0024848955fffea048ec8b484810ec83 ymmh1 = 08f883487f840f5848000000d8246489 ymmh2 = 54894870894868244860245c50246c89 ymmh3 = 30244c892454894c5c894c28894c2024 ymmh4 = ba495d106383b00000007fccc3028541 ymmh5 = fac1cbfffbc1411fd333411fc4d32341 ymmh6 = 007fca1cd48b4800f0e483487e6aba49 ymmh7 = 4c18246410246c892474894c3c894c08 ymmh8 = 00000000000000000000000000000000 ymmh9 = 00000000000000000000000000000000 ymmh10 = 00000000000000000000000000000000 ymmh11 = 00000000000000000000000000000000 ymmh12 = 00000000000000000000000000000000 ymmh13 = 00000000000000000000000000000000 ymmh14 = 00000000000000000000000000000000 ymmh15 = 00000000000000000000000000000000 oldmask=0x0000000000000000 cr2=0x00007fca1c811c60 master_signal_handler 11 returning now to 0x00007fcc1f682e00
Exit from fragment via code mod thread 2868191 (flushtime 4444) walking pending deletion list (was_I_flushed==F0) Considering #0: 0x00007fca1c66f000..0x00007fca1c8df000 flushtime 4445 dec => ref_count is now 1, flushtime diff is 0 Considering #1: 0x00007fca1c66f000..0x00007fca1c8df000 flushtime 4444 (aborting now since rest have already been ok-ed) thread 2868191 done walking pending list @flushtime 4445 Flushed 0 frags
d_r_dispatch: target = 0x00007fcc63573817
Did not look in detail at logs but two thoughts:
gets two SIGUSR2 signals in a row
There was a bug fixed recently where DR would incorrectly nest a signal when the app did not set SA_NODEFER: #4998. Maybe worth testing w/ that fix if the issue seems to involve DR nesting and the app not handling nesting.
Second thought: DR uses SIGUSR2 for suspending threads. Maybe try swapping that to SIGUSR1 just to see if the issue involves a bug in DR's attempt to separate its own use from the app's.
If neither of those -- is the issue an app SIMD state discrepancy between signal queueing and delivery?
Did not look in detail at logs but two thoughts:
gets two SIGUSR2 signals in a row
There was a bug fixed recently where DR would incorrectly nest a signal when the app did not set SA_NODEFER: #4998. Maybe worth testing w/ that fix if the issue seems to involve DR nesting and the app not handling nesting.
It makes sense if the fix is in a build newer than DynamoRIO-Linux-8.0.18611-1. Is it?
Second thought: DR uses SIGUSR2 for suspending threads. Maybe try swapping that to SIGUSR1 just to see if the issue involves a bug in DR's attempt to separate its own use from the app's. I already tried changing the suspend/resume signal in the JVM (to SIGPROF=27) and am still observing the same behavior.
If neither of those -- is the issue an app SIMD state discrepancy between signal queueing and delivery?
It looks so. Before the SIGUSR2s' delivery the app SIMD state is valid, and after it the state is nullified. The diff in handling the second SIGUSR2 looks like this:
handle_suspend_signal: suspended now translate_from_synchall_to_dispatch: being translated from 0x00007fcc63b8322d handle_suspend_signal: awake now master_signal_handler 12 returning now to 0x00007fcc63b8322d
save_fpstate thread_set_self_context: pc=0x00007fcc1f683d71 full sigcontext Exit due to proactive reset
BTW, is it expected that a store instruction accessing heap memory causes multiple SEGVs? I suppose it is a method for detecting self-modifying code?
There was a bug fixed recently where DR would incorrectly nest a signal when the app did not set SA_NODEFER: #4998. Maybe worth testing w/ that fix if the issue seems to involve DR nesting and the app not handling nesting.
It makes sense if the fix is in a build newer than DynamoRIO-Linux-8.0.18611-1. Is it?
It is in 8.0.18824
Second thought: DR uses SIGUSR2 for suspending threads. Maybe try swapping that to SIGUSR1 just to see if the issue involves a bug in DR's attempt to separate its own use from the app's. I already tried changing the suspend/resume signal in the JVM (to SIGPROF=27) and am still observing the same behavior.
Try a non-itimer-associated signal? Thinking of #5017.
If neither of those -- is the issue an app SIMD state discrepancy between signal queueing and delivery?
It looks so. Before the SIGUSR2s' delivery the app SIMD state is valid, and after it the state is nullified. The diff in handling the second SIGUSR2 looks like this:
handle_suspend_signal: suspended now translate_from_synchall_to_dispatch: being translated from 0x00007fcc63b8322d handle_suspend_signal: awake now master_signal_handler 12 returning now to 0x00007fcc63b8322d
That looks like DR treating a SIGUSR2 as coming from itself rather than the app.
save_fpstate thread_set_self_context: pc=0x00007fcc1f683d71 full sigcontext Exit due to proactive reset
I would again disable reset to keep things simple. A reset will use SIGUSR2 to suspend all the threads.
BTW, is it expected that a store instruction accessing heap memory causes multiple SEGVs? I suppose it is a method for detecting self-modifying code?
A store accessing code-containing memory: yes. DR's invariant is that all code is either read-only or sandboxed, so it keeps it read-only and handles the fault on a write by the app. There is a threshold of fault instances at which point it will bail on using page protection and switch to sandboxing.
There was a bug fixed recently where DR would incorrectly nest a signal when the app did not set SA_NODEFER: #4998. Maybe worth testing w/ that fix if the issue seems to involve DR nesting and the app not handling nesting.
It makes sense if the fix is in a build newer than DynamoRIO-Linux-8.0.18611-1. Is it?
It is in 8.0.18824
With this build that memory corruption has not reproduced since yesterday. Continuing to reproduce the other issues.
Second thought: DR uses SIGUSR2 for suspending threads. Maybe try swapping that to SIGUSR1 just to see if the issue involves a bug in DR's attempt to separate its own use from the app's. I already tried changing the suspend/resume signal in the JVM (to SIGPROF=27) and am still observing the same behavior.
Try a non-itimer-associated signal? Thinking of #5017.
Clarified that in our scenario the JVM doesn't use this signal either.
If neither of those -- is the issue an app SIMD state discrepancy between signal queueing and delivery?
It looks so. Before the SIGUSR2s' delivery the app SIMD state is valid, and after it the state is nullified. The diff in handling the second SIGUSR2 looks like this: handle_suspend_signal: suspended now translate_from_synchall_to_dispatch: being translated from 0x00007fcc63b8322d handle_suspend_signal: awake now master_signal_handler 12 returning now to 0x00007fcc63b8322d
That looks like DR treating a SIGUSR2 as coming from itself rather than the app.
Correct. The logs say that some other app thread, being translated at a futex syscall, sends these two SIGUSR2s in a row.
save_fpstate thread_set_self_context: pc=0x00007fcc1f683d71 full sigcontext Exit due to proactive reset
I would again disable reset to keep things simple. A reset will use SIGUSR2 to suspend all the threads.
BTW, is it expected that a store instruction accessing heap memory causes multiple SEGVs? I suppose it is a method for detecting self-modifying code?
A store accessing code-containing memory: yes. DR's invariant is that all code is either read-only or sandboxed, so it keeps it read-only and handles the fault on a write by the app. There is a threshold of fault instances at which point it will bail on using page protection and switch to sandboxing.
Ok. Clear. The JVM generates a lot of dynamic code regions. Some of those regions can then be moved around in memory as well as patched in place (relocations, optimizations like https://en.wikipedia.org/wiki/Inline_caching). Having the JVM explicitly notify the DynamoRIO framework about changes to its dynamic code layout could simplify DynamoRIO's handling of some tricky corner cases as well as reduce runtime overhead in general (though that should be measured).
Having the JVM explicitly notify the DynamoRIO framework about changes to its dynamic code layout could simplify DynamoRIO's handling of some tricky corner cases as well as reduce runtime overhead in general (though that should be measured).
Yes, we had an academic paper on this and a branch in the code base, but there have not been resources to merge the branch into the mainline. See https://dynamorio.org/page_jitopt.html
Hi @derekbruening. I caught one more strange behavior under DynamoRIO. We have crashes inside java under DynamoRIO (they happen less often with the -debug flag) like SIGSEGV (0xb) at pc=0xffffffffffffffff, pid=1512988, tid=0x00007f8d7a0de700. I tried to understand where this magic value comes from, so I replaced all the -1 defines with unique values.
And got SIGSEGV (0xb) at pc=0xffffffffffffffc1, pid=1512988, tid=0x00007f8d7a0de700. So the magic value is FAKE_TAG. It looks like we remove a fragment from the indirect branch table (ftable->table[hindex].tag_fragment = FAKE_TAG;) but someone then tries to use that fragment. Maybe you have ideas about how that could happen? What could be wrong here? How could FAKE_TAG become the pc of the next executed instruction? Thanks, Kirill
FAKE_TAG should never be the target of execution: the target_delete IBL entry is used on IBL removal; a special handler handles NULL. The tag_fragment field is the app pc, not the code cache pc, so it is never an execution target. So this does not make sense to me. Ideally we could enable LBR and read the last N branches from within gdb to see exactly how -1 was reached.
FAKE_TAG should never be the target of execution: the target_delete IBL entry is used on IBL removal; a special handler handles NULL. The tag_fragment field is the app pc, not the code cache pc, so it is never an execution target. So this does not make sense to me. Ideally we could enable LBR and read the last N branches from within gdb to see exactly how -1 was reached.
Did you mean gdb could set up and read the LBR MSRs? Or how does gdb read them? Kirill
FAKE_TAG should never be the target of execution: the target_delete IBL entry is used on IBL removal; a special handler handles NULL. The tag_fragment field is the app pc, not the code cache pc, so it is never an execution target. So this does not make sense to me. Ideally we could enable LBR and read the last N branches from within gdb to see exactly how -1 was reached.
Did you mean gdb could set up and read the LBR MSRs? Or how does gdb read them?
No, just thinking out loud about a wishlist feature for a convenient way to enable LBR recording and then read it back, especially for this case where you're targeting bug discovery and the slight overhead from LBR won't matter. I don't know of a ready-made tool; we'd have to write our own. We could add it inside DR and dump the last LBR buffer at an unhandled app signal -- if we had kernel support to save it at signal time or something.
FAKE_TAG should never be the target of execution: the target_delete IBL entry is used on IBL removal; a special handler handles NULL. The tag_fragment field is the app pc, not the code cache pc, so it is never an execution target. So this does not make sense to me. Ideally we could enable LBR and read the last N branches from within gdb to see exactly how -1 was reached.
Below is a piece of the log around executing the faulty target 0xffffffffffffffc1. The log reports the target is not in the cache.
vm_area_add_fragment for F163450(0x00007f2bc90444d0)
linking new fragment F163450(0x00007f2bc90444d0)
transferring incoming links from existing future frag, flags=0x01004101
Freeing future fragment 0x00007f2bc90444d0
linking incoming links for F163450(0x00007f2bc90444d0)
linking outgoing links for F163450(0x00007f2bc90444d0)
future-linking F163450(0x00007f2bc90444d0).0x00007f2dd3728b5c -> (0x00007f2bc90446da)
future-linking F163450(0x00007f2bc90444d0).0x00007f2dd3728b62 -> (0x00007f2bc90444e9)
Fragment 163450, tag 0x00007f2bc90444d0, flags 0x9000630, shared, size 47:
-------- indirect branch target entry: --------
0x00007f2dd3728b38 67 65 48 a1 00 00 00 addr32 mov %gs:0x00, %rax
00
-------- prefix entry: --------
0x00007f2dd3728b40 65 48 8b 0c 25 10 00 mov %gs:0x10, %rcx
00 00
-------- normal entry: --------
0x00007f2dd3728b49 49 8b 36 mov (%r14), %rsi
0x00007f2dd3728b4c 48 8b 7e 08 mov 0x08(%rsi), %rdi
0x00007f2dd3728b50 8b bf a4 00 00 00 mov 0xa4(%rdi), %edi
0x00007f2dd3728b56 f7 c7 00 00 00 40 test %edi, $0x40000000
0x00007f2dd3728b5c 0f 84 7a 27 fb ff jz $0x00007f2dd36db2dc
0x00007f2dd3728b62 e9 75 27 fb ff jmp $0x00007f2dd36db2dc
-------- exit stub 0: -------- <target: 0x00007f2bc90446da> type: jmp/jcc
0x00007f2dd36db2dc 67 65 48 a3 00 00 00 addr32 mov %rax, %gs:0x00
00
0x00007f2dd36db2e4 48 b8 f0 22 b1 d7 2b mov $0x00007f2bd7b122f0, %rax
7f 00 00
0x00007f2dd36db2ee e9 0d 7b 4d ff jmp $0x00007f2dd2bb2e00 <fcache_return>
-------- exit stub 1: -------- <target: 0x00007f2bc90444e9> type: fall-through/speculated/IAT
0x00007f2dd36db2dc 67 65 48 a3 00 00 00 addr32 mov %rax, %gs:0x00
00
0x00007f2dd36db2e4 48 b8 f0 22 b1 d7 2b mov $0x00007f2bd7b122f0, %rax
7f 00 00
0x00007f2dd36db2ee e9 0d 7b 4d ff jmp $0x00007f2dd2bb2e00 <fcache_return>
Entry into F163450(0x00007f2bc90444d0).0x00007f2dd3728b49 (shared)
priv_mcontext_t @0x00007f2bd2c2a1c0
xax = 0x00007f2e1508b688
xbx = 0x00000000000000e8
xcx = 0x00000000000042b1
xdx = 0x00007f2bd12ba428
xsi = 0x0000000000000008
xdi = 0x00007f2bcc00e000
xbp = 0x00007f2e1508b6c8
xsp = 0x00007f2e1508b688
r8 = 0x00007f2bcc00e000
r9 = 0x00007f2e16085179
r10 = 0x00007f2e16915420
r11 = 0x0000000000000000
r12 = 0x0000000000000001
r13 = 0x00007f2bd11da450
r14 = 0x00007f2e1508b6d8
r15 = 0x00007f2bcc00e000
ymm0= 0xabababababababababababababababab00000000000000000000000000000000
ymm1= 0x000000000000000000000000ff00000000000000000000000000000000000000
ymm2= 0x20736f7065203d3c2928646e6166202900000000000000000000000000000000
ymm3= 0x706d6f4354656c694c6b7361006b636f00000000000000000000000000000000
ymm4= 0x2f2f2f2f2f2f2f2f2f2f2f2f2f2f2f2f00000000000000000000000000000000
ymm5= 0x02000000010000001a0000000200000000000000000000000000000000000000
ymm6= 0xa71400b82c4d1200bd0419120359130000000000000000000000000000000000
ymm7= 0x0000000000000000000000000000000000000000000000000000000000000000
ymm8= 0x6f6420004300656e706165482d6a624f00000000000000000000000000000000
ymm9= 0x0000000000000000000000000000000000000000000000000000000000000000
ymm10= 0x04123e0008131402203e18151204020200000000000000000000000000000000
ymm11= 0x0000000000000000000000000000000000000000000000000000000000000000
ymm12= 0x0000000000000000000000000000000000000000000000000000000000000000
ymm13= 0x0000000000000000000000000000000000000000000000000000000000000000
ymm14= 0x0000000000000000000000000000000000000000000000000000000000000000
ymm15= 0xcafebabecafebabecafebabecafebabe00000000000000000000000000000000
mxcsr=0x00001fa0
eflags = 0x0000000000000246
pc = 0x00007f2dd3728acd
Entry into F163450(0x00007f2bc90444d0).0x00007f2dd3728b49 (shared)
fcache_enter = 0x00007f2dd2bb2d00, target = 0x00007f2dd3728b49
wherewasi=9, go_native=0, last_exit_reset=0
Exit from F163450(0x00007f2bc90444d0).0x00007f2dd3728b5c (shared)
(target 0x00007f2bc90446da not in cache)
d_r_dispatch: target = 0x00007f2bc90446da
fcache_enter = 0x00007f2dd2bb2d00, target = 0x00007f2dd3728b49
fragment_prepare: remove F163450(0x00007f2bc90444d0) from indjmp_bb[17616] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163442(0x00007f2bc903d1e0) from indjmp_bb[53728] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163434(0x00007f2bc903d210) from indjmp_bb[53776] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163430(0x00007f2bc902db80) from indjmp_bb[56192] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163412(0x00007f2bc903ef10) from indjmp_bb[61200] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163408(0x00007f2bc90340a7) from indjmp_bb[16551] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163406(0x00007f2bc902a78f) from indjmp_bb[42895] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163402(0x00007f2bc902da70) from indjmp_bb[55920] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163400(0x00007f2bc9008274) from indjmp_bb[33396] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163390(0x00007f2bc903a450) from indjmp_bb[42064] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163378(0x00007f2bc9029ed0) from indjmp_bb[40656] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163376(0x00007f2bc902e047) from indjmp_bb[57415] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163364(0x00007f2bc9021632) from ret_bb[5682] (table addr 0x00007f2bd30cf080), set to 0x00007f2dd2bb363b
fragment_prepare: remove F163358(0x00007f2bc9021569) from ret_bb[5481] (table addr 0x00007f2bd30cf080), set to 0x00007f2dd2bb363b
fragment_prepare: remove F163352(0x00007f2bc9021670) from ret_bb[5744] (table addr 0x00007f2bd30cf080), set to 0x00007f2dd2bb363b
fragment_prepare: remove F163350(0x00007f2bc9021648) from ret_bb[5705] (table addr 0x00007f2bd30cf080), set to 0x00007f2dd2bb363b
fragment_prepare: remove F163339(0x00007f2bc902b54f) from indjmp_bb[46415] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
wherewasi=9, go_native=0, last_exit_reset=0
Exit from sourceless ibl: bb jmp*
(target 0xffffffffffffffc1 not in cache)
Thread 3049040 waiting for flush (flusher is 3049066 @flushtime 3973)
fragment_prepare: remove F163337(0x00007f2bc902b68f) from indjmp_bb[46735] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
thread 3049040 waiting for event 0x00007f2bd2c2cd00
fragment_prepare: remove F163335(0x00007f2bc902b5f0) from indjmp_bb[46576] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163333(0x00007f2bc902ca27) from indjmp_bb[51751] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163323(0x00007f2bc9033267) from indjmp_bb[12903] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163321(0x00007f2bc902c9a7) from indjmp_bb[51623] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163319(0x00007f2bc902f207) from indjmp_bb[61959] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163317(0x00007f2bc9029e2f) from indjmp_bb[40495] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163315(0x00007f2bc904392f) from indjmp_bb[14639] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163313(0x00007f2bc902b690) from indjmp_bb[46736] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163311(0x00007f2bc902f107) from indjmp_bb[61703] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163309(0x00007f2bc902f307) from indjmp_bb[62215] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163307(0x00007f2bc904388f) from indjmp_bb[14479] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163305(0x00007f2bc902ef07) from indjmp_bb[61191] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163303(0x00007f2bc9029f6f) from indjmp_bb[40815] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163301(0x00007f2bc902e247) from indjmp_bb[57927] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163299(0x00007f2bc902b5ef) from indjmp_bb[46575] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163295(0x00007f2bc902a6ef) from indjmp_bb[42735] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163285(0x00007f2bc90340a0) from indjmp_bb[16544] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163273(0x00007f2bc902a82f) from indjmp_bb[43055] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163271(0x00007f2bc9043890) from indjmp_bb[14480] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163259(0x00007f2bc9036430) from indjmp_bb[25648] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163257(0x00007f2bc902f430) from indjmp_bb[62512] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163255(0x00007f2bc902c5c7) from indjmp_bb[50631] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163253(0x00007f2bc9037527) from indjmp_bb[29991] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163241(0x00007f2bc9031647) from indjmp_bb[5703] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163239(0x00007f2bc902b550) from indjmp_bb[46416] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163209(0x00007f2bc9020a00) from indjmp_bb[2560] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163193(0x00007f2bc903cb4f) from indjmp_bb[52047] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163189(0x00007f2bc9043900) from indjmp_bb[14593] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163187(0x00007f2bc90435b0) from indjmp_bb[13744] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163185(0x00007f2bc902caa7) from indjmp_bb[51879] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
fragment_prepare: remove F163183(0x00007f2bc9007cc0) from indjmp_bb[31936] (table addr 0x00007f2bd32e3080), set to 0x00007f2dd2bb3a3b
thread 3049040 finished waiting for event 0x00007f2bd2c2cd00
Thread 3049040 resuming after flush
thread 3049040 (flushtime 3972) walking pending deletion list (was_I_flushed==F0)
Considering #0: 0x00007f2bc9000000..0x00007f2bc9270000 flushtime 3973
dec => ref_count is now 3, flushtime diff is 0
thread 3049040 done walking pending list @flushtime 3973
Flushed 0 frags
fragment_add_ibl_target tag 0xffffffffffffffc1, branch 2, F0
Table ret_bb, table 0x00007f2bd30cf080, mask 0x000000000000ffff
Table indcall_bb, table 0x00007f2bd31d2080, mask 0x00000000000ffff0
Table indjmp_bb, table 0x00007f2bd32e3080, mask 0x000000000000ffff
d_r_dispatch: target = 0xffffffffffffffc1
interp: start_pc = 0xffffffffffffffc1
check_thread_vm_area: pc = 0xffffffffffffffc1
application tried to execute from unreadable 0xffffffffffffffc1 is_allocated_mem=0 prot=0x0
Thread 3049040 call stack:
0xffffffffffffffc1
frame ptr 0x00007f2e1508b6b0 => parent 0x00007f2e1508b720, 0x00007f2bc9007cc0
frame ptr 0x00007f2e1508b720 => parent 0x00007f2e1508b7b0, 0x00007f2bc9008274
frame ptr 0x00007f2e1508b7b0 => parent 0x00007f2e1508b830, 0x00007f2bc90006a0
frame ptr 0x00007f2e1508b830 => parent 0x00007f2e1508b980, 0x00007f2e15a624fa
frame ptr 0x00007f2e1508b980 => parent 0x00007f2e1508b9c0, 0x00007f2e15d7cb32
frame ptr 0x00007f2e1508b9c0 => parent 0x00007f2e1508b9f0, 0x00007f2e15a61e3c
frame ptr 0x00007f2e1508b9f0 => parent 0x00007f2e1508bbd0, 0x00007f2e15ada344
frame ptr 0x00007f2e1508bbd0 => parent 0x00007f2e1508bda0, 0x00007f2e15af0f7f
frame ptr 0x00007f2e1508bda0 => parent 0x00007f2e1508be50, 0x00007f2e16b1df1b
frame ptr 0x00007f2e1508be50 => parent 0x0000000000000000, 0x00007f2e16d3a299
SYSLOG_WARNING: Application tried to execute from unreadable memory 0xffffffffffffffc1.
This may be a result of an unsuccessful attack or a potential application vulnerability.
record_pending_signal(11) from DR at pc 0xffffffffffffffc1
action is not SIG_IGN
retaddr = 0x00007f2e16d44a20
copy_frame_to_pending from 0x00007f2bd30a86d0
sigcontext:
gs=0x0000
fs=0x0000
xdi=0x00007f2bcc00e000
xsi=0x00007f2e1508b668
xbp=0x00007f2e1508b6b0
xsp=0x00007f2e1508b668
xbx=0x000000000000001b
xdx=0x0000000000000000
xcx=0x000000000000001f
xax=0x00000000ffffffff
r8=0x00007f2bcc00e000
r9=0x00007f2e16085179
r10=0x00007f2e16912c20
r11=0x0000000000000000
r12=0x0000000000000001
r13=0x00007f2bd15d9553
r14=0x00007f2e1508b6d8
r15=0x00007f2bcc00e000
trapno=0x0000000000000000
err=0x0000000000000000
xip=0xffffffffffffffc1
cs=0x0033
eflags=0x0000000000000206
cwd=0x0000000000000000
swd=0x0000000000000000
twd=0x0000000000000000
fop=0x0000000000000000
rip=0x0000000000000000
rdp=0x0000000000000000
mxcsr=0x0000000000000000
mxcsr_mask=0x0000000000000000
st0 = 0x00000000000000000000000000000000
st1 = 0x00000000000000000000000000000000
st2 = 0x00000000000000000000000000000000
st3 = 0x00000000000000000000000000000000
st4 = 0x00000000000000000000000000000000
st5 = 0x00000000000000000000000000000000
st6 = 0x00000000000000000000000000000000
st7 = 0x00000000000000000000000000000000
transfer_to_dispatch: pc=0xffffffc1, xsp=0x00007f2e1508b668, on-initstack=0
wherewasi=6, go_native=0, last_exit_reset=0
Exit from asynch event
receive_pending_signal
receiving signal 11
get_sigstack_frame_ptr: using app xsp 0x00007f2e1508b668
placing frame at 0x00007f2e1508afa8
execute_handler_from_dispatch for signal 11
original sigcontext 0x00007f2bd30bf070:
gs=0x0000
fs=0x0000
xdi=0x00007f2bcc00e000
FAKE_TAG should never be the target of execution: the target_delete IBL entry is used on IBL removal; a special handler handles NULL. The tag_fragment field is the app pc, not the code cache pc, so it is never an execution target. So this does not make sense to me. Ideally we could enable LBR and read the last N branches from within gdb to see exactly how -1 was reached.
Below is a piece of the log around the execution of the faulty target 0xffffffffffffffc1. The log reports it is not in the cache.
Exit from sourceless ibl: bb jmp* (target 0xffffffffffffffc1 not in cache)
Ah ok I interpreted the earlier comments as the actual PC being 0xffffffffffffffc1 from a branch going there and the goal was to figure out how control got there (hence the LBR mention): instead it's DR thinking the app wants to go to 0xffffffffffffffc1 which is a different thing. Presumably a bug somewhere in the deletion ordering as you suggested.
Hi, @derekbruening.
Looks like next_tag is changed in dynamically generated code in dynamorio/core/arch/x86/emit_utils.c,
function emit_indirect_branch_lookup:
As I understand it, we save the address of frame_tag from the lookup table to SCRATCH_REG2 here:
APP(&ilist, XINST_CREATE_load(dcontext, opnd_create_reg(SCRATCH_REG1), OPND_CREATE_MEMPTR(SCRATCH_REG2, FRAGMENT_TAG_OFFS)));
If I save it to my next_tag duplicate at that point, it looks OK (something like 0x7f52d6c3388b):
(APP(&ilist, SAVE_TO_DC(dcontext, SCRATCH_REG2, NEXT_TAG_OFFSET_KUHANOV));)
(next_tag_kuhanov is an app_pc field I added to struct _dcontext_t.)
But when we save SCRATCH_REG2 to next_tag, it becomes 0xffffffffffffffc1:
APP(&ilist,SAVE_TO_DC(dcontext, SCRATCH_REG2, NEXT_TAG_OFFSET));
What could change SCRATCH_REG2? The next_tag_kuhanov value is not removed from the IBL table, judging by the logs. Do you have any ideas here?
Another question: is it possible to dump the disassembly of generated code? What is the log option for that? Thanks, Kirill
As I understand it, we save the address of frame_tag from the lookup table to SCRATCH_REG2 here:
APP(&ilist, XINST_CREATE_load(dcontext, opnd_create_reg(SCRATCH_REG1), OPND_CREATE_MEMPTR(SCRATCH_REG2, FRAGMENT_TAG_OFFS)));
You mean to SCRATCH_REG1? Which part of the IBL is that: target_delete? The comment says it's moved from REG1 to REG2 later. Does that not happen? May be easier to look at the printed gencode.
Another question: is it possible to dump the disassembly of generated code? What is the log option for that?
-loglevel 3
or -gendump
: https://github.com/DynamoRIO/dynamorio/blob/master/core/arch/arch.c#L566
Looks like we have FAKE_TAG in the rcx register from the beginning. We try to find it in the IB lookup table, jump to the fragment_not_found label, and save it to the dcontext next_tag. Why would we be looking up FAKE_TAG in the indirect_branch_lookup_routine call? Thanks, Kirill
Dump code. TAG 0x00007f4c29edc980
+0 m4 @0x00007f4a29f29010 65 48 a3 00 00 00 00 mov %rax -> %gs:0x00[8byte]
00 00 00 00
+11 m4 @0x00007f4a29f28f90 9f lahf -> %ah
+12 m4 @0x00007f4a29f28f10 0f 90 c0 seto -> %al
+15 m4 @0x00007f4a29f28e90
...
+123 m4 @0x00007f4a29f28280 48 8b 19 mov (%rcx)[8byte] -> %rbx
The failing place is +123 m4 @0x00007f4a29f28280 48 8b 19 mov (%rcx)[8byte] -> %rbx: here we move FAKE_TAG to rbx. Looks like something is wrong with target_delete_entry. Kirill
I believe it's a little easier to read w/ the labels:
shared_trace_ibl_ret:
0x00007fde85618480 65 48 a3 00 00 00 00 mov %rax -> %gs:0x00[8byte]
00 00 00 00
0x00007fde8561848b 9f lahf -> %ah
0x00007fde8561848c 0f 90 c0 seto -> %al
shared_trace_cmp_ret:
0x00007fde8561848f 65 48 89 1c 25 08 00 mov %rbx -> %gs:0x08[8byte]
00 00
0x00007fde85618498 48 8b d9 mov %rcx -> %rbx
0x00007fde8561849b 65 48 23 0c 25 28 00 and %gs:0x28[8byte] %rcx -> %rcx
00 00
0x00007fde856184a4 48 03 c9 add %rcx %rcx -> %rcx
0x00007fde856184a7 48 03 c9 add %rcx %rcx -> %rcx
0x00007fde856184aa 48 03 c9 add %rcx %rcx -> %rcx
0x00007fde856184ad 48 03 c9 add %rcx %rcx -> %rcx
0x00007fde856184b0 65 48 03 0c 25 30 00 add %gs:0x30[8byte] %rcx -> %rcx
00 00
0x00007fde856184b9 48 39 19 cmp (%rcx)[8byte] %rbx
0x00007fde856184bc 75 0c jnz $0x00007fde856184ca
0x00007fde856184be 65 48 8b 1c 25 08 00 mov %gs:0x08[8byte] -> %rbx
00 00
0x00007fde856184c7 ff 61 08 jmp 0x08(%rcx)[8byte]
0x00007fde856184ca 48 83 39 00 cmp (%rcx)[8byte] $0x0000000000000000
0x00007fde856184ce 74 06 jz $0x00007fde856184d6
0x00007fde856184d0 48 8d 49 10 lea 0x10(%rcx) -> %rcx
0x00007fde856184d4 eb e3 jmp $0x00007fde856184b9
0x00007fde856184d6 48 83 79 08 01 cmp 0x08(%rcx)[8byte] $0x0000000000000001
0x00007fde856184db 75 1a jnz $0x00007fde856184f7
0x00007fde856184dd 65 48 8b 0c 25 30 00 mov %gs:0x30[8byte] -> %rcx
00 00
0x00007fde856184e6 e9 ce ff ff ff jmp $0x00007fde856184b9
shared_delete_trace_ibl_ret:
0x00007fde856184eb 65 48 89 1c 25 08 00 mov %rbx -> %gs:0x08[8byte]
00 00
0x00007fde856184f4 48 8b 19 mov (%rcx)[8byte] -> %rbx
0x00007fde856184f7 48 8b cb mov %rbx -> %rcx
0x00007fde856184fa 65 48 8b 1c 25 08 00 mov %gs:0x08[8byte] -> %rbx
00 00
shared_trace_cmp_unlinked_ret:
0x00007fde85618503 04 7f add $0x7f %al -> %al
0x00007fde85618505 9e sahf %ah
0x00007fde85618506 65 48 a1 00 00 00 00 mov %gs:0x00[8byte] -> %rax
00 00 00 00
shared_unlinked_trace_ibl_ret:
0x00007fde85618511 65 48 89 3c 25 18 00 mov %rdi -> %gs:0x18[8byte]
00 00
0x00007fde8561851a 65 48 8b 3c 25 20 00 mov %gs:0x20[8byte] -> %rdi
00 00
0x00007fde85618523 48 89 47 38 mov %rax -> 0x38(%rdi)[8byte]
0x00007fde85618527 48 89 8f 38 09 00 00 mov %rcx -> 0x00000938(%rdi)[8byte]
0x00007fde8561852e 48 b8 c4 bb 75 c9 de mov $0x00007fdec975bbc4 -> %rax
7f 00 00
0x00007fde85618538 48 8b 4f 38 mov 0x38(%rdi)[8byte] -> %rcx
0x00007fde8561853c 65 48 89 0c 25 00 00 mov %rcx -> %gs:0x00[8byte]
00 00
0x00007fde85618545 65 48 8b 0c 25 10 00 mov %gs:0x10[8byte] -> %rcx
00 00
0x00007fde8561854e 65 48 8b 3c 25 18 00 mov %gs:0x18[8byte] -> %rdi
00 00
0x00007fde85618557 e9 e4 fd ff ff jmp $0x00007fde85618340
The target_delete path (shared_delete_trace_ibl_ret here, for thread-shared, traces (not bbs), and returns) is only reached when a fragment is deleted: another thread has already found its entry in the lookup table, and the thread doing the deleting has modified that entry to have a target cache PC of this target_delete entry point, in order to exit the cache.
What I'm wondering about is probably also what you are talking about: in the code in fragment_prepare_for_removal_from_table(), the deleting thread also changes the tag of the lookup table entry to FAKE_TAG. But that is where the target_delete path gets the tag from, as it has lost the original. I agree it seems wrong -- puzzling over whether I'm missing something...
If you disable that line of code, I'm not sure what the consequences are. It might fix this bug but cause consistency complaints about 2 identical tags until the entry is fully removed?
Hi, @derekbruening.
Which line did you mean to disable? This one?
ftable->table[hindex].tag_fragment = FAKE_TAG;
Looks like the crash has disappeared, but the overhead increased too (~9 secs vs ~7 secs).
Thanks, Kirill
Which line did you mean to disable? This one?
ftable->table[hindex].tag_fragment = FAKE_TAG;
Yes.
Looks like the crash has disappeared, but the overhead increased too (~9 secs vs ~7 secs).
Are there other crashes/problems or does this make everything work?
Maybe it's slower b/c it keeps finding the deleted one instead of the replacement? Seems like the target_delete tag retrieval scheme is just fundamentally flawed, to assume it can get it from the lookup table entry. Hmm. Wondering if there's something we're missing, something that changed after that was implemented, or if it was a mistake from the beginning.
Not remembering how final freeing works: once every thread has exited the cache it will finally pull the entry completely?
But aren't the indirect branch target tables thread-private by default (-shared_trace_ibt_tables)? So there should just be one exit and then the entry should be removed?
Which line did you mean to disable? This one?
ftable->table[hindex].tag_fragment = FAKE_TAG;
Yes.
Looks like the crash has disappeared, but the overhead increased too (~9 secs vs ~7 secs).
Are there other crashes/problems or does this make everything work?
No, this is not the last issue. ))
But aren't the indirect branch target tables thread-private by default (-shared_trace_ibt_tables)? So there should just be one exit and then the entry should be removed?
Looks like yes, the table is per-thread (-shared_trace_ibt_tables is not set).
Hi, @derekbruening! Right now we assume that everything goes wrong because tag_fragment = FAKE_TAG, so we added a bool variable to fragment_entry_t indicating that the fragment is prepared for deletion. This way tag_fragment won't be clobbered and the fragment will still be deleted, but we see DR crashes. Why can't we do that?
Also, we can't find where DR starts to execute tag_fragment. Which logmask should help us? Or maybe you know the function/file where DR does that.
Another strange thing: we jump to this tag only if (%rcx) contains FAKE_TAG:
shared_delete_trace_ibl_ret:
0x00007fde856184eb 65 48 89 1c 25 08 00 mov %rbx -> %gs:0x08[8byte]
00 00
0x00007fde856184f4 48 8b 19 mov (%rcx)[8byte] -> %rbx
In successful runs we don't jump to this point. Kirill
so we added a bool variable to fragment_entry_t indicating that the fragment is prepared for deletion. This way tag_fragment won't be clobbered and the fragment will still be deleted, but we see DR crashes. Why can't we do that?
fragment_entry_t's size and layout are used by generated code, so unless you also changed the generated code, purely changing the C struct would indeed cause problems. If you did change the gencode: we want good alignment so we can't pack the bool; so it increases the size. As long as it's increasing the size, I'd double-store the tag instead of a bool and set the 1st one to FAKE_TAG so lookups fail and use the 2nd to recover on the target_delete path. But first I'd want to understand the slowdown, and revisit the shared fragment deletion steps and see if target_delete could be removed altogether (since a thread already in there has to be handled anyway: so maybe only the tag should be changed; maybe missing something though).
Also, we can't find where DR starts to execute tag_fragment. Which logmask should help us? Or maybe you know the function/file where DR does that.
Not sure which path you mean: on an indirect branch table hit, the gencode itself jumps to the start_pc_fragment in the code cache. On a miss (which target_delete is: it shares the miss/exit path) it propagates the tag_fragment back to dispatch as dcontext->next_tag.
Comparing VTune results for 2 runs: without and with FAKE_TAG.
Function / Call Stack | CPU Time: Difference | CPU Time: W/o FAKE_TAG | CPU Time: W FAKE_TAG
-- | -- | -- | --
hashtable_fragment_lookup | 0.227566 | 0.330823 | 0.103257
mutex_testlock | 0.176439 | 0.205511 | 0.0290723
atomic_dec_becomes_zero | 0.151376 | 0.168419 | 0.0170424
atomic_dec_and_test | 0.117292 | 0.139346 | 0.0220548
atomic_compare_exchange_int | 0.110274 | 0.131327 | 0.0210523
atomic_compare_exchange_int | 0.075187 | 0.0902243 | 0.0150374
is_thread_tls_initialized | 0.071177 | 0.0832069 | 0.0120299
dispatch_exit_fcache_stats | 0.063157 | 0.071177 | 0.00801994
enter_nolinking | 0.06015 | 0.0641595 | 0.00400997
fragment_add_ibl_target | 0.054135 | 0.0541346 | 0
fragment_lookup_fine_and_coarse | 0.049122 | 0.0551371 | 0.00601496
check_flush_queue | 0.047117 | 0.0491221 | 0.00200499
get_thread_private_dcontext | 0.046115 | 0.0561396 | 0.0100249
fragment_lookup_type | 0.04411 | 0.0441097 | 0
hashtable_ibl_lookup | 0.0401 | 0.0411022 | 0.00100249
atomic_dec_and_test | 0.0401 | 0.0601496 | 0.0200499
dispatch_enter_dynamorio | 0.0401 | 0.0541346 | 0.0140349
d_r_dispatch | 0.039097 | 0.0441097 | 0.00501246
d_r_read_lock | 0.030075 | 0.0340848 | 0.00400997
enter_couldbelinking | 0.029072 | 0.0300748 | 0.00100249
safe_read_tls_magic | 0.025062 | 0.0310773 | 0.00601496
We are seeing that SPECjvm 2008 runs won't even start the warm-up phase when launched with drrun. Typically SPECjvm runs look like this:
With drrun we never get to this first message. I do see two threads running for a short period, but I'm not convinced the run is successful, since it never gets to the warm-up and execution phases of the test. Also, memory utilization is roughly 11GB, which is quite high for sparse.small.
Attached: debug loglevel 3 log for the java pid. java.log.zip
java.0.59824.zip