rkgithubs opened 5 years ago
if (instr_is_exclusive_store(instr)) count++;
It is possible to have non-strictly-paired one-load-to-one-store, and to have dynamic paths that only execute one and not the other (and DR does try to handle all that): but I assumed real app code would never do that. Do you have an example of code that does not dynamically execute strict ldex-stex pairs? Or maybe it's a race with another thread coming in for its ldex in between the ldex-stex of the first thread.
Added dumping of all ilists in mangle_exclusive_monitor_op. Got another hang when a few CPUs are occupied by threads at 100%. In this case, the last records in the log are:
TAG 0x0000fff921ea44c8
+0 L3 @0x0000fff921ea6530 52800020 movz $0x0001 lsl $0x00 -> %w0
+4 L3 @0x0000fff921ea44c8 885ffe81 ldaxr (%x20)[4byte] -> %w1
+8 L3 @0x0000fff921ea4298 7100003f subs %w1 $0x0000 lsl $0x00 -> %wzr
+12 L3 @0x0000fff921ea5a88 54000061 b.ne $0x0000ffffa5515410
+16 L4 @0x0000fff921ea76b0 14000000 b $0x0000ffffa5515408
END 0x0000fff921ea44c8
COUNT = 145 load_store = 5
TAG 0x0000fff921ea5a88
+0 L3 @0x0000fff921ea76b0 52800020 movz $0x0001 lsl $0x00 -> %w0
+4 L3 @0x0000fff921ea5a88 885ffe81 ldaxr (%x20)[4byte] -> %w1
+8 L3 @0x0000fff921ea4298 7100003f subs %w1 $0x0000 lsl $0x00 -> %wzr
+12 L3 @0x0000fff921ea7930 54000061 b.ne $0x0000ffffa5515410
+16 L4 @0x0000fff921fb6900 14000000 b $0x0000ffffa5515408
END 0x0000fff921ea5a88
I'm not sure: is this possible?
Trying to catch the original issue with hanging on futexes. Kirill
Hi @derekbruening. It looks like it is hard to pin down the count of mangled instructions: sometimes the hang happens when we mangle 150, sometimes 250. I have a workload that passes 99 times out of 100 and fails on one run. I printed all blocks from mangle_exclusive_monitor_op and caught that the hang happened when we have the same blocks in the output; good runs didn't have repeated bbs. For example, a failing log includes:
121 TAG 0x0000fff9293b8108
+0 L3 @0x0000fff9293b8108 88027e60 stxr %w0 -> (%x19)[4byte] %w2
+4 L3 @0x0000fff9293b8308 35ffff82 cbnz $0x0000ffffaceeed44 %w2
+8 L4 @0x0000fff9293b7b88 14000000 b $0x0000ffffaceeed58
END 0x0000fff9293b8108
289 TAG 0x0000fff929a116b0
+0 L3 @0x0000fff929a116b0 88027e60 stxr %w0 -> (%x19)[4byte] %w2
+4 L3 @0x0000fff929a10cf0 35ffff82 cbnz $0x0000ffffaceeed44 %w2
+8 L4 @0x0000fff92a3cd900 14000000 b $0x0000ffffaceeed58
END 0x0000fff929a116b0
290 TAG 0x0000fff92a3cd900
+0 L3 @0x0000fff92a3cd900 88027e60 stxr %w0 -> (%x19)[4byte] %w2
+4 L3 @0x0000fff929a10cf0 35ffff82 cbnz $0x0000ffffaceeed44 %w2
+8 L4 @0x0000fff929a10530 14000000 b $0x0000ffffaceeed58
END 0x0000fff92a3cd900
One more example:
187 TAG 0x0000fff8fdc41270
+0 L3 @0x0000fff8fdc41270 c804fc02 stlxr %x2 -> (%x0)[8byte] %w4
+4 L3 @0x0000fff8fdc40a30 35ffff84 cbnz $0x0000ffff8054a118 %w4
+8 L4 @0x0000fff8fdc42a30 14000000 b $0x0000ffff8054a12c
END 0x0000fff8fdc41270
278 TAG 0x0000fff8fde13ad0
+0 L3 @0x0000fff8fde13ad0 c804fc02 stlxr %x2 -> (%x0)[8byte] %w4
+4 L3 @0x0000fff8fde13c90 35ffff84 cbnz $0x0000ffff8054a118 %w4
+8 L4 @0x0000fff8fde13810 14000000 b $0x0000ffff8054a12c
END 0x0000fff8fde13ad0
279 TAG 0x0000fff8fde13810
+0 L3 @0x0000fff8fde13810 c804fc02 stlxr %x2 -> (%x0)[8byte] %w4
+4 L3 @0x0000fff8fde13c90 35ffff84 cbnz $0x0000ffff8054a118 %w4
+8 L4 @0x0000fff8fdc421f0 14000000 b $0x0000ffff8054a12c
END 0x0000fff8fde13810
Thx, Kirill
By themselves those don't look unusual: no XZR, no stolen register x28. You can see our tests here to see if anything is missing: https://github.com/DynamoRIO/dynamorio/blob/master/suite/tests/client-interface/ldstex.c#L278
Does the corresponding ldxr look unusual?
The harder question to answer is whether this code sequence requires the monitor rather than the compare-and-swap we turn it into. See https://dynamorio.org/page_ldstex.html#autotoc_md195. Ideally it would be tracked back to the Java source code to help figure that out.
The strange thing is that good runs do not have such patterns, where similar blocks are mangled 3 times. :( But ALL my failures have them: the blocks can be different, can be a load, can be a store (see a couple of my previous comments). It's not clear to me why they are built, and why there is something like this in good runs.
Does the corresponding ldxr look unusual?

Hi, @derekbruening. It is absent entirely. Here is how it looks in a good case:
COUNT = 91 tid = 92215 load_store = 4
TAG 0x0000fff925bf7208
+0 L3 @0x0000fff925bf8cc8 d10083ff sub %sp $0x0020 lsl $0x00 -> %sp
+4 L3 @0x0000fff925bf9bf0 f9000fe0 str %x0 -> +0x18(%sp)[8byte]
+8 L3 @0x0000fff925bf7570 f9000be1 str %x1 -> +0x10(%sp)[8byte]
+12 L3 @0x0000fff925bf6e70 f90007e2 str %x2 -> +0x08(%sp)[8byte]
+16 L3 @0x0000fff925bf7308 f94007e1 ldr +0x08(%sp)[8byte] -> %x1
+20 L3 @0x0000fff925bf7d30 f9400fe2 ldr +0x18(%sp)[8byte] -> %x2
+24 L3 @0x0000fff925bf9370 f9400be0 ldr +0x10(%sp)[8byte] -> %x0
+28 L3 @0x0000fff925bf7208 c85f7c03 ldxr (%x0)[8byte] -> %x3
+32 L3 @0x0000fff925bf9a30 eb01007f subs %x3 %x1 lsl $0x00 -> %xzr
+36 L3 @0x0000fff925bf7a30 54000061 b.ne $0x0000ffffa850112c
+40 L4 @**0x0000fff925bf8270** 14000000 b $0x0000ffffa8501124
END 0x0000fff925bf7208
COUNT = 92 tid = 92215 load_store = 3 STORE
TAG 0x0000fff925bf8270
+0 L3 @0x0000fff925bf8270 c804fc02 stlxr %x2 -> (%x0)[8byte] %w4
+4 L3 @0x0000fff925bf7a30 35ffff84 cbnz $0x0000ffffa8501118 %w4
+8 L4 @0x0000fff925bf9a30 14000000 b $0x0000ffffa850112c
END 0x0000fff925bf8270
But for the problematic thread this is often the 1st bb. For example, for tid=**92486**:
COUNT = 139 tid = 92486 load_store = 2 STORE
TAG 0x0000fff925dc8a40
+0 L3 @0x0000fff925dc8a40 c804fc02 stlxr %x2 -> (%x0)[8byte] %w4
+4 L3 @0x0000fff925bf8970 35ffff84 cbnz $0x0000ffffa8501118 %w4
+8 L4 @0x0000fff925bf8cc8 14000000 b $0x0000ffffa850112c
END 0x0000fff925dc8a40
COUNT = 140 tid = 92486 load_store = 1 STORE
TAG 0x0000fff925bf8cc8
+0 L3 @0x0000fff925bf8cc8 c804fc02 stlxr %x2 -> (%x0)[8byte] %w4
+4 L3 @0x0000fff925bf8970 35ffff84 cbnz $0x0000ffffa8501118 %w4
+8 L4 @0x0000fff925bfa6e8 14000000 b $0x0000ffffa850112c
END 0x0000fff925bf8cc8
Another thing that is strange to me: I think we must have the correct load-store order per thread (load1-store1, load2-store2, and so on), but the logs can include a few loads without a store or, as we have on the hang, a store without a load.
Maybe DynamoRIO missed the needed blocks for some reason.
Kirill
Ran another experiment here: I mangled the exclusive load-store only in case both are inside the same bb. So, I skipped a load or store if it is without its pair, like:
+0 L3 @0x0000fff925bf8cc8 c804fc02 stlxr %x2 -> (%x0)[8byte] %w4
+4 L3 @0x0000fff925bf8970 35ffff84 cbnz $0x0000ffffa8501118 %w4
+8 L4 @0x0000fff925bfa6e8 14000000 b $0x0000ffffa850112c
Maybe we need to skip splitting at a branch inside monitor regions. Kirill
Ran another experiment here: I mangled the exclusive load-store only in case both are inside the same bb. So, I skipped a load or store if it is without its pair
What was the result of the experiment?
Unpaired loads and unpaired stores are in the regression test: https://github.com/DynamoRIO/dynamorio/blob/master/suite/tests/client-interface/ldstex.c#L625
The sample expansion at https://dynamorio.org/page_ldstex.html#autotoc_md195 makes it easier to think about: you can see it compares the address and size with the values stored in TLS slots by the load. So if there's no load, those are likely to fail. But it does seem there is a chance the slot values could happen to match (the size could easily match if a prior pair set it; the address is much less likely but possible), in which case the unpaired store would succeed as though it had acquired the monitor. It looks like the test code has a clrex before the unpaired stores, which guarantees they will all fail, so this case is not in the test. So that is one issue: an unpaired store might succeed under DR but fail natively. Your experiment will say whether this matters for this app.
What was the result of the experiment?
No hangs in this case.
So your theory is that in some runs the app executes unpaired stores with addresses that happen to match a prior load-store pair and thus hits the case I described where the unpaired store succeeds under DR and that causes the app to somehow hang?
thus hits the case I described where the unpaired store succeeds

Something like this.
I patched DynamoRIO again and prohibited splitting a bb at a branch if it was inside a monitor region. For example, a bb looks like:
TAG 0x0000fff9211aaa08
+0 L3 @0x0000fff9211acb78 d10083ff sub %sp $0x0020 lsl $0x00 -> %sp
+4 L3 @0x0000fff9211abf18 f9000fe0 str %x0 -> +0x18(%sp)[8byte]
+8 L3 @0x0000fff9211acbe0 f9000be1 str %x1 -> +0x10(%sp)[8byte]
+12 L3 @0x0000fff9211ab570 f90007e2 str %x2 -> +0x08(%sp)[8byte]
+16 L3 @0x0000fff9211ad290 f94007e1 ldr +0x08(%sp)[8byte] -> %x1
+20 L3 @0x0000fff9211ab2b0 f9400fe2 ldr +0x18(%sp)[8byte] -> %x2
+24 L3 @0x0000fff9211ab0c0 f9400be0 ldr +0x10(%sp)[8byte] -> %x0
+28 L3 @0x0000fff9211aaa08 c85f7c03 ldxr (%x0)[8byte] -> %x3
+32 L3 @0x0000fff9211ab318 eb01007f subs %x3 %x1 lsl $0x00 -> %xzr
+36 L3 @0x0000fff9211ad010 54000061 b.ne $0x0000ffffa26ba12c
+40 L3 @0x0000fff9211ab4a8 c804fc02 stlxr %x2 -> (%x0)[8byte] %w4
+44 L3 @0x0000fff9211ad610 35ffff84 cbnz $0x0000ffffa26ba118 %w4
+48 L4 @0x0000fff9211ad6d0 14000000 b $0x0000ffffa26ba12c
END 0x0000fff9211aaa08
And again I got a failure when one thread mangles a bb with a load-store pair:
COUNT = 141 tid = 83036 load_store = 1.
TAG 0x0000fff9211ac0b0
+0 L3 @0x0000fff9211ac0b0 c85f7c03 ldxr (%x0)[8byte] -> %x3
+4 L3 @0x0000fff9211abdd8 eb01007f subs %x3 %x1 lsl $0x00 -> %xzr
+8 L3 @0x0000fff9211ab7f0 54000061 b.ne $0x0000ffffa26ba12c
+12 L3 @0x0000fff9211ab4a8 c804fc02 stlxr %x2 -> (%x0)[8byte] %w4
+16 L3 @0x0000fff9211ac030 35ffff84 cbnz $0x0000ffffa26ba118 %w4
+20 L4 @0x0000fff9211aac08 14000000 b $0x0000ffffa26ba12c
END 0x0000fff9211ac0b0
And then got a store-only bb the next time:
COUNT = 151 tid = 83036 load_store = -1 SKIP STORE
TAG 0x0000fff9211ab318
+0 L3 @0x0000fff9211ab318 c804fc02 stlxr %x2 -> (%x0)[8byte] %w4
+4 L3 @0x0000fff9211ad010 35ffff84 cbnz $0x0000ffffa26ba118 %w4
+8 L4 @0x0000fff9211ab570 14000000 b $0x0000ffffa26ba12c
END 0x0000fff9211ab318
It is not clear to me: all bbs have load-store pairs, but the last one is just a store. How is that possible?
In good runs there is no such store-only bb.
Full dump of bbs: java-full-fail.txt. Kirill
Rechecked my logs. Some good runs also include such store-only regions, but then it is the 1st bb for that thread, or the previous one includes a different store and different branch addresses:
COUNT = 120 tid = 70076 load_store = 0 STORE
TAG 0x0000fff930d9e048
+0 L4 @0x0000000000000000 c8dffc43 ldar (%x2)[8byte] -> %x3
+4 m4 @0x0000000000000000 c8dffc43 <label>
+4 L3 @0x0000ffffb48d002c eb00007f subs %x3 %x0 lsl $0x00 -> %xzr
+8 L3 @0x0000ffffb48d0030 54000061 b.ne $0x0000ffffb48d003c
+12 L3 @0x0000ffffb48d0034 c8057c41 stxr %x1 -> (%x2)[8byte] %w5
+16 L3 @0x0000ffffb48d0038 35ffff85 cbnz $0x0000ffffb48d0028 %w5
+20 L4 @0x0000000000000000 14000000 b $0x0000ffffb48d003c
END 0x0000fff930d9e048
COUNT = 126 tid = 70076 load_store = -1 SKIP STORE
TAG 0x0000fff930f730c0
+0 L3 @0x0000ffffb36afee4 8803fc02 stlxr %w2 -> (%x0)[4byte] %w3
+4 L3 @0x0000ffffb36afee8 35ffffa3 cbnz $0x0000ffffb36afedc %w3
+8 L4 @0x0000000000000000 14000000 b $0x0000ffffb36afeec
END 0x0000fff930f730c0
Looks like the hang happens when we have the same store as in the previous bb.
Hi, @derekbruening. Is it possible to have a store without a load in a real application, or is this some DynamoRIO code modification? Maybe you have an idea of where the issue could be? What could I look at? Thx, Kirill
Is it possible to have a store without a load in a real application, or is this some DynamoRIO code modification? Maybe you have an idea of where the issue could be? What could I look at? Thx, Kirill
I would first determine whether this is really a store without a load in the dynamic instruction stream. The static code could have 2 stores sharing one load, something like:
start:
ldxr
cmp
b.ne storeB
storeA:
stxr
jmp done
storeB:
stxr
done:
Just seeing DR build a block for storeB doesn't mean that thread didn't just execute the block for start, which was built a long time ago by another thread.
Maybe the instrace_simple client is the easiest way to answer this, getting a dynamic instruction trace. Or augment DR's mangling (or add a client that does this) to clear a TLS slot after a store, to know whether a load was executed by looking for a non-cleared slot.
Also, this example from above:
+0 L4 @0x0000000000000000 c8dffc43 ldar (%x2)[8byte] -> %x3
+4 m4 @0x0000000000000000 c8dffc43 <label>
+4 L3 @0x0000ffffb48d002c eb00007f subs %x3 %x0 lsl $0x00 -> %xzr
+8 L3 @0x0000ffffb48d0030 54000061 b.ne $0x0000ffffb48d003c
+12 L3 @0x0000ffffb48d0034 c8057c41 stxr %x1 -> (%x2)[8byte] %w5
Looks wrong too: ldar does not acquire a monitor, so the stxr will always fail. It's just like a plain stxr with no exclusive load in that sense. Is there an exclusive load (with an 'x' in the opcode) right before that ldar?
Is there an exclusive load (with an 'x' in the opcode) right before that ldar?
Sure. That is the ilist after we mangled the ldaxr but before the stxr mangling; the original was like this:
COUNT = 120 tid = 70076 load_store = 0 STORE
TAG 0x0000fff930d9e048
+0 L3 @0x0000ffffb48d002c8 c8dffc43 ldaxr (%x2)[8byte] -> %x3
+4 L3 @0x0000ffffb48d002c eb00007f subs %x3 %x0 lsl $0x00 -> %xzr
+8 L3 @0x0000ffffb48d0030 54000061 b.ne $0x0000ffffb48d003c
+12 L3 @0x0000ffffb48d0034 c8057c41 stxr %x1 -> (%x2)[8byte] %w5
+16 L3 @0x0000ffffb48d0038 35ffff85 cbnz $0x0000ffffb48d0028 %w5
+20 L4 @0x0000000000000000 14000000 b $0x0000ffffb48d003c
END 0x0000fff930d9e048
An example when we have a hang:
TAG 0x0000fff91c885950
+0 L3 @0x0000ffff9f189fa0 88027e60 stxr %w0 -> (%x19)[4byte] %w2
+4 L3 @0x0000ffff9f189fa4 35ffff82 cbnz $0x0000ffff9f189f94 %w2
+8 L4 @0x0000000000000000 14000000 b $0x0000ffff9f189fa8
END 0x0000fff91c885950
The original code is pthread_mutex_lock:
(gdb) x /64i (0x0000ffff9f189f70-16)
0xffff9f189f60 <pthread_mutex_lock>: add x3, x0, #0x10
0xffff9f189f64 <pthread_mutex_lock+4>: ldr w2, [x3]
0xffff9f189f68 <pthread_mutex_lock+8>: mov w1, #0x17f // #383
0xffff9f189f6c <pthread_mutex_lock+12>: and w1, w2, w1
0xffff9f189f70 <pthread_mutex_lock+16>: nop
0xffff9f189f74 <pthread_mutex_lock+20>: tst w2, #0x7c
0xffff9f189f78 <pthread_mutex_lock+24>: b.ne 0xffff9f18a038 <pthread_mutex_lock+216> // b.any
0xffff9f189f7c <pthread_mutex_lock+28>: stp x29, x30, [sp, #-32]!
0xffff9f189f80 <pthread_mutex_lock+32>: mov x29, sp
0xffff9f189f84 <pthread_mutex_lock+36>: stp x19, x20, [sp, #16]
0xffff9f189f88 <pthread_mutex_lock+40>: mov x19, x0
0xffff9f189f8c <pthread_mutex_lock+44>: cbnz w1, 0xffff9f189fe0 <pthread_mutex_lock+128>
0xffff9f189f90 <pthread_mutex_lock+48>: mov w0, #0x1 // #1
0xffff9f189f94 <pthread_mutex_lock+52>: ldaxr w1, [x19]
0xffff9f189f98 <pthread_mutex_lock+56>: cmp w1, #0x0
0xffff9f189f9c <pthread_mutex_lock+60>: b.ne 0xffff9f189fa8 <pthread_mutex_lock+72> // b.any
0xffff9f189fa0 <pthread_mutex_lock+64>: stxr w2, w0, [x19]
0xffff9f189fa4 <pthread_mutex_lock+68>: cbnz w2, 0xffff9f189f94 <pthread_mutex_lock+52>
0xffff9f189fa8 <pthread_mutex_lock+72>: b.ne 0xffff9f18a040 <pthread_mutex_lock+224> // b.any
0xffff9f189fac <pthread_mutex_lock+76>: ldr w0, [x19, #8]
0xffff9f189fb0 <pthread_mutex_lock+80>: cbnz w0, 0xffff9f18a180 <pthread_mutex_lock+544>
0xffff9f189fb4 <pthread_mutex_lock+84>: mrs x20, tpidr_el0
0xffff9f189fb8 <pthread_mutex_lock+88>: sub x20, x20, #0x800
0xffff9f189fbc <pthread_mutex_lock+92>: ldr w0, [x19, #12]
0xffff9f189fc0 <pthread_mutex_lock+96>: ldr w1, [x20, #464]
0xffff9f189fc4 <pthread_mutex_lock+100>: add w0, w0, #0x1
0xffff9f189fc8 <pthread_mutex_lock+104>: stp w1, w0, [x19, #8]
0xffff9f189fcc <pthread_mutex_lock+108>: nop
A 'good' bb looks like this (I prohibited branch splitting inside the monitor region):
TAG 0x0000fff916774d08
+0 L3 @0x0000ffff9a4abcb4 52800020 movz $0x0001 lsl $0x00 -> %w0
+4 L3 @0x0000ffff9a4abcb8 885ffe61 ldaxr (%x19)[4byte] -> %w1
+8 L3 @0x0000ffff9a4abcbc 7100003f subs %w1 $0x0000 lsl $0x00 -> %wzr
+12 L3 @0x0000ffff9a4abcc0 54000061 b.ne $0x0000ffff9a4abccc
+16 L3 @0x0000ffff9a4abcc4 88027e60 stxr %w0 -> (%x19)[4byte] %w2
+20 L3 @0x0000ffff9a4abcc8 35ffff82 cbnz $0x0000ffff9a4abcb8 %w2
+24 L4 @0x0000000000000000 14000000 b $0x0000ffff9a4abccc
END 0x0000fff916774d08
I don't follow what the prior comment is trying to say: the pthread_mutex_lock code does not have a branch that skips the ldaxr. It looks like it always executes the ldaxr whenever it executes the stxr. So it should not matter whether the stxr block is separate or not.
+0 L3 @0x0000ffff9a4abcb4 52800020 movz $0x0001 lsl $0x00 -> %w0
+4 L3 @0x0000ffff9a4abcb8 885ffe61 ldaxr (%x19)[4byte] -> %w1
+8 L3 @0x0000ffff9a4abcbc 7100003f subs %w1 $0x0000 lsl $0x00 -> %wzr
+12 L3 @0x0000ffff9a4abcc0 54000061 b.ne $0x0000ffff9a4abccc
The problem is that I don't see the bb with the ldaxr before the stxr; something like this is absent in the logs:
+0 L3 @0x0000ffff9a4abcb4 52800020 movz $0x0001 lsl $0x00 -> %w0
+4 L3 @0x0000ffff9a4abcb8 885ffe61 ldaxr (%x19)[4byte] -> %w1
+8 L3 @0x0000ffff9a4abcbc 7100003f subs %w1 $0x0000 lsl $0x00 -> %wzr
+12 L3 @0x0000ffff9a4abcc0 54000061 b.ne $0x0000ffff9a4abccc
There is no mangling of ldaxr (%x19)[4byte] -> %w1.
The problem is that I don't see the bb with the ldaxr before the stxr; something like this is absent in the logs.
Are you sure it's not in some other thread's log or something? If you could find the branch that skips the ldaxr -- record a dynamic instruction trace or something. Or run without DR and put a breakpoint on both the ldaxr and the stxr and see whether the stxr is ever reached without the ldaxr -- unfortunately there is no LBR access, but that would be a confirmation.
Try to reproduce with debug logs to understand the fragment chain.
Still could not reproduce the hang in debug. I just have a log where we've got a cut fragment with just the store, without the load.
No fragment was linked with F204780(0x0000ffff9eba6e40):
d_r_dispatch: target = 0x0000ffff9eba8f58
Entry into F59790(0x0000ffff9eba8f58).0x0000ffff1b2cf740 (shared)
fcache_enter = 0x0000ffff1abf50c0, target = 0x0000ffff1b2cf73c
Exit from F1801(0x0000ffff9eba66a0).0x0000ffff1ac4875c (shared) (cannot link F1801->F108706) (cannot link shared to private)
d_r_dispatch: target = 0x0000ffff9eba6718
Entry into F108706(0x0000ffff9eba6718).0x0000ffff1b5a3104
fcache_enter = 0x0000ffff1abf50c0, target = 0x0000ffff1b5a3100
Exit from F108706(0x0000ffff9eba6718).0x0000ffff1b5a3130 (cannot link F108706->F26404) (cannot link shared to private)
d_r_dispatch: target = 0x0000ffff9eba673c
Entry into F26404(0x0000ffff9eba673c).0x0000ffff1af018b8 (shared)
fcache_enter = 0x0000ffff1abf50c0, target = 0x0000ffff1af018b4
Exit from F108860(0x0000ffff9eba8f38).0x0000ffff1b5a3334 (cannot link F108860->F59790) (cannot link shared to private)
d_r_dispatch: target = 0x0000ffff9eba8f58
Entry into F59790(0x0000ffff9eba8f58).0x0000ffff1b2cf740 (shared)
fcache_enter = 0x0000ffff1abf50c0, target = 0x0000ffff1b2cf73c
master_signal_handler: thread=47481, sig=12, xsp=0x0000fff91e6c9da0, retaddr=0x000000000000000c
siginfo: sig = 12, pid = 45807, status = 0, errno = 0, si_code = -6
building bb instrlist now *********************
interp: start_pc = 0x0000ffff9eba6e38
check_thread_vm_area: pc = 0x0000ffff9eba6e38
check_thread_vm_area: check_stop = 0x0000ffff9ebcf158
0x0000ffff9eba6e38 52800041 movz $0x0002 lsl $0x00 -> %w1
0x0000ffff9eba6e3c 885ffe60 ldaxr (%x19)[4byte] -> %w0
0x0000ffff9eba6e40 88027e61 stxr %w1 -> (%x19)[4byte] %w2
0x0000ffff9eba6e44 35ffffc2 cbnz $0x0000ffff9eba6e3c %w2
end_pc = 0x0000ffff9eba6e48
setting cur_pc (for fall-through) to 0x0000ffff9eba6e48
forward_eflags_analysis: movz $0x0002 lsl $0x00 -> %w1
instr 0 => 0
forward_eflags_analysis: ldaxr (%x19)[4byte] -> %w0
instr 0 => 0
forward_eflags_analysis: stxr %w1 -> (%x19)[4byte] %w2
instr 0 => 0
Converting exclusive load @0x0000ffff9eba6e3c to regular
Using optimized same-block ldex-stex mangling
Converting exclusive store @0x0000ffff9eba6e40 to compare-and-swap
bb ilist after mangling:
TAG 0x0000ffff9eba6e38
+0 L3 @0x0000ffff9eba6e38 52800041 movz $0x0002 lsl $0x00 -> %w1
+4 L4 @0x0000ffff9eba6e3c 88dffe60 ldar (%x19)[4byte] -> %w0
+8 m4 @0x0000ffff9eba6e3c 88dffe60 <label>
+8 m4 @0x0000ffff9eba6e40 885ffe62 ldaxr (%x19)[4byte] -> %w2
+12 m4 @0x0000ffff9eba6e40 cb206042 sub %x2 %x0 uxtx $0x0000000000000000 -> %x2
+16 m4 @0x0000ffff9eba6e40 b5000002 cbnz @0x0000fff91e6ab9e8[8byte] %x2
+20 L3 @0x0000ffff9eba6e40 88027e61 stxr %w1 -> (%x19)[4byte] %w2
+24 m4 @0x0000ffff9eba6e40 14000000 b @0x0000fff91e6abc80[8byte]
+28 m4 @0x0000ffff9eba6e40 14000000 <label>
+28 m4 @0x0000ffff9eba6e40 d5033f5f clrex $0x000000000000000f
+32 m3 @0x0000ffff9eba6e40 88027e61 stxr %w1 -> (%x19)[4byte] %w2
+36 m4 @0x0000ffff9eba6e40 d5033f5f <label>
+36 L3 @0x0000ffff9eba6e44 35ffffc2 cbnz $0x0000ffff9eba6e3c %w2
+40 L4 @0x0000ffff9eba6e44 14000000 b $0x0000ffff9eba6e48
END 0x0000ffff9eba6e38
done building bb instrlist *********************
building bb instrlist now *********************
interp: start_pc = 0x0000ffff9eba6e38
check_thread_vm_area: pc = 0x0000ffff9eba6e38
check_thread_vm_area: check_stop = 0x0000ffff9ebcf158
0x0000ffff9eba6e38 52800041 movz $0x0002 lsl $0x00 -> %w1
0x0000ffff9eba6e3c 885ffe60 ldaxr (%x19)[4byte] -> %w0
0x0000ffff9eba6e40 88027e61 stxr %w1 -> (%x19)[4byte] %w2
0x0000ffff9eba6e44 35ffffc2 cbnz $0x0000ffff9eba6e3c %w2
end_pc = 0x0000ffff9eba6e48
setting cur_pc (for fall-through) to 0x0000ffff9eba6e48
forward_eflags_analysis: movz $0x0002 lsl $0x00 -> %w1
instr 0 => 0
forward_eflags_analysis: ldaxr (%x19)[4byte] -> %w0
instr 0 => 0
forward_eflags_analysis: stxr %w1 -> (%x19)[4byte] %w2
instr 0 => 0
Converting exclusive load @0x0000ffff9eba6e3c to regular
Using optimized same-block ldex-stex mangling
Converting exclusive store @0x0000ffff9eba6e40 to compare-and-swap
bb ilist after mangling:
TAG 0x0000ffff9eba6e38
+0 L3 @0x0000ffff9eba6e38 52800041 movz $0x0002 lsl $0x00 -> %w1
+4 L4 @0x0000ffff9eba6e3c 88dffe60 ldar (%x19)[4byte] -> %w0
+8 m4 @0x0000ffff9eba6e3c 88dffe60 <label>
+8 m4 @0x0000ffff9eba6e40 885ffe62 ldaxr (%x19)[4byte] -> %w2
+12 m4 @0x0000ffff9eba6e40 cb206042 sub %x2 %x0 uxtx $0x0000000000000000 -> %x2
+16 m4 @0x0000ffff9eba6e40 b5000002 cbnz @0x0000fff91e6ab418[8byte] %x2
+20 L3 @0x0000ffff9eba6e40 88027e61 stxr %w1 -> (%x19)[4byte] %w2
+24 m4 @0x0000ffff9eba6e40 14000000 b @0x0000fff91e6aaca0[8byte]
+28 m4 @0x0000ffff9eba6e40 14000000 <label>
+28 m4 @0x0000ffff9eba6e40 d5033f5f clrex $0x000000000000000f
+32 m3 @0x0000ffff9eba6e40 88027e61 stxr %w1 -> (%x19)[4byte] %w2
+36 m4 @0x0000ffff9eba6e40 d5033f5f <label>
+36 L3 @0x0000ffff9eba6e44 35ffffc2 cbnz $0x0000ffff9eba6e3c %w2
+40 L4 @0x0000ffff9eba6e44 14000000 b $0x0000ffff9eba6e48
END 0x0000ffff9eba6e38
done building bb instrlist *********************
Exit due to proactive reset
d_r_dispatch: target = 0x0000ffff9eba6e40
build_basic_block_fragment !!!!!!!!!!!!!!!!!!
interp: start_pc = 0x0000ffff9eba6e40
check_thread_vm_area: pc = 0x0000ffff9eba6e40
check_thread_vm_area: check_stop = 0x0000ffff9ebcf158
0x0000ffff9eba6e40 88027e61 stxr %w1 -> (%x19)[4byte] %w2
0x0000ffff9eba6e44 35ffffc2 cbnz $0x0000ffff9eba6e3c %w2
end_pc = 0x0000ffff9eba6e48
Converting exclusive store @0x0000ffff9eba6e40 to compare-and-swap
bb ilist after mangling:
TAG 0x0000ffff9eba6e40
+0 m4 @0x0000000000000000 f9000380 str %x0 -> (%x28)[8byte]
+4 m4 @0x0000000000000000 f9405780 ldr +0xa8(%x28)[8byte] -> %x0
+8 m4 @0x0000000000000000 cb206262 sub %x19 %x0 uxtx $0x0000000000000000 -> %x2
+12 m4 @0x0000000000000000 b5000002 cbnz @0x0000fff91e6ab920[8byte] %x2
+16 m4 @0x0000000000000000 f9406380 ldr +0xc0(%x28)[8byte] -> %x0
+20 m4 @0x0000000000000000 d1001002 sub %x0 $0x0000000000000004 lsl $0x0000000000000000 -> %x2
+24 m4 @0x0000000000000000 b5000002 cbnz @0x0000fff91e6ab920[8byte] %x2
+28 m4 @0x0000000000000000 f9405b80 ldr +0xb0(%x28)[8byte] -> %x0
+32 m4 @0x0000000000000000 885ffe62 ldaxr (%x19)[4byte] -> %w2
+36 m4 @0x0000000000000000 cb206042 sub %x2 %x0 uxtx $0x0000000000000000 -> %x2
+40 m4 @0x0000000000000000 b5000002 cbnz @0x0000fff91e6ab920[8byte] %x2
+44 L3 @0x0000ffff9eba6e40 88027e61 stxr %w1 -> (%x19)[4byte] %w2
+48 m4 @0x0000000000000000 14000000 b @0x0000fff91e6ab7a0[8byte]
+52 m4 @0x0000000000000000 14000000 <label>
+52 m4 @0x0000000000000000 d5033f5f clrex $0x000000000000000f
+56 m3 @0x0000ffff9eba6e40 88027e61 stxr %w1 -> (%x19)[4byte] %w2
+60 m4 @0x0000000000000000 d5033f5f <label>
+60 m4 @0x0000000000000000 f9400380 ldr (%x28)[8byte] -> %x0
+64 L3 @0x0000ffff9eba6e44 35ffffc2 cbnz $0x0000ffff9eba6e3c %w2
+68 L4 @0x0000000000000000 14000000 b $0x0000ffff9eba6e48
END 0x0000ffff9eba6e40
linking new fragment F204780(0x0000ffff9eba6e40)
linking incoming links for F204780(0x0000ffff9eba6e40)
linking outgoing links for F204780(0x0000ffff9eba6e40)
linking F204780(0x0000ffff9eba6e40).0x0000ffff1bcac8d4 -> F136578(0x0000ffff9eba6e3c)=0x0000ffff1b755c34
add incoming F204780(0x0000ffff9eba6e40).0x0000ffff1bcac8d4 -> F136578(0x0000ffff9eba6e3c)
linking F204780(0x0000ffff9eba6e40).0x0000ffff1bcac8d8 -> F26394(0x0000ffff9eba6e48)=0x0000ffff1af014fc
add incoming F204780(0x0000ffff9eba6e40).0x0000ffff1bcac8d8 -> F26394(0x0000ffff9eba6e48)
Entry into F204780(0x0000ffff9eba6e40).0x0000ffff1bcac894 (shared)
fcache_enter = 0x0000ffff1abf50c0, target = 0x0000ffff1bcac890
Exit from F1801(0x0000ffff9eba66a0).0x0000ffff1ac4875c (shared) (cannot link F1801->F108706) (cannot link shared to private)
Ran the same workload under gdb with breakpoints in the pthread_mutex_lock monitor region; the breakpoints were hit 1000001 times, and all stores had loads.
Num Type Disp Enb Address What
2 breakpoint keep y 0x0000ffffbf67df94 <pthread_mutex_lock+52>
breakpoint already hit 1000001 times
3 breakpoint keep y 0x0000ffffbf67dfa0 <pthread_mutex_lock+64>
breakpoint already hit 1000001 times
Not reproducing in debug reminds me of some tests that are hanging in release on AArch64 but never hang in debug: #4928, e.g. We were going to try to figure that out soon; maybe we can get lucky and it will be the same underlying problem as here.
For your logs in https://github.com/DynamoRIO/dynamorio/issues/3733#issuecomment-1041661068 the explanation is this line:
Exit due to proactive reset
So DR suspended a thread in between the ldaxr and stxr and redirected it to start executing at a new block that tail-duplicates the original. So dynamically there was a ldaxr before the stxr; DR just made a split block for the suspend-and-relocate.
Not reproducing in debug reminds me of some tests that are hanging in release on AArch64 but never hang in debug: #4928, e.g. We were going to try to figure that out soon; maybe we can get lucky and it will be the same underlying problem as here.
Hi, @derekbruening. Is it this patch: https://github.com/DynamoRIO/dynamorio/pull/5367/commits/3c846dafa2cf5a7a0d7cf4c7fa88a6e097c7016e?
Yes PR #5367 fixes one hang we found that reproduced in release build but not debug (just b/c of timing). There are more though: drcachesim online (#4928) and there are some code-inspection issues #2502. Still, it is worth trying with the PR #5367 patch that was just merged to see if that helps these Java apps.
Removed all my workarounds (the prohibition on splitting inside the monitor region and so on) and applied this patch. Had one hang in 2000 runs; previously it was about 2-3 hangs per 100 runs. Kirill
Sounds like progress. There's also PR #5370 and PR #5375.
Sounds like progress. There's also PR #5370 and PR #5375.
These patches didn't help further; the hang frequency is the same.
So DR suspended a thread in between the ldaxr and stxr and redirected it to start executing at a new block that tail-duplicates the original. So dynamically there was a ldaxr before the stxr; DR just made a split block for the suspend-and-relocate.
Hi. A question still haunts me here. If that were so and we just suspended the thread between the load and the store, we should have the same counters in the rstats statistics, but they are different in many runs:
Load-exclusive instrs converted to CAS : 56721
Store-exclusive instrs converted to CAS : 56686
Kirill
Hi @derekbruening. We have a SIGSEGV crash case on AArch64 again.
Java report:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000ffff93f12c4c, pid=630238, tid=0x0000fff1010621e0
#
# JRE version: OpenJDK Runtime Environment (8.0) (build 1.8.0-internal)
# Java VM: OpenJDK 64-Bit Server VM (25.71-b00 mixed mode linux-aarch64 )
# Problematic frame:
# V [libjvm.so+0x562c4c] PhaseChaitin::build_ifg_physical(ResourceArea*)+0x42c
#
# Failed to write core dump..
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
Crash context:
Registers:
R0=0x0000fff08c1b10b0
R1=0x0000fff08c1a2f40
R2=0xffffffff945d8290
R3=0x0000000000000007
R4=0x0000000000000002
R5=0x0000000000000000
R6=0x0000ffff946add20
R7=0x0000000000000000
R8=0x0000fff1010628e0
R9=0x0000000000000000
R10=0x00000000ffffffff
R11=0x0000000000000000
R12=0x0000000000000000
R13=0x0000000000000000
R14=0x0000000000000000
R15=0x0000000000000000
R16=0x0000000000000000
R17=0x0000000000000000
R18=0x0000000000000000
R19=0x0000fff08c1b0f90
R20=0x00000000000f423f
R21=0x0000fff08c03efb0
R22=0x0000fff10105efb0
R23=0x0000000000000018
R24=0x0000000000000068
R25=0x0000000000000000
R26=0x0000fff10105efb0
R27=0x0000ffff946add20
R28=0x0000000000000001
R29=0x0000fff10105e9d0
R30=0x0000ffff93f12b8c
The same SIGSEGV in the DynamoRIO logs:
computing memory target for 0x0000ffff115a5e8c causing SIGSEGV, kernel claims it is 0x0000ffeedd903980
compute_memory_target: falling back to racy protection checks
opnd_compute_address for: (%x1,%x2,lsl #2)
base => 0x0000fff08c1a2f40
index,scale => 0x0000ffeedd903980
disp => 0x0000ffeedd903980
For SIGSEGV at cache pc 0x0000ffff115a5e8c, computed target read 0x0000ffeedd903980
faulting instr: ldr (%x1,%x2,lsl #2)[4byte] -> %w25
** Received SIGSEGV at cache pc 0x0000ffff115a5e8c in thread 630480
The register context is the same as Java reported:
$10 = {uc_flags = 0x0, uc_link = 0x0, uc_stack = {ss_sp = 0x0, ss_flags = 0x2, ss_size = 0x0}, uc_sigmask = {__val = {0x4, 0xabababababababab <repeats 15 times>}},
uc_mcontext = {fault_address = 0xffeedd903980, regs = {0xfff08c1b10b0, 0xfff08c1a2f40, 0xffffffff945d8290, 0x7, 0x2, 0x0, 0xffff946add20, 0x0, 0xfff1010628e0, 0x0,
0xffffffff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xfff08c1b0f90, 0xf423f, 0xfff08c03efb0, 0xfff10105efb0, 0x18, 0x68, 0x0, 0xfff10105efb0, 0xffff946add20, 0x1,
0xfff10105e9d0, 0xffff93f12b8c}, sp = 0xfff10105e9d0, pc = 0xffff93f12c4c, pstate = 0x80000000, __reserved = {0x1, 0x80, 0x50, 0x46, 0x10, 0x2, 0x0, 0x0, 0x10, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x40, 0x6e, 0xe9, 0xea, 0x3f, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x68, 0x44, 0x55, 0x1c, 0xe6, 0x93, 0x12, 0x40,
0x0 <repeats 31 times>, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x88, 0xf2, 0x1a, 0x8c, 0xf0, 0xff, 0x0, 0x0, 0x80, 0x4d, 0x1, 0x8c, 0xf0, 0xff, 0x0, 0x0, 0x10,
0x4d, 0x1, 0x8c, 0xf0, 0xff, 0x0, 0x0, 0xa8, 0x4d, 0x29, 0x8c, 0xf0, 0xff, 0x0, 0x0, 0xb8, 0x39, 0x5, 0x80, 0xf0, 0xff, 0x0, 0x0, 0x38, 0x37, 0x5, 0x80, 0xf0, 0xff,
0x0, 0x0, 0x2, 0x8, 0x20, 0x80, 0x2, 0x8, 0x20, 0x80, 0x2, 0x8, 0x20, 0x80, 0x2, 0x8, 0x20, 0x80, 0x0, 0x0, 0x0, 0x40, 0x6e, 0xe9, 0xea, 0x3f,
0x0 <repeats 120 times>, 0x1, 0x4, 0x10, 0x40, 0x1, 0x4, 0x10, 0x40, 0x1, 0x4, 0x10, 0x40, 0x1, 0x4, 0x10, 0x40, 0x10, 0x0, 0xaa, 0xaa, 0x41, 0x0, 0x0, 0x10, 0x1,
0x40, 0x0, 0x0, 0x0, 0x0, 0x0, 0x10, 0x1, 0x0, 0x0, 0x40, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x10, 0x0, 0x0, 0x3, 0x0 <repeats 14 times>...}}}
The basic block and the bb after mangling:
d_r_dispatch: target = 0x0000ffff93f12c3c
interp: start_pc = 0x0000ffff93f12c3c
check_thread_vm_area: pc = 0x0000ffff93f12c3c
check_thread_vm_area: check_stop = 0x0000ffff946d6408
0x0000ffff93f12c3c f9400660 ldr +0x08(%x19)[8byte] -> %x0
0x0000ffff93f12c40 f94096c1 ldr +0x0128(%x22)[8byte] -> %x1
0x0000ffff93f12c44 f87c5800 ldr (%x0,%w28,uxtw #3)[8byte] -> %x0
0x0000ffff93f12c48 b9802802 ldrsw +0x28(%x0)[4byte] -> %x2
0x0000ffff93f12c4c b8627839 ldr (%x1,%x2,lsl #2)[4byte] -> %w25 <<<============ CRASH
0x0000ffff93f12c50 34ffff19 cbz $0x0000ffff93f12c30 %w25
end_pc = 0x0000ffff93f12c54
skip save stolen reg app value for: ldr (%x0,%w28,uxtw #3)[8byte] -> %x0
bb ilist after mangling:
TAG 0x0000ffff93f12c3c
+0 L3 @0x0000fff9116d9538 f9400660 ldr +0x08(%x19)[8byte] -> %x0
+4 L3 @0x0000fff91150c4d8 f94096c1 ldr +0x0128(%x22)[8byte] -> %x1
+8 m4 @0x0000fff9116d9128 f9000781 str %x1 -> +0x08(%x28)[8byte]
+12 m4 @0x0000fff9116d9438 aa1c03e1 orr %xzr %x28 lsl $0x0000000000000000 -> %x1
+16 m4 @0x0000fff91150cdc0 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
+20 L3 @0x0000fff91150c290 f87c5800 ldr (%x0,%w28,uxtw #3)[8byte] -> %x0
+24 m4 @0x0000fff9116d6458 aa0103fc orr %xzr %x1 lsl $0x0000000000000000 -> %x28
+28 m4 @0x0000fff9116d6060 f9400781 ldr +0x08(%x28)[8byte] -> %x1
+32 L3 @0x0000fff9116d9028 b9802802 ldrsw +0x28(%x0)[4byte] -> %x2
+36 L3 @0x0000fff91150c930 b8627839 ldr (%x1,%x2,lsl #2)[4byte] -> %w25 <<<============ CRASH
+40 L3 @0x0000fff91150cf08 34ffff19 cbz $0x0000ffff93f12c30 %w25
+44 L4 @0x0000fff9116d91f0 14000000 b $0x0000ffff93f12c54
END 0x0000ffff93f12c3c
Look at the crashing instruction:
+36 L3 @0x0000fff91150c930 b8627839 ldr (%x1,%x2,lsl #2)[4byte] -> %w25 <<<============ CRASH
We got fault_address=0xffeedd903980. Using the register context, x2=0xffffffff945d8290 and x1=0xfff08c1a2f40:
(gdb) p /x 0xffffffff945d8290<<2
$11 = 0xfffffffe51760a40
(gdb) p /x (0xfff08c1a2f40+0xfffffffe51760a40)
$12 = 0xffeedd903980
BUT let's look at the previous instruction
0x0000ffff93f12c48 b9802802 ldrsw +0x28(%x0)[4byte] -> %x2
x0=0xfff08c1b10b0
(gdb) x /gx (0xfff08c1b10b0+0x28)
0xfff08c1b10d8: 0x0000ffff945d8290
So, the ldrsw instruction should set x2=0x945d8290; it should not be x2=0xffffffff945d8290. The CRASH instruction would compute a valid address with x2=0x945d8290:
(gdb) p /x 0x945d8290<<2
$16 = 0x51760a40
(gdb) p /x (0xfff08c1a2f40+0x51760a40)
$17 = 0xfff0dd903980
(gdb) x /gx (0xfff08c1a2f40+0x51760a40)
0xfff0dd903980: 0x0000000000000000
Does DynamoRIO do some internal work here? What could be wrong? I could not figure out why the x2 register is incorrect. Thanks, Kirill
Correcting my earlier claim that "the ldrsw instruction must set x2=0x945d8290, not x2=0xffffffff945d8290": oh, ldrsw is signed, so x2 can legitimately be 0xffffffff945d8290. Continuing to investigate what could be wrong here.
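The arithmetic can be double-checked with a short Python sketch (all register and address values below are taken from the log):

```python
MASK64 = (1 << 64) - 1

def ldrsw_result(word32):
    """LDRSW loads 32 bits and sign-extends the result to 64 bits."""
    word32 &= 0xFFFFFFFF
    if word32 & 0x80000000:
        word32 |= 0xFFFFFFFF00000000
    return word32

# The value loaded from [x0 + 0x28] was 0x945d8290 (top bit set):
x2 = ldrsw_result(0x945D8290)
assert x2 == 0xFFFFFFFF945D8290       # sign-extended, as the log shows

# Faulting access: ldr w25, [x1, x2, lsl #2]
x1 = 0x0000FFF08C1A2F40
addr = (x1 + ((x2 << 2) & MASK64)) & MASK64
assert addr == 0x0000FFEEDD903980     # the kernel-reported fault address
```

So the faulting address is exactly what the architecture specifies for a sign-extended LDRSW result; the sign extension itself is not the bug.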
Hi @derekbruening. Could you help me understand the crash? We got a synchronization (suspend) signal on the thread:
main_signal_handler: thread=1588266, sig=12, xsp=0x0000fff923c94da0, retaddr=0x000000000000000c
siginfo: sig = 12, pid = 1587929, status = 0, errno = 0, si_code = -6
x0 = 0x0000000000000000
x1 = 0x0000fff923c6e000
x2 = 0x000000000000000c
x3 = 0x0000000000000030
x4 = 0x000000000000005c
x5 = 0x0000000000003c05
x6 = 0x0000fff09015a9b8
x7 = 0xfefeff6f6071735e
x8 = 0x7f7f7f7f7f7f7f7f
x9 = 0x0000000000000000
x10 = 0x0101010101010101
x11 = 0x0000000000000028
x12 = 0x0000a701409d1276
x13 = 0x0000000000000040
x14 = 0x000000000000003f
x15 = 0x0000000000000000
x16 = 0x0000ffffa651dc00
x17 = 0x0000ffffa6bc4080
x18 = 0x0000000000000000
x19 = 0x0000000000000030
x20 = 0x0000fff090484208
x21 = 0x0000fff0901b94f8
x22 = 0x0000ffffa6600340
x23 = 0x0000000000000001
x24 = 0x0000000000000021
x25 = 0x0000fff09045a8f8
x26 = 0x0000000000000021
x27 = 0x0000000000000108
x28 = 0x0000fff923c6e000
x29 = 0x0000fff106572880
x30 = 0x0000ffffa635068c
sp = 0x0000fff106572880
pc = 0x0000ffff238c68c8
pstate = 0x0000000020000000
pc is 0x0000ffff238c68c8, and the code cache for the bb looks like:
(gdb) x /16i (0x0000ffff238c68c8-48)
0xffff238c6898: ldr x0, [x25, #8]
0xffff238c689c: str x0, [x28]
0xffff238c68a0: mov x0, x28
0xffff238c68a4: ldr x28, [x28, #48]
0xffff238c68a8: lsl x27, x28, #3
0xffff238c68ac: mov x28, x0
0xffff238c68b0: ldr x0, [x28]
0xffff238c68b4: str x1, [x28, #8]
0xffff238c68b8: mov x1, x28
0xffff238c68bc: ldr x28, [x28, #48]
0xffff238c68c0: ldr x0, [x0, x28, lsl #3]
0xffff238c68c4: mov x28, x1
==> 0xffff238c68c8: ldr x1, [x28, #8] <==
0xffff238c68cc: cmp x20, x0
0xffff238c68d0: b.eq 0xffff238c6de8 // b.none
0xffff238c68d4: b 0xffff238c6a68
The clean bb and the bb after mangling:
interp: start_pc = 0x0000ffffa6350424
check_thread_vm_area: pc = 0x0000ffffa6350424
check_thread_vm_area: check_stop = 0x0000ffffa6b02888
0x0000ffffa6350424 f9400720 ldr +0x08(%x25)[8byte] -> %x0
0x0000ffffa6350428 d37df39b ubfm %x28 $0x3d $0x3c -> %x27
0x0000ffffa635042c f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0
0x0000ffffa6350430 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
0x0000ffffa6350434 54000340 b.eq $0x0000ffffa635049c
end_pc = 0x0000ffffa6350438
bb ilist after mangling:
TAG 0x0000ffffa6350424
+0 L3 @0x0000fff923eafda0 f9400720 ldr +0x08(%x25)[8byte] -> %x0
+4 m4 @0x0000fff923eb1110 f9000380 str %x0 -> (%x28)[8byte]
+8 m4 @0x0000fff923eb1df0 aa1c03e0 orr %xzr %x28 lsl $0x0000000000000000 -> %x0
+12 m4 @0x0000fff923eb1358 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
+16 L3 @0x0000fff923eb0430 d37df39b ubfm %x28 $0x3d $0x3c -> %x27
+20 m4 @0x0000fff923eb1090 aa0003fc orr %xzr %x0 lsl $0x0000000000000000 -> %x28
+24 m4 @0x0000fff923eae950 f9400380 ldr (%x28)[8byte] -> %x0
+28 m4 @0x0000fff923eaf438 f9000781 str %x1 -> +0x08(%x28)[8byte]
+32 m4 @0x0000fff923eae9d0 aa1c03e1 orr %xzr %x28 lsl $0x0000000000000000 -> %x1
+36 m4 @0x0000fff923eaeb18 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
+40 L3 @0x0000fff923eaf0a8 f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0
+44 m4 @0x0000fff923eafd20 aa0103fc orr %xzr %x1 lsl $0x0000000000000000 -> %x28
==> +48 m4 @0x0000fff923eb00e8 f9400781 ldr +0x08(%x28)[8byte] -> %x1 <==
+52 L3 @0x0000fff923eb12d8 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
+56 L3 @0x0000fff923eb20b8 54000340 b.eq $0x0000ffffa635049c
+60 L4 @0x0000fff923eb1e70 14000000 b $0x0000ffffa6350438
END 0x0000ffffa6350424
So, pc 0x0000ffff238c68c8 is the mangling (m4) instruction ldr +0x08(%x28)[8byte] -> %x1. When the thread was woken, the dispatcher set the target to 0x0000ffffa635042c:
handle_suspend_signal: awake now
main_signal_handler 12 returning now to 0x0000ffff22d11454
Exit due to proactive reset
d_r_dispatch: target = 0x0000ffffa635042c
Building new bb
interp: start_pc = 0x0000ffffa635042c
check_thread_vm_area: pc = 0x0000ffffa635042c
check_thread_vm_area: check_stop = 0x0000ffffa6b02888
==> 0x0000ffffa635042c f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0 <==
0x0000ffffa6350430 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
0x0000ffffa6350434 54000340 b.eq $0x0000ffffa635049c
end_pc = 0x0000ffffa6350438
bb ilist after mangling:
TAG 0x0000ffffa635042c
+0 m4 @0x0000fff923eb1110 f9000781 str %x1 -> +0x08(%x28)[8byte]
+4 m4 @0x0000fff923eb12d8 aa1c03e1 orr %xzr %x28 lsl $0x0000000000000000 -> %x1
+8 m4 @0x0000fff923eb1df0 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
==> +12 L3 @0x0000fff923eaf0a8 f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0 <==
+16 m4 @0x0000fff923eb1358 aa0103fc orr %xzr %x1 lsl $0x0000000000000000 -> %x28
*** +20 m4 @0x0000fff923eb0430 f9400781 ldr +0x08(%x28)[8byte] -> %x1 ***
+24 L3 @0x0000fff923eafd20 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
+28 L3 @0x0000fff923eb00e8 54000340 b.eq $0x0000ffffa635049c
+32 L4 @0x0000fff923eafda0 14000000 b $0x0000ffffa6350438
END 0x0000ffffa635042c
It looks like we went back to the first original instruction, ldr (%x0,%x28,lsl #3)[8byte] -> %x0, which precedes our mangling instruction ldr +0x08(%x28)[8byte] -> %x1, but the register context was probably not restored and the x0 register holds an incorrect value.
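What the logs suggest can be modeled with a small Python sketch. The TLS spill offsets (+0x08, +0x30), the app x28 value (0x21), the loaded value (0), and the resulting fault address (0x108) come from the logs; the TLS and TABLE base addresses and the x1 value are invented for illustration.

```python
SPILL_X1 = 0x08   # str x1 -> [x28, #8]   (spill slot for app x1)
APP_X28  = 0x30   # ldr x28 <- [x28, #48] (slot holding the app value of x28)

mem = {}                       # sparse "memory": unmapped reads raise KeyError
TLS = 0x1000                   # hypothetical DR TLS base kept in stolen reg x28
APP_X28_VAL = 0x21             # app value of x28, from the log
TABLE = 0x2000                 # hypothetical app table base held in x0
mem[TLS + APP_X28] = APP_X28_VAL
mem[TABLE + (APP_X28_VAL << 3)] = 0x0   # the app load returns 0, as in the log

def mangled_block(r):
    """One line per cache instruction, up to the point of the suspend."""
    mem[r["x28"] + SPILL_X1] = r["x1"]        # str x1, [x28, #8]
    r["x1"] = r["x28"]                        # mov x1, x28
    r["x28"] = mem[r["x28"] + APP_X28]        # ldr x28, [x28, #48]
    r["x0"] = mem[r["x0"] + (r["x28"] << 3)]  # ldr x0, [x0, x28, lsl #3]
    # suspend signal arrives here, inside the mangling epilogue

regs = {"x0": TABLE, "x1": 0xAAAA, "x28": TLS}
mangled_block(regs)
assert regs["x0"] == 0          # x0 no longer holds the table base

# Translation restarts at the app instruction with x28 restored to the app
# value (0x21) but x0 left clobbered, so the recomputed address is:
retry_addr = regs["x0"] + (APP_X28_VAL << 3)
assert retry_addr == 0x108      # the kernel-reported fault address
assert retry_addr not in mem    # unmapped: models the observed SIGSEGV
```

In other words, re-executing the app load after it already ran once reproduces the crash context exactly (x0=0, fault address 0x108).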
crash signal context
main_signal_handler: thread=1588266, sig=11, xsp=0x0000fff923c94da0, retaddr=0x000000000000000b
siginfo: sig = 11, pid = 264, status = 0, errno = 0, si_code = 1
x0 = 0x0000000000000000
x1 = 0x0000fff923c6e000
x2 = 0x000000000000000c
x3 = 0x0000000000000030
x4 = 0x000000000000005c
x5 = 0x0000000000003c05
x6 = 0x0000fff09015a9b8
x7 = 0xfefeff6f6071735e
x8 = 0x7f7f7f7f7f7f7f7f
x9 = 0x0000000000000000
x10 = 0x0101010101010101
x11 = 0x0000000000000028
x12 = 0x0000a701409d1276
x13 = 0x0000000000000040
x14 = 0x000000000000003f
x15 = 0x0000000000000000
x16 = 0x0000ffffa651dc00
x17 = 0x0000ffffa6bc4080
x18 = 0x0000000000000000
x19 = 0x0000000000000030
x20 = 0x0000fff090484208
x21 = 0x0000fff0901b94f8
x22 = 0x0000ffffa6600340
x23 = 0x0000000000000001
x24 = 0x0000000000000021
x25 = 0x0000fff09045a8f8
x26 = 0x0000000000000021
x27 = 0x0000000000000108
x28 = 0x0000000000000021
x29 = 0x0000fff106572880
x30 = 0x0000ffffa635068c
sp = 0x0000fff106572880
pc = 0x0000ffff2417046c
pstate = 0x0000000020000000
computing memory target for 0x0000ffff2417046c causing SIGSEGV, kernel claims it is 0x0000000000000108
compute_memory_target: falling back to racy protection checks
opnd_compute_address for: (%x0,%x28,lsl #3)
base => 0x0000000000000000
index,scale => 0x0000000000000108
disp => 0x0000000000000108
For SIGSEGV at cache pc 0x0000ffff2417046c, computed target read 0x0000000000000108
faulting instr: ldr (%x0,%x28,lsl #3)[8byte] -> %x0
** Received SIGSEGV at cache pc 0x0000ffff2417046c in thread 1588266
record_pending_signal(11) from cache pc 0x0000ffff2417046c
not certain can delay so handling now
action is not SIG_IGN
(gdb) x /9i (0x0000ffff2417046c-12)
0xffff24170460: str x1, [x28, #8]
0xffff24170464: mov x1, x28
0xffff24170468: ldr x28, [x28, #48]
0xffff2417046c: ldr x0, [x0, x28, lsl #3]
0xffff24170470: mov x28, x1
0xffff24170474: ldr x1, [x28, #8]
0xffff24170478: cmp x20, x0
0xffff2417047c: b.eq 0xffff24170484 // b.none
0xffff24170480: b 0xffff238c6a68
Do I understand correctly the following: ldr x0, [x0, x28, lsl #3] executes and changes x0; we are suspended at the epilogue instruction ldr x1, [x28, #8]; translation takes us back to ldr x0, [x0, x28, lsl #3] but does not restore the context, so we execute ldr x0, [x0, x28, lsl #3] a second time with an incorrect register context. Is it possible that we could not restore the context, or am I wrong here? Thanks, Kirill
You would expect this to be marked as a mangling epilogue. Translation in a mangling epilogue is supposed to target the next instruction and "emulate" the rest of the epilogue, as it is sometimes impossible to undo the app instr and thus returning the being-mangled instr PC for restart is not feasible. This makes it look like that is not done correctly for stolen register mangling on AArch64. I would suggest filing a separate issue to focus on this.
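A rough sketch of that translation rule, using a simplified metadata structure (all names, PCs, and TLS values here are hypothetical, not DynamoRIO's actual translation API):

```python
from dataclasses import dataclass, field

@dataclass
class XlEntry:
    """Hypothetical per-cache-pc translation metadata."""
    app_pc: int                     # app PC of the instr being mangled
    next_app_pc: int                # app PC of the following instr
    in_mangling_epilogue: bool = False
    remaining_epilogue: list = field(default_factory=list)  # state updates

def translate_pc(entry, ctx):
    if entry.in_mangling_epilogue:
        # The app instr already ran and clobbered state, so we cannot
        # restart it: emulate the rest of the epilogue and resume after it.
        for apply_update in entry.remaining_epilogue:
            apply_update(ctx)
        return entry.next_app_pc
    return entry.app_pc             # normal case: restart this app instr

# The epilogue in the logs must recover the spilled app x1 and the app
# value of the stolen register x28 from DR's TLS slots (values assumed):
tls = {0x08: 0xAAAA, 0x30: 0x21}
ctx = {"x1": 0x1000, "x28": 0x1000}  # raw state: both still hold the TLS base
entry = XlEntry(app_pc=0xFFFFA635042C, next_app_pc=0xFFFFA6350430,
                in_mangling_epilogue=True,
                remaining_epilogue=[
                    lambda c: c.__setitem__("x1", tls[0x08]),
                    lambda c: c.__setitem__("x28", tls[0x30]),
                ])
assert translate_pc(entry, ctx) == 0xFFFFA6350430
assert ctx["x1"] == 0xAAAA and ctx["x28"] == 0x21
```

The key design point is that the reported PC is the next app instruction, never the partially-completed one, which is exactly what appears to go wrong in the logs above.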
I would suggest filing a separate issue to focus on this. ok - #5426
It would be great to have some workaround here; this crash reproduces too often across the pool of our JVM workloads. Kirill
Hi, @derekbruening.
One more question here.
Between "handle_suspend_signal: suspended now" and "handle_suspend_signal: awake now", DynamoRIO calls the recreate_bb_ilist procedure twice and recreates the original signaled basic block. Why does it build it? It looks like the result is not used after the signal, because DynamoRIO then builds a truncated bb starting from the last original app instruction.
Thanks, Kirill
log example
main_signal_handler: thread=1588266, sig=12, xsp=0x0000fff923c94da0, retaddr=0x000000000000000c
siginfo: sig = 12, pid = 1587929, status = 0, errno = 0, si_code = -6
x0 = 0x0000000000000000
x1 = 0x0000fff923c6e000
x2 = 0x000000000000000c
x3 = 0x0000000000000030
x4 = 0x000000000000005c
x5 = 0x0000000000003c05
x6 = 0x0000fff09015a9b8
x7 = 0xfefeff6f6071735e
x8 = 0x7f7f7f7f7f7f7f7f
x9 = 0x0000000000000000
x10 = 0x0101010101010101
x11 = 0x0000000000000028
x12 = 0x0000a701409d1276
x13 = 0x0000000000000040
x14 = 0x000000000000003f
x15 = 0x0000000000000000
x16 = 0x0000ffffa651dc00
x17 = 0x0000ffffa6bc4080
x18 = 0x0000000000000000
x19 = 0x0000000000000030
x20 = 0x0000fff090484208
x21 = 0x0000fff0901b94f8
x22 = 0x0000ffffa6600340
x23 = 0x0000000000000001
x24 = 0x0000000000000021
x25 = 0x0000fff09045a8f8
x26 = 0x0000000000000021
x27 = 0x0000000000000108
x28 = 0x0000fff923c6e000
x29 = 0x0000fff106572880
x30 = 0x0000ffffa635068c
sp = 0x0000fff106572880
pc = 0x0000ffff238c68c8
pstate = 0x0000000020000000
dcontext next tag = 0x0000ffff240d3d8c
handle_suspend_signal: suspended now
building bb instrlist now *********************
interp: start_pc = 0x0000ffffa6350424
check_thread_vm_area: pc = 0x0000ffffa6350424
check_thread_vm_area: check_stop = 0x0000ffffa6b02888
0x0000ffffa6350424 f9400720 ldr +0x08(%x25)[8byte] -> %x0
0x0000ffffa6350428 d37df39b ubfm %x28 $0x3d $0x3c -> %x27
0x0000ffffa635042c f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0
0x0000ffffa6350430 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
0x0000ffffa6350434 54000340 b.eq $0x0000ffffa635049c
end_pc = 0x0000ffffa6350438
setting cur_pc (for fall-through) to 0x0000ffffa6350438
forward_eflags_analysis: ldr +0x08(%x25)[8byte] -> %x0
instr 0 => 0
forward_eflags_analysis: ubfm %x28 $0x3d $0x3c -> %x27
instr 0 => 0
forward_eflags_analysis: ldr (%x0,%x28,lsl #3)[8byte] -> %x0
instr 0 => 0
forward_eflags_analysis: subs %x20 %x0 lsl $0x00 -> %xzr
instr 3c0 => 0
skip save stolen reg app value for: ubfm %x28 $0x3d $0x3c -> %x27
skip save stolen reg app value for: ldr (%x0,%x28,lsl #3)[8byte] -> %x0
bb ilist after mangling:
TAG 0x0000ffffa6350424
+0 L3 @0x0000fff923eafda0 f9400720 ldr +0x08(%x25)[8byte] -> %x0
+4 m4 @0x0000fff923eb1110 f9000380 str %x0 -> (%x28)[8byte]
+8 m4 @0x0000fff923eb1df0 aa1c03e0 orr %xzr %x28 lsl $0x0000000000000000 -> %x0
+12 m4 @0x0000fff923eb1358 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
+16 L3 @0x0000fff923eb0430 d37df39b ubfm %x28 $0x3d $0x3c -> %x27
+20 m4 @0x0000fff923eb1090 aa0003fc orr %xzr %x0 lsl $0x0000000000000000 -> %x28
+24 m4 @0x0000fff923eae950 f9400380 ldr (%x28)[8byte] -> %x0
+28 m4 @0x0000fff923eaf438 f9000781 str %x1 -> +0x08(%x28)[8byte]
+32 m4 @0x0000fff923eae9d0 aa1c03e1 orr %xzr %x28 lsl $0x0000000000000000 -> %x1
+36 m4 @0x0000fff923eaeb18 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
+40 L3 @0x0000fff923eaf0a8 f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0
+44 m4 @0x0000fff923eafd20 aa0103fc orr %xzr %x1 lsl $0x0000000000000000 -> %x28
+48 m4 @0x0000fff923eb00e8 f9400781 ldr +0x08(%x28)[8byte] -> %x1
+52 L3 @0x0000fff923eb12d8 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
+56 L3 @0x0000fff923eb20b8 54000340 b.eq $0x0000ffffa635049c
+60 L4 @0x0000fff923eb1e70 14000000 b $0x0000ffffa6350438
END 0x0000ffffa6350424
done building bb instrlist *********************
building bb instrlist now *********************
interp: start_pc = 0x0000ffffa6350424
check_thread_vm_area: pc = 0x0000ffffa6350424
check_thread_vm_area: check_stop = 0x0000ffffa6b02888
0x0000ffffa6350424 f9400720 ldr +0x08(%x25)[8byte] -> %x0
0x0000ffffa6350428 d37df39b ubfm %x28 $0x3d $0x3c -> %x27
0x0000ffffa635042c f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0
0x0000ffffa6350430 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
0x0000ffffa6350434 54000340 b.eq $0x0000ffffa635049c
end_pc = 0x0000ffffa6350438
setting cur_pc (for fall-through) to 0x0000ffffa6350438
forward_eflags_analysis: ldr +0x08(%x25)[8byte] -> %x0
instr 0 => 0
forward_eflags_analysis: ubfm %x28 $0x3d $0x3c -> %x27
instr 0 => 0
forward_eflags_analysis: ldr (%x0,%x28,lsl #3)[8byte] -> %x0
instr 0 => 0
forward_eflags_analysis: subs %x20 %x0 lsl $0x00 -> %xzr
instr 3c0 => 0
skip save stolen reg app value for: ubfm %x28 $0x3d $0x3c -> %x27
skip save stolen reg app value for: ldr (%x0,%x28,lsl #3)[8byte] -> %x0
bb ilist after mangling:
TAG 0x0000ffffa6350424
+0 L3 @0x0000fff923eb1e70 f9400720 ldr +0x08(%x25)[8byte] -> %x0
+4 m4 @0x0000fff923eaeb18 f9000380 str %x0 -> (%x28)[8byte]
+8 m4 @0x0000fff923eae9d0 aa1c03e0 orr %xzr %x28 lsl $0x0000000000000000 -> %x0
+12 m4 @0x0000fff923eaf438 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
+16 L3 @0x0000fff923eb20b8 d37df39b ubfm %x28 $0x3d $0x3c -> %x27
+20 m4 @0x0000fff923eae950 aa0003fc orr %xzr %x0 lsl $0x0000000000000000 -> %x28
+24 m4 @0x0000fff923eb1090 f9400380 ldr (%x28)[8byte] -> %x0
+28 m4 @0x0000fff923eb0430 f9000781 str %x1 -> +0x08(%x28)[8byte]
+32 m4 @0x0000fff923eb1358 aa1c03e1 orr %xzr %x28 lsl $0x0000000000000000 -> %x1
+36 m4 @0x0000fff923eb1df0 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
+40 L3 @0x0000fff923eb12d8 f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0
+44 m4 @0x0000fff923eb1110 aa0103fc orr %xzr %x1 lsl $0x0000000000000000 -> %x28
+48 m4 @0x0000fff923eafda0 f9400781 ldr +0x08(%x28)[8byte] -> %x1
+52 L3 @0x0000fff923eb00e8 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
+56 L3 @0x0000fff923eafd20 54000340 b.eq $0x0000ffffa635049c
+60 L4 @0x0000fff923eaf0a8 14000000 b $0x0000ffffa6350438
END 0x0000ffffa6350424
done building bb instrlist *********************
handle_suspend_signal: awake now
main_signal_handler 12 returning now to 0x0000ffff22d11454
Exit due to proactive reset
d_r_dispatch: target = 0x0000ffffa635042c
interp: start_pc = 0x0000ffffa635042c
check_thread_vm_area: pc = 0x0000ffffa635042c
check_thread_vm_area: check_stop = 0x0000ffffa6b02888
0x0000ffffa635042c f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0
0x0000ffffa6350430 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
0x0000ffffa6350434 54000340 b.eq $0x0000ffffa635049c
end_pc = 0x0000ffffa6350438
skip save stolen reg app value for: ldr (%x0,%x28,lsl #3)[8byte] -> %x0
bb ilist after mangling:
TAG 0x0000ffffa635042c
+0 m4 @0x0000fff923eb1110 f9000781 str %x1 -> +0x08(%x28)[8byte]
+4 m4 @0x0000fff923eb12d8 aa1c03e1 orr %xzr %x28 lsl $0x0000000000000000 -> %x1
+8 m4 @0x0000fff923eb1df0 f9401b9c ldr +0x30(%x28)[8byte] -> %x28
+12 L3 @0x0000fff923eaf0a8 f87c7800 ldr (%x0,%x28,lsl #3)[8byte] -> %x0
+16 m4 @0x0000fff923eb1358 aa0103fc orr %xzr %x1 lsl $0x0000000000000000 -> %x28
+20 m4 @0x0000fff923eb0430 f9400781 ldr +0x08(%x28)[8byte] -> %x1
+24 L3 @0x0000fff923eafd20 eb00029f subs %x20 %x0 lsl $0x00 -> %xzr
+28 L3 @0x0000fff923eb00e8 54000340 b.eq $0x0000ffffa635049c
+32 L4 @0x0000fff923eafda0 14000000 b $0x0000ffffa6350438
END 0x0000ffffa635042c
linking new fragment F559721(0x0000ffffa635042c)
linking incoming links for F559721(0x0000ffffa635042c)
linking outgoing links for F559721(0x0000ffffa635042c)
linking F559721(0x0000ffffa635042c).0x0000ffff2417047c -> F127689(0x0000ffffa635049c)=0x0000ffff238c6de8
add incoming F559721(0x0000ffffa635042c).0x0000ffff2417047c -> F127689(0x0000ffffa635049c)
linking F559721(0x0000ffffa635042c).0x0000ffff24170480 -> F127682(0x0000ffffa6350438)=0x0000ffff238c6a68
add incoming F559721(0x0000ffffa635042c).0x0000ffff24170480 -> F127682(0x0000ffffa6350438)
priv_mcontext_t @0x0000fff92338d880
r0 = 0x0000000000000000
r1 = 0x000000000000000b
r2 = 0x000000000000000c
r3 = 0x0000000000000030
r4 = 0x000000000000005c
r5 = 0x0000000000003c05
r6 = 0x0000fff09015a9b8
r7 = 0xfefeff6f6071735e
r8 = 0x7f7f7f7f7f7f7f7f
r9 = 0x0000000000000000
r10 = 0x0101010101010101
r11 = 0x0000000000000028
r12 = 0x0000a701409d1276
r13 = 0x0000000000000040
r14 = 0x000000000000003f
r15 = 0x0000000000000000
r16 = 0x0000ffffa651dc00
r17 = 0x0000ffffa6bc4080
r18 = 0x0000000000000000
r19 = 0x0000000000000030
r20 = 0x0000fff090484208
r21 = 0x0000fff0901b94f8
r22 = 0x0000ffffa6600340
r23 = 0x0000000000000001
r24 = 0x0000000000000021
r25 = 0x0000fff09045a8f8
r26 = 0x0000000000000021
r27 = 0x0000000000000108
r28 = 0x0000000000000021
r29 = 0x0000fff106572880
r30 = 0x0000ffffa635068c
r31 = 0x0000fff106572880
q0 = 0xabababab abababab abababab abababab
q1 = 0x901b8f28 0000fff0 901b8f28 0000fff0
q2 = 0x9045a8f8 0000fff0 9045a8f8 0000fff0
q3 = 0x00000000 00000000 00000000 00000000
q4 = 0x00000000 00000000 00000000 00000000
q5 = 0x94683a38 0000fff0 94684d60 0000fff0
q6 = 0x00000000 00000000 00000000 00000000
q7 = 0x40100401 40100401 40100401 40100401
q8 = 0x00000000 00000000 00000000 00000000
q9 = 0x00000000 00000000 00000000 00000000
q10 = 0x00000000 00000000 00000000 00000000
q11 = 0x00000000 00000000 00000000 00000000
q12 = 0x00000000 00000000 00000000 00000000
q13 = 0x00000000 00000000 00000000 00000000
q14 = 0x00000000 00000000 00000000 00000000
q15 = 0x00000000 00000000 00000000 00000000
q16 = 0x01005555 00005040 01005555 00005040
q17 = 0x10000000 aa800010 00001000 a00a8000
q18 = 0x00100000 00000000 80000000 80200802
q19 = 0x00000300 00000000 00000000 00000000
q20 = 0x11111111 01111111 00000000 00000000
q21 = 0x00000000 10000000 00000000 00000000
q22 = 0x00000000 0c000000 00000000 00000000
q23 = 0x00000000 03000000 00000000 00000000
q24 = 0x00000000 00c00000 00000000 00000000
q25 = 0x00000000 00300000 00000000 00000000
q26 = 0x00000000 000c0000 00000000 00000000
q27 = 0x0c000000 00000000 00000000 00000000
q28 = 0x30000000 00000000 00000000 00000000
q29 = 0x0000000c 00000000 00000000 00000000
q30 = 0x03000000 00000000 00000000 00000000
q31 = 0x55555555 00015555 00000000 00000000
eflags = 0x0000000020000000
pc = 0x0000ffff240d3d8c
Entry into F559721(0x0000ffffa635042c).0x0000ffff24170460 (shared)
fcache_enter = 0x0000ffff22d10b80, target = 0x0000ffff2417045c
main_signal_handler: thread=1588266, sig=11, xsp=0x0000fff923c94da0, retaddr=0x000000000000000b
siginfo: sig = 11, pid = 264, status = 0, errno = 0, si_code = 1
x0 = 0x0000000000000000
x1 = 0x0000fff923c6e000
x2 = 0x000000000000000c
x3 = 0x0000000000000030
x4 = 0x000000000000005c
x5 = 0x0000000000003c05
x6 = 0x0000fff09015a9b8
x7 = 0xfefeff6f6071735e
x8 = 0x7f7f7f7f7f7f7f7f
x9 = 0x0000000000000000
x10 = 0x0101010101010101
x11 = 0x0000000000000028
x12 = 0x0000a701409d1276
x13 = 0x0000000000000040
x14 = 0x000000000000003f
x15 = 0x0000000000000000
x16 = 0x0000ffffa651dc00
x17 = 0x0000ffffa6bc4080
x18 = 0x0000000000000000
x19 = 0x0000000000000030
x20 = 0x0000fff090484208
x21 = 0x0000fff0901b94f8
x22 = 0x0000ffffa6600340
x23 = 0x0000000000000001
x24 = 0x0000000000000021
x25 = 0x0000fff09045a8f8
x26 = 0x0000000000000021
x27 = 0x0000000000000108
x28 = 0x0000000000000021
x29 = 0x0000fff106572880
x30 = 0x0000ffffa635068c
sp = 0x0000fff106572880
pc = 0x0000ffff2417046c
pstate = 0x0000000020000000
dcontext next tag = 0x0000ffff2417045c
computing memory target for 0x0000ffff2417046c causing SIGSEGV, kernel claims it is 0x0000000000000108
compute_memory_target: falling back to racy protection checks
opnd_compute_address for: (%x0,%x28,lsl #3)
base => 0x0000000000000000
index,scale => 0x0000000000000108
disp => 0x0000000000000108
For SIGSEGV at cache pc 0x0000ffff2417046c, computed target read 0x0000000000000108
faulting instr: ldr (%x0,%x28,lsl #3)[8byte] -> %x0
** Received SIGSEGV at cache pc 0x0000ffff2417046c in thread 1588266
record_pending_signal(11) from cache pc 0x0000ffff2417046c
not certain can delay so handling now
action is not SIG_IGN
translate context, thread 1588266 at pc_recreatable spot translating
Hi, @derekbruening. We are now ready and want to contribute the changes that unblock usage of DynamoRIO for JVM workloads. The internal company approval process was not so easy for us. :) To do that we need a public branch in the official DynamoRIO repository, e.g. i3733-bug-fixes. We will use that branch to deliver our commits and then send pull requests from it. Could you please help with that and create such a branch for the i3733 bug fixes? Thx, Kirill
That is great news. I've sent you an invite for commit privileges so you can create your own branches. Normally we create a new temporary branch for each PR.
@kuhanov I was curious if you are still planning to contribute your changes
Hi. In general we switched to the drcachesim collector. It is more stable and provides offline data for analysis. Its overhead is also much lower than that of our online collectors. One thing we would like to improve is the speed of converting raw data to a trace (the drraw2trace tool); currently it takes a lot of time. Thx, Kirill
But weren't all the issues you hit and the fixes you were going to contribute relating to the core of DR and so would be present in the drcachesim drmemtrace tracer too?
ok. I looked at what we have for core and ext in our local branch. There are 3 types of patches:
- features for enabling instruction mix: we added categories for grouping instructions;
- fixes;
- workarounds: these are not production solutions, but they unblocked us for collecting data for Java (we had limited resources to investigate deeper).
I suppose we could share these patches; maybe they could be added to the DynamoRIO project backlog.
Thanks, Kirill
Please share any bug fixes: otherwise someone else may hit the same problem and spend essentially wasted time re-debugging and re-fixing what is already sitting fixed in a private branch somewhere which is not a good situation. We ourselves may start running Java in the future and would not want to have to re-discover and re-fix all these things.
ok. I'll prepare review requests so that we can link to them. Or maybe there is a better way to share our patches? Kirill
Thank you. I think a PR is good even for the ones labeled workarounds where you're not sure if it's the proper long-term approach.
https://github.com/DynamoRIO/dynamorio/tree/i3733-jvm-bug-workarounds
Thanks. At a quick glance we have 8 changes:
We are seeing that SPECjvm 2008 runs won't even start the warm-up phase when launched with drrun. Typically SPECjvm runs may look like this:
With drrun we never get to this first message. I do see two threads running for a short period, but I am not convinced the run is successful since it never reaches the warm-up and execution phases of the test. Memory utilization is roughly 11GB, which is quite high for sparse.small.
Attached: debug-level-3 log for the java pid: java.log.zip
java.0.59824.zip