Open mgaudet opened 6 years ago
One other thing I should mention: when I ran the test suite, all the tests passed except the HLE tests.
Hmm. Are you attempting to execute forwards or backwards after setting the breakpoint? Can you run rr dump -p <trace-path> 1-832 and post that output somewhere?
This would be executing forwards
It's possible here that I hit the end of program recording. Though, if that is the case, it's not the common case, I think.
When I see this it's often because I have set a breakpoint on JIT compiled code
My guess is that we're trying to set a software breakpoint in memory that is unmapped, probably because you're reverse-executing and we execute through a region of time before the memory was mapped.
If you use hardware breakpoints (hbreak) instead of regular breakpoints, does the problem go away?
hbreak does seem to do it! I was experiencing this when executing forwards as well, but I am guessing it was crashing when the code pages were unmapped as the process shut down (looking at the recording, I can now see I never hit this breakpoint, so I just fly into the end of the recording).
Thanks!
Let's leave this open because rr should probably handle this in some reasonable way.
Something similar happened to me, but I can't work around it with hbreak because it happens with the implicit breakpoints set by reverse-* commands.
I don't want to fix this myself because I need to focus on work that I might eventually get paid for. But if someone else fixes it (with a test!), I'll gladly merge their PR.
I think it would be pretty easy to fix this in AddressSpace: silently ignore failure to place a software breakpoint, and whenever a new mapping is created, reapply all software breakpoints (or better still, just the ones that overlap the new mapping).
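The suggestion above can be illustrated with a minimal sketch. This is a toy Python model, not rr's C++ code: the class AddressSpaceModel and all of its methods are hypothetical names invented for illustration, and the only borrowed fact is that an x86 software breakpoint is the single byte 0xCC (int3). It shows a breakpoint staying pending when its address is unmapped and being written once an overlapping mapping appears.

```python
from dataclasses import dataclass, field

BREAKPOINT_INSN = 0xCC  # x86 int3 opcode


@dataclass
class AddressSpaceModel:
    """Toy stand-in for a tracee address space (hypothetical, not rr's API)."""
    # mapping start address -> bytearray holding that mapping's contents
    mappings: dict = field(default_factory=dict)
    # breakpoint address -> saved original byte, or None while pending
    breakpoints: dict = field(default_factory=dict)

    def _find_mapping(self, addr):
        for start, data in self.mappings.items():
            if start <= addr < start + len(data):
                return start, data
        return None

    def _try_write_breakpoint(self, addr):
        found = self._find_mapping(addr)
        if found is None:
            return False  # unmapped: stay pending instead of asserting
        start, data = found
        self.breakpoints[addr] = data[addr - start]  # save original byte
        data[addr - start] = BREAKPOINT_INSN
        return True

    def add_breakpoint(self, addr):
        # Record the breakpoint even if it cannot be written yet.
        if self.breakpoints.setdefault(addr, None) is None:
            self._try_write_breakpoint(addr)

    def map(self, start, contents):
        self.mappings[start] = bytearray(contents)
        end = start + len(contents)
        # Reapply only the breakpoints that overlap the new mapping.
        for addr in list(self.breakpoints):
            if start <= addr < end:
                self._try_write_breakpoint(addr)

    def unmap(self, start):
        data = self.mappings.pop(start)
        for addr in list(self.breakpoints):
            if start <= addr < start + len(data):
                self.breakpoints[addr] = None  # back to pending

    def is_armed(self, addr):
        return self.breakpoints.get(addr) is not None


space = AddressSpaceModel()
space.add_breakpoint(0x1002)           # address not mapped yet: no failure
print(space.is_armed(0x1002))
space.map(0x1000, b"\x90" * 16)        # mapping appears: breakpoint is written
print(space.is_armed(0x1002))
```

A real fix would also have to handle partial unmaps, remaps, and restoring the saved byte when a breakpoint is removed, none of which this sketch attempts.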
Using a VMware guest (configured as suggested) on an OS X host, I have been seeing this assertion pretty regularly, using 3a9e68ce2d4c688fe4c22c7a73ea9368fe09fcd7
(repeats a couple thousand times unfortunately)
Unfortunately, I don't have a reliable set of steps to reproduce on a generic program, though I have found that when I do encounter this issue, I encounter it again when repeating the same steps.
One thing I can point out that seems to be pretty common: When I see this it's often because I have set a breakpoint on JIT compiled code; in the above message, 0x3ad9a8252b80 is a code pointer where I set a breakpoint just prior.
Unlike in #2161 I don't get an RR backtrace, and the child seems to be dead immediately, so I've been unable to follow similar debugging steps.
I have run rr pack and archived the directory, and can pass it on in case that's desired.
(Honestly, the biggest bother so far about this bug has been the incredibly large spew of the same assertion failure)