LLDB on Mac OS X crashes while running TestLongjmp

Quuxplusone commented 11 years ago


Bugzilla Link	PR16769
Status	NEW
Importance	P normal
Reported by	Daniel Malea (daniel.malea@intel.com)
Reported on	2013-08-01 14:12:48 -0700
Last modified on	2014-09-23 04:29:20 -0700
Version	unspecified
Hardware	Macintosh MacOS X
CC	mhogstrom@emailgroups.net
Fixed by commit(s)
Attachments	`longjump_patch.diff` (1825 bytes, text/plain) `longjump_patch.diff` (1869 bytes, text/plain)
Blocks
Blocked by
See also

To reproduce, remove the @skipIfDarwin decorator from TestLongjmp.py and run:

python dotest.py --executable <path-to-lldb> -p TestLongjmp

The crash is due to one of two assertion failures:

Assertion failed: (!"Didn't get sequence mutex for read register."), function ReadRegisterBytes, file /Users/daniel/dev/lldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext.cpp, line 255

or

Assertion failed: (m_failure_message.empty()), function Unlock, file /Users/daniel/dev/lldb/source/Host/common/Mutex.cpp, line 353

The problem does not seem reproducible on Linux, but happens on Mac OS X
regardless of the test compiler used (I tried ICC and Clang).

Quuxplusone commented 10 years ago

I analysed the bug but was unable to reproduce the crash.
I did, however, find a flaw in the TestLongjmp test, which lives under:

llvm/tools/lldb/test/functionalities/longjmp/

// snippet from the test code

void do_jump(void)
{
    // We can't let the compiler know this will always happen or it might make
    // optimizations that break our test.
    if (!clock())
        longjmp(j, 1); // non-local goto
}

On my Mac OS X 10.9.5 machine, the longjmp is never called.

(lldb) dis
a.out`do_jump at longjmp.c:16:
   0x100000ee0:  pushq  %rbp
   0x100000ee1:  movq   %rsp, %rbp
   0x100000ee4:  callq  0x100000f5c               ; symbol stub for: clock
-> 0x100000ee9:  cmpq   $0x0, %rax
   0x100000eef:  jne    0x100000f06               ; do_jump + 38 at longjmp.c:21
   0x100000ef5:  leaq   0x134(%rip), %rdi
   0x100000efc:  movl   $0x1, %esi
   0x100000f01:  callq  0x100000f62               ; symbol stub for: longjmp
   0x100000f06:  popq   %rbp
   0x100000f07:  retq

a.out`do_jump + 40:
   0x100000f08:  nopl   (%rax,%rax)
(lldb) register read rax
     rax = 0x000000000000149b
(lldb)

jne (equivalent to jnz) jumps when the compared value is not zero, so with
rax = 0x149b the branch skips the call to longjmp.

On my machine, clock() always returns a positive integer.
The documentation for clock() says:

"The clock() function determines the amount of processor time used since the
invocation of the calling process, measured in CLOCKS_PER_SECs of a second."

I tried clock() on Linux in a virtual machine. It returns 0, and longjmp is
called. I then tried it on Linux on an old PowerPC Mac I have, with the same
result: it returns 0. I think the test does not account for different clock
resolutions. The Mac appears to have a high-resolution timer, so clock() is
already non-zero when do_jump runs and the test skips the call to longjmp,
which does not happen on the other platforms. I would suggest replacing the
call to clock() with a test variable declared volatile.

I changed the code to:

volatile int WillEnter = 1;

void do_jump(void)
{
    // We can't let the compiler know this will always happen or it might make
    // optimizations that break our test.
    if (WillEnter)
        longjmp(j, 1); // non-local goto
}

This way longjmp is always called, and all 3 tests now pass under
Mac OS X 10.9.5. Someone must have fixed the error handling in lldb, because
it no longer crashes. Nevertheless, the test itself should be corrected.

I saw that the tests are not run on FreeBSD either;
it might be worth rerunning them to see whether they also pass there.

Quuxplusone commented 10 years ago

Attached longjump_patch.diff (1825 bytes, text/plain): Proposed correction for the test case

Quuxplusone commented 10 years ago

Attached longjump_patch.diff (1869 bytes, text/plain): Proposed correction for the test case - v2

Quuxplusone / LLVMBugzillaTest

LLDB on Mac OS X crashes while running TestLongjmp #16768