Deadlock in "thread step-over" after other thread exits #32964

Open Quuxplusone opened 7 years ago

Quuxplusone commented 7 years ago
Bugzilla Link PR33992
Status NEW
Importance P enhancement
Reported by Tamas Berghammer (tamas@hudson-trading.com)
Reported on 2017-07-30 06:41:18 -0700
Last modified on 2017-07-31 11:20:29 -0700
Version unspecified
Hardware PC All
CC jingham@apple.com, llvm-bugs@lists.llvm.org
Compile the following program with "clang++ -std=c++11 t.cpp -g -pthread":
#include <iostream>
#include <thread>

void foo() {
  std::cout << 1 << std::endl;
  std::cout << 2 << std::endl;
}

int main() {
  std::thread t1(foo);
  std::thread t2(foo);

  t2.join();
  t1.join();
}

* Set a breakpoint at foo ("breakpoint set -n foo")
* Start running the application and expect both threads to hit the breakpoint
during the same stop. If that doesn't happen (it is possible), start the
application again or continue the other threads until 2 threads are stopped at
the breakpoint.
* Continue 1 of the threads using "thread continue 2" (2 is the thread index)
* The continued thread will exit, but LLDB won't get any new stop reason, so it
won't go back to a state where the application is considered to be stopped
(working as intended)
* Hit "Ctrl-C" to stop the application. At this point thread 3 is expected to
still be at the breakpoint.
* Select thread 3 using "thread select 3"
* Try to execute "thread step-over", which will deadlock/livelock somewhere
inside LLDB and won't return. If we interrupt the thread again with "Ctrl-C", it
seems to be stuck at the same breakpoint even though we asked it to continue.
Quuxplusone commented 7 years ago

I tried this, and it does indeed stall as you describe. That is because your "thread continue 2" told only that thread to continue. So we suspended all the other threads and continued. Once the thread you let run exits, all the remaining threads have been suspended and so nobody is going to make progress. When I got into this state, I then did:

(lldb) thread continue 1 3

and (at least on macOS) the program runs to completion. You can also use "thread resume" to the same effect.

lldb doesn't currently keep track of the intent for a "continue" operation. It would need to do that to know "hey, you were exclusively running thread A, but it no longer exists so I should let the others run". It doesn't make a distinction between the suspend implied by "thread continue" and an explicit "thread suspend" command. Note, it does keep track of "internal suspends" like for single-stepping a thread. But all user-directed suspends are treated equally, and we don't undo them lightly.
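
To make the stall concrete, here is a minimal C++ sketch of the state described above. It is a toy model, not LLDB's actual data structures: ThreadModel, ProcessModel and all of their members are hypothetical names invented for illustration. It only assumes what the comment says, namely that continuing one thread leaves every other thread user-suspended, and that nothing undoes those suspends when the running thread exits.

```cpp
#include <cassert>
#include <map>

// Hypothetical model for illustration only; these are not LLDB types.
struct ThreadModel {
  bool user_suspended = false;
};

struct ProcessModel {
  std::map<int, ThreadModel> threads; // key: thread index

  // Models "thread continue <idx>": only that thread may run,
  // every other thread is left user-suspended.
  void continueOnly(int idx) {
    assert(threads.count(idx) == 1 && "thread index must exist");
    for (auto &entry : threads)
      entry.second.user_suspended = (entry.first != idx);
  }

  // Models a thread exiting. Note that nothing here resumes the others,
  // because the model keeps no record of why they were suspended.
  void threadExited(int idx) { threads.erase(idx); }

  // True when at least one thread exists and none of them is allowed to run.
  bool allSuspended() const {
    if (threads.empty())
      return false;
    for (const auto &entry : threads)
      if (!entry.second.user_suspended)
        return false;
    return true;
  }

  // Models explicitly resuming every remaining thread.
  void resumeAll() {
    for (auto &entry : threads)
      entry.second.user_suspended = false;
  }
};
```

With threads 1, 2 and 3 in the model, calling continueOnly(2) and then threadExited(2) leaves allSuspended() true, which is the state where nobody makes progress. Calling resumeAll() plays the role of the explicit resume described above.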

I think the simplest solution to this is that when you do a step operation on a thread that has been suspended by the user, lldb should prompt you to resume the thread before continuing. Similarly, if you are about to resume the inferior but lldb has suspended all the currently extant threads, it should warn and offer to resume them.
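
The two checks proposed here could look roughly like the C++ sketch below. This is only an illustration of the idea, not code from LLDB: ThreadState, ResumeCheck and checkBeforeResume are hypothetical names, and the convention of passing -1 for a plain resume is an assumption made for the example.

```cpp
#include <vector>

// Hypothetical per-thread state for illustration; not an LLDB type.
struct ThreadState {
  int index;
  bool user_suspended;
};

// Possible outcomes of the pre-resume check.
enum class ResumeCheck { Ok, SteppingSuspendedThread, AllThreadsSuspended };

// stepping_index is the index of the thread the user asked to step,
// or -1 (assumed convention) when the whole inferior is being resumed.
ResumeCheck checkBeforeResume(const std::vector<ThreadState> &threads,
                              int stepping_index) {
  bool any_runnable = false;
  for (const ThreadState &t : threads) {
    // Stepping a thread the user suspended: prompt to resume it first.
    if (t.index == stepping_index && t.user_suspended)
      return ResumeCheck::SteppingSuspendedThread;
    if (!t.user_suspended)
      any_runnable = true;
  }
  // Threads exist but every one of them is suspended: warn before resuming.
  if (!threads.empty() && !any_runnable)
    return ResumeCheck::AllThreadsSuspended;
  return ResumeCheck::Ok;
}
```

In the scenario from this report, threads 1 and 3 are both user-suspended after thread 2 exits, so a "thread step-over" on thread 3 would map to SteppingSuspendedThread and a plain continue would map to AllThreadsSuspended.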

Might also be good to mark user-suspended threads in "thread list". That would have made what was going on here clearer.