clustc / google-breakpad

Automatically exported from code.google.com/p/google-breakpad

blocking on sys_wait() #527

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Use CentOS 4.3.
2. Run multiple instances of the program.

What is the expected output? What do you see instead?

Expected:
    Breakpad should write a minidump.

Instead:

    The process hangs.

  I attached gdb to the hung process.
  The stack trace showed that the process was blocked in:

  linux_ptrace_dumper.cc: SuspendThread().

  Technically speaking, the crashing thread was blocked in sys_wait(), called from SuspendThread().

  I have tried every way I can think of to debug this,
  but I still could not find an acceptable solution.

  The bug reproduces only when multiple instances of the same program are running.

  It does not reproduce 100% of the time.

  This is really frustrating.
  I need some advice.

  Thank you.

What version of the product are you using? On what operating system?

 latest.

Please provide any additional information below.

Original issue reported on code.google.com by issac.xw...@gmail.com on 22 Apr 2013 at 1:17

GoogleCodeExporter commented 9 years ago

  I am not able to edit my post.

  I am using breakpad on x86-64.

Original comment by issac.xw...@gmail.com on 22 Apr 2013 at 1:26

GoogleCodeExporter commented 9 years ago
A few screenshots of the hung process.

Original comment by issac.xw...@gmail.com on 27 Apr 2013 at 9:49

Attachments:

GoogleCodeExporter commented 9 years ago
Here is a simple description of what happened:

Let us use "pc" to denote the crashed process.
In ExceptionHandler, a child process is cloned from "pc".
Let us denote this child process as "pc2".

pc2 then tries to suspend all threads of "pc" by calling SuspendThread() on each 
thread of pc.

SuspendThread() calls sys_ptrace() to stop the target thread,
and pc2 then waits for the target thread to stop by calling sys_waitpid().

In the normal case, sys_waitpid() should return as expected, since sys_ptrace() 
guarantees that the target will be stopped.

In my case, however, it seems that the target thread somehow does not actually stop,
so pc2 waits forever.
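
To make the sequence above concrete, here is a minimal sketch in C of the attach-then-wait step. It assumes glibc's ptrace() and waitpid() wrappers in place of Breakpad's sys_ptrace() and sys_waitpid(); the names suspend_thread, resume_thread and demo_suspend_child are hypothetical, and this is not the Breakpad source. Note that the demo has a parent tracing its own child, which is the opposite direction from pc2 tracing pc, chosen only so the sketch is easy to run.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Attach to one thread and block until the kernel reports it stopped.
 * Returns 0 on success, -1 on failure. */
static int suspend_thread(pid_t tid) {
    if (ptrace(PTRACE_ATTACH, tid, NULL, NULL) != 0)
        return -1;
    /* __WALL waits for any kind of child, including clone()d threads.
     * There is no timeout: if the stop is never reported, this blocks. */
    while (waitpid(tid, NULL, __WALL) < 0) {
        if (errno != EINTR) {
            ptrace(PTRACE_DETACH, tid, NULL, NULL);
            return -1;
        }
    }
    return 0;
}

/* Detach from the thread so that it can run again. */
static int resume_thread(pid_t tid) {
    return ptrace(PTRACE_DETACH, tid, NULL, NULL) == 0 ? 0 : -1;
}

/* Demo: fork a child, suspend it from the parent, then resume and reap it. */
static int demo_suspend_child(void) {
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        for (;;)
            pause();            /* child idles until it is killed */
    }
    int rc = suspend_thread(child);
    if (rc == 0)
        rc = resume_thread(child);
    kill(child, SIGKILL);
    while (waitpid(child, NULL, 0) < 0 && errno == EINTR) {
        /* retry if interrupted by a signal */
    }
    return rc;
}
```

In this sketch the waitpid() loop is the only place the tracer can block, which is consistent with the hang showing up inside SuspendThread().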

Yet the process status from "ps -ux" shows that the target thread is 
actually being traced, which means the target thread is stopped, but 
somehow the tracer is not aware of it.

How could a process be traced while the tracer's wait() never returns?
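
As a cross-check on the ps output, the tracing state can also be read from procfs. The following is a small sketch, assuming a Linux /proc/<pid>/status file that contains a TracerPid field; the helper name tracer_pid_of is hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>   /* getpid(), used in the usage note below */

/* Returns the TracerPid recorded in /proc/<pid>/status
 * (0 means "not being traced"), or -1 if it cannot be read. */
static long tracer_pid_of(pid_t pid) {
    char path[64];
    char line[256];
    long tracer = -1;

    snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    while (fgets(line, sizeof(line), f) != NULL) {
        /* The line looks like "TracerPid:<tab>1234". */
        if (sscanf(line, "TracerPid: %ld", &tracer) == 1)
            break;
    }
    fclose(f);
    return tracer;
}
```

For example, tracer_pid_of(getpid()) reports who is tracing the calling process, and passing the pid of a thread of pc would show whether pc2 is registered as its tracer.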

Original comment by issac.xw...@gmail.com on 27 Apr 2013 at 10:10

GoogleCodeExporter commented 9 years ago
I figured it out myself.

Somebody had installed a signal handler for SIGCHLD,
which is why sys_waitpid() was blocking there.

I have a patch.
Please see the attachment.

Original comment by issac.xw...@gmail.com on 30 Apr 2013 at 5:36

Attachments: