google-code-export / appengine-devappserver2-experiment

Automatically exported from code.google.com/p/appengine-devappserver2-experiment

devappserver2 locks with a pthread_cond_wait: Resource busy #24

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Run the devappserver2 process for a period of time on a complex project 
2. A "pthread_cond_wait: Resource busy" occurs, and the server stops responding 
(but it can be killed with CTRL-C)

What is the expected output? What do you see instead?

One would expect this to not occur. :)

What version of the product are you using? On what operating system?

0.4 of devappserver2 on Mac OS X 10.8.3

Please provide any additional information below.

I am happy to capture and send more information on request, since I realize the 
above isn't much to go on.

Cheers

Original issue reported on code.google.com by brianmh...@gmail.com on 19 Jan 2013 at 2:13

GoogleCodeExporter commented 9 years ago
Do you see "pthread_cond_wait: Resource busy" in the context of an exception? 
If so, could you send the entire traceback?

Original comment by bquin...@google.com on 21 Jan 2013 at 3:25

GoogleCodeExporter commented 9 years ago

Original comment by bquin...@google.com on 21 Jan 2013 at 3:26

GoogleCodeExporter commented 9 years ago
Thanks for accepting.

The message is not printed in the context of an exception.

The line immediately before "pthread_cond_wait: ..." is simply the last log 
message printed to the screen. It appears unrelated because (a) the pthread 
message does not always follow the same log message; and (b) there is a lag 
between the two. I have observed roughly 30 seconds or more between the last 
request log message and the pthread message.

The issue seems to occur irrespective of user requests, i.e. spontaneously, and 
is not repeatable. Joy!

It also does not occur frequently; sometimes 8 hours and tens of thousands of 
requests pass without incident.

Is there anything I can do to capture information about the frozen process that 
may be of assistance, next time it occurs? (On OS X)

Original comment by brianmh...@gmail.com on 21 Jan 2013 at 3:42
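
One generic way to capture information from a hung Python process is to have it dump the Python-level stack of every thread when it receives a signal. The following is only a minimal sketch under stated assumptions: the helper names `format_thread_stacks` and `install_stack_dump_handler` are hypothetical, the code would have to be installed inside the process being debugged (nothing in this thread confirms that devappserver2 offers a hook for that), and the handler can only run once the interpreter's main thread gets back to executing Python code, so it may never fire if the hang is inside native code.

```python
import signal
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text dump of the current Python stack of every thread."""
    # Map thread ids to names so the dump is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for ident, frame in sys._current_frames().items():
        chunks.append("Thread %s (%s):\n" % (ident, names.get(ident, "unknown")))
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)


def install_stack_dump_handler(signum=None):
    """Write all thread stacks to stderr whenever `signum` is received.

    Hypothetical helper: must be called from the main thread of the
    process you want to inspect. Defaults to SIGUSR1 (POSIX only).
    """
    if signum is None:
        signum = signal.SIGUSR1

    def handler(received_signum, frame):
        sys.stderr.write(format_thread_stacks())

    signal.signal(signum, handler)
```

With the handler installed, sending the chosen signal to the process (for example with `kill -USR1` and its process id) would print the stacks to stderr, assuming the interpreter is still able to run Python code at that point.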

GoogleCodeExporter commented 9 years ago
1. Can you confirm that "pthread_cond_wait: Resource busy" is the exact message 
that you see?
2. You could try running dtruss to see what system call is generating the 
message but I'm not sure if that would be helpful.

Let me know if you have any ideas ;-)

Original comment by bquin...@google.com on 22 Jan 2013 at 11:24

GoogleCodeExporter commented 9 years ago
1. The exact message was "pthread_cond_wait: Resource busy".
2. I will report back with dtruss output next time this occurs.

It has not yet occurred in devappserver2 version 0.5; if it does I will post 
the details.

Original comment by brianmh...@gmail.com on 22 Jan 2013 at 2:41

GoogleCodeExporter commented 9 years ago
The issue recurred in devappserver2 v.0.5. Exact output was:

pthread_cond_wait: Resource busy
Abort trap: 6


This is the first time that the "Abort trap:" line has been printed, and the 
first time devappserver2 crashed. Previous occurrences of the pthread_cond_wait 
would result in locking up, and I would terminate the process with Ctrl-C.

Even though the output is different and the process now self-terminates, I 
assume that the underlying problem continues, being related to 
pthread_cond_wait.

I have attached the "Problem Details and System Configuration" from the Mac 
popup "Problem Report for Python", in case that may be of any assistance.

Original comment by brianmh...@gmail.com on 23 Jan 2013 at 12:42

Attachments:

GoogleCodeExporter commented 9 years ago
Sam, does the attachment mean anything to you? I'm surprised by the large 
number of threads.

brianmhunt: Are you using backends, background threads or anything exotic like 
that?

Original comment by bquin...@google.com on 24 Jan 2013 at 10:58

GoogleCodeExporter commented 9 years ago
I am using Flask, and a deferred process, but nothing that I would call exotic.

In case you suspect it may be Flask, here are a couple of baselines (to save 
you googling):

- https://github.com/kamalgill/flask-appengine-template
- http://f.souza.cc/2010/08/flying-with-flask-on-google-app-engine.html

Let me know if there is more info I can provide that may be of assistance.

Original comment by brianmh...@gmail.com on 24 Jan 2013 at 1:33

GoogleCodeExporter commented 9 years ago
Another incident occurred, with output `pthread_cond_wait: Resource busy` and a 
lock-up, i.e. the process did not terminate with `Abort trap: 6`.

Attached is the output of:

$ sudo dtruss -aces -p 56783 > dtruss-2013-01-29.txt 2>&1

Incidentally, the output without the `-s` option is:

gettimeofday(0x104621D70, 0x0, 0x4)      = 1359470321 0
select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0
select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0
select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0
select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0
select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0
select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0
select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0
select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0
select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0
select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0
select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0

... which I didn't think was particularly helpful.

I hope the above provides a little illumination. I'm always happy to try 
anything that may help.

Original comment by brianmh...@gmail.com on 29 Jan 2013 at 2:43

Attachments:
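
For anyone working through a saved trace like the one quoted above, a quick per-call count can show whether anything other than `select` appears in it. This is a rough sketch that assumes the plain `name(args) = return errno` line layout shown in this comment; output captured with extra dtruss columns would need a different pattern, and `summarize_trace` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines such as "select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0".
SYSCALL_LINE = re.compile(r"^\s*(?P<name>\w+)\(.*\)\s*=\s*(?P<ret>\S+)")


def summarize_trace(lines):
    """Count how often each system call name appears in the given lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_LINE.match(line)
        if match:  # silently skip headers and anything unparseable
            counts[match.group("name")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "gettimeofday(0x104621D70, 0x0, 0x4)      = 1359470321 0",
        "select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0",
        "select(0x0, 0x0, 0x0, 0x0, 0x104621D70)      = 0 0",
    ]
    for name, count in summarize_trace(sample).most_common():
        print(name, count)
```

To summarize a whole capture, pass an open file object instead of the sample list, since iterating over a text file yields its lines.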

GoogleCodeExporter commented 9 years ago
It may be worth noting that when I captured the above dtruss output I 
terminated the process with Ctrl-C after around 2-3 seconds.

To be more precise I probably should have used ulimit or GNU timeout. I was 
lazy. :)

Original comment by brianmh...@gmail.com on 29 Jan 2013 at 2:47
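
A capture of fixed length, in the spirit of the GNU timeout idea mentioned above, can also be scripted. The sketch below is one possible approach, not what was actually used in this thread: `run_for` is a hypothetical helper, it requires Python 3.3 or later for the `timeout` argument, and it assumes the traced command exits when it receives SIGTERM (a command run through `sudo` may need elevated privileges to be signalled).

```python
import subprocess


def run_for(cmd, seconds):
    """Run `cmd`, stop it after `seconds`, and return its combined output.

    stdout and stderr are merged, mirroring the `2>&1` redirection used
    in the shell command above. The result is returned as bytes.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    try:
        output, _ = proc.communicate(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.terminate()  # SIGTERM on POSIX; assumes the command honours it
        output, _ = proc.communicate()  # collect whatever was written
    return output


# Hypothetical usage with the command quoted earlier in the thread:
# data = run_for(["sudo", "dtruss", "-aces", "-p", "56783"], 3)
# with open("dtruss-capture.txt", "wb") as fh:
#     fh.write(data)
```

The three-second duration and the output file name in the usage comment are placeholders chosen for illustration.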

GoogleCodeExporter commented 9 years ago
This may be a red herring, but I have noted a reasonably consistent correlation 
that may be helpful. In the last four cases of this issue, it has always 
occurred after my computer has woken from its overnight sleep.

Original comment by brianmh...@gmail.com on 31 Jan 2013 at 2:29

GoogleCodeExporter commented 9 years ago
I think you can safely scratch my last comment. devappserver2 locked twice 
today while in use (i.e. the computer was not asleep or waking from sleep).

Original comment by brianmh...@gmail.com on 5 Feb 2013 at 12:30