Closed stephenson closed 12 years ago
Saw this tonight as well. Pow was working fine. I ran a touch tmp/restart.txt, and noticed I could no longer resolve myapp.dev (it would sping for a bit before erroring out). Then my fans took off and I saw Pow eating up 99%
Me too :-/ Killing the pow process brought it back to normal, though.
I've had the same thing happen to me a few times. Killing it works to bring it back to normal.
Just happened to me too :/
Confirmed.
On top of the above, if the process is still running when the machine goes in to sleep (Mac Air in my case) the system becomes unusable on returning from sleep and a hard reset is required.
I have also been experiencing this. Twice this week an app has been hanging and pow at 100% CPU. Immediately after trying to kill the process via activity monitor my system locks up completely requiring a hard reboot.
Any word on a solution for this yet?
happening to me as well. Giving me wake from sleep issues. I have enough issues with sleep as it is... I don't need to pass these along to my Mac :-|
Same issue happening here. Have not dug in any deeper to see what the underlying issue is though.
Same for me as well, except I was able to get it up to 99.9%.
Has anyone addressed this issue? I've looked through the source briefly, but I can't seem to find where the leak is coming from..
it seems to be happening when i wake from sleep and/or change network settings. maybe it's related to ipfw?
Sent from my iPhone
On May 6, 2011, at 7:09 AM, sethvargoreply@reply.github.com wrote:
Same for me as well, except I was able to get it up to 99.9%.
Has anyone addressed this issue? I've looked through the source briefly, but I can't seem to find where the leak is coming from..
Reply to this email directly or view it on GitHub: https://github.com/37signals/pow/issues/99#comment_1111426
Also having this issue exactly as described above.
I'm also experiencing this a lot.
This still hits me 2-3 times per week with the latest version (0.3.1). As a workaround to my hard freezing issue when trying to terminate pow from activity monitor, kill -9 pid does the trick without locking my system up.
Force quitting seems to have fixed it.
pow immediately restarts after force quitting. I am running 0.3.1 and believe this is related to the sleeping when I open and close my macBook. I think jamillion my be on to something.
Hey folks,
I use Pow every day and haven't run into this issue yet. I believe it's only a problem for a subset of Pow users, but I understand how frustrating it must be for those of you who are affected.
Yesterday I was able to see the issue firsthand on my coworker @trevorturk's computer. We opened Activity Monitor and sampled the running Pow process: https://gist.github.com/a61f3f001bac42987203 Then we killed the Pow process to bring things back to normal.
I believe the problem must lie in the Node.js event loop, and I'm afraid further debugging is out of my league at this point. Pow itself does not have any unbounded loops -- everything happens in response to IO events. (You can see the node::IOWatcher::Callback
and ev_invoke_pending
calls in the trace above.)
So the best thing we can do for now is to collect as much data as possible when the problem occurs. What exactly are you doing when the issue arises? Does Pow have any child processes? What other applications are you running? Etc.
A few people have reported that this seems to be related to sleep/wake, but I suspect it's a red herring, as Trevor's computer had been running for several hours without sleeping.
Thanks for your patience and help.
Attempting to summon people who would be interested: @ry @felixge
@sstephenson, et al. My first thoughts here would be: what version of node.js is everyone running? Considering you haven't run into the issue yet, maybe we'll get lucky: i.e. it's already fixed in 0.4.8 and everyone is just running < 0.4.8.
@indexzero Thanks. Pow 0.3.0 and 0.3.1 ship with a Node v0.4.7 x64 binary:
~% file Library/Application\ Support/Pow/Current/bin/node
Library/Application Support/Pow/Current/bin/node: Mach-O 64-bit executable x86_64
~% Library/Application\ Support/Pow/Current/bin/node -v
v0.4.7
Direct link to the package: http://get.pow.cx/versions/0.3.1.tar.gz
@sstephenson Thanks. Is there anyway to force pow to use another version of node.js? As a first step for people on this thread, upgrading is prudent since v0.4.8 is out (and stable).
Obviously need more data here, but upgrading will reduce the surface area of bugs and potentially fix the issue. Also in v0.4.8 node "No longer compile out asserts and debug statements in normal build." so debugging will be a little easier.
If my memory serves me you can see that output from debug=true node myprogram.js
. This will help us find where in the javascript boundaries we are entering this state.
@indexzero I can't seem to find any documentation for the debug=true
environment variable. Got a link?
I'll push out a Pow 0.3.2 release with Node 0.4.8 soon.
@sstephenson It is not well documented, but if you look at this code: https://github.com/joyent/node/blob/master/lib/http.js#L32
if (process.env.NODE_DEBUG && /http/.test(process.env.NODE_DEBUG)) {
debug = function(x) { console.error('HTTP: %s', x); };
} else {
debug = function() { };
}
So it looks like I was wrong. It's NODE_DEBUG=true node myapp.js
@indexzero You need to specify the module in the ENV var so. NODE_DEBUG=http,nack
would output debug lines for the node's http module and nack.
@josh Thanks!
@sstephenson would be really helpful as we dive into this issue to enable this node debugging from pow.
It sounds like its sitting in a busy loop caused by level-triggered actions for an IOWatcher. i.e. the kernel keeps waking it up saying "that fd is writable!" but Pow is not disabling it somehow. My impression for the above discussion that this is a rare event? Understanding where in javascript it's looping would help.
If the process enters this state you can kill it with SIGUSR1 and connect to it on port 5858 with the debugger - which will allow you to get a javascript stacktrace.
This bit me a lot this morning -- node 0.4.8. Force-quit worked fine
@johnpaulashenfelter Thanks. Did you try hooking up the debugger?
I also had this issue on sleep-disabled Mac.
Upgrading to node v0.4.8 has not alleviated this issue for me. I have been hit with it twice today. Once when firing off a new page request and a second time spontaneously.
Alright, this issue just bit me. I killed Pow with USR1 and was able to connect via ndb:
ndb> bt
#00 dn_find(src=#<Buffer>, domain=0, msg=#<Buffer>, dnptrs=#<Array>, ndnptr=0, lastdnptr=2) /Users/sam/Dropbox/Projects/pow/node_modules/ndns/lib/ndns.js line 1109 column 4 (position 25451)
#01 ns_name_pack(src=#<Buffer>, srcn=0, dst=#<Buffer>, dstn=22, dstsiz=65513, dnptrs=#<Array>, lastdnptr=25) /Users/sam/Dropbox/Projects/pow/node_modules/ndns/lib/ndns.js line 832 column 8 (position 20331)
#02 ns_newmsg_rr(handle=#<ns_newmsg>, sect=1, name=#<Buffer>, type=1, rr_class=1, ttl=600, rdlen=4, rdata=#<Buffer>) /Users/sam/Dropbox/Projects/pow/node_modules/ndns/lib/ndns.js line 1663 column 6 (position 37214)
#03 #<ServerResponse>.writeOnce(buf=#<Buffer>, bufsiz=65535) /Users/sam/Dropbox/Projects/pow/node_modules/ndns/lib/ndns.js line 2526 column 9 (position 56344)
#04 #<ServerResponse>.sendTo(socket=#<DnsServer>, port=65466, host=127.0.0.1) /Users/sam/Dropbox/Projects/pow/node_modules/ndns/lib/ndns.js line 2540 column 16 (position 56645)
#05 #<ServerResponse>.send() /Users/sam/Dropbox/Projects/pow/node_modules/ndns/lib/ndns.js line 2605 column 7 (position 58099)
#06 #<DnsServer>.handleRequest(req=#<ServerRequest>, res=#<ServerResponse>) /Users/sam/Dropbox/Projects/pow/lib/dns_server.js line 37 column 18 (position 1657)
#07 #<DnsServer>.handleRequest(#<ServerRequest>, #<ServerResponse>) /Users/sam/Dropbox/Projects/pow/lib/dns_server.js line 3 column 63 (position 162)
#08 #<DnsServer>.emit(type=request, #<ServerRequest>, #<ServerResponse>) events.js line 67 column 17 (position 2560)
#09 #<DnsServer>.messageListener(msg=#<Buffer>, rinfo=#<Object>) /Users/sam/Dropbox/Projects/pow/node_modules/ndns/lib/ndns.js line 2640 column 8 (position 58906)
So it looks like this might be an issue with the bundled ndns library. I'll take a closer look soon.
Any updates on this issue?
I am also having this issue. Any resolution yet?
I've been using my gem, powify
to manage the server... When I notice that Pow is using a bunch of CPU, I just restart the server (powify server restart
) and it's back to normal in about 15 seconds. This is, of course, not a long-term fix and doesn't address the root of the problem... It is, however, easier than opening up activity monitor and/or killing the process manually.
Thanks for the tip! I'll try that out
On Jul 17, 2011, at 11:54 AM, sethvargoreply@reply.github.com wrote:
I've been using my gem,
powify
to manage the server... When I notice that Pow is using a bunch of CPU, I just restart the server (powify server restart
) and it's back to normal in about 15 seconds. This is, of course, not a long-term fix and doesn't address the root of the problem... It is, however, easier than opening up activity monitor and/or killing the process manually.Reply to this email directly or view it on GitHub: https://github.com/37signals/pow/issues/99#issuecomment-1591027
No problem!
Had this happen again after upgrading to 0.3.2. Killed pow and the next launch of it seems happy. Thanks for continuing to try and solve this.
If you see this happen again, please try to connect with the debugger and get a backtrace.
$ npm install -g ndb
$ kill -USR1 $insert_pow_pid_here
$ ndb
ndb> bt
Absolutely, will do!
I just upgraded about 30 minutes ago and got it again, this is the result of the bt
command:
#00 dn_find(src=#<Buffer>, domain=0, msg=#<Buffer>, dnptrs=#<Array>, ndnptr=0, lastdnptr=2) /Users/adam/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 1109 column 4 (position 25436)
#01 ns_name_pack(src=#<Buffer>, srcn=0, dst=#<Buffer>, dstn=26, dstsiz=65509, dnptrs=#<Array>, lastdnptr=25) /Users/adam/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 832 column 8 (position 20326)
#02 ns_newmsg_rr(handle=#<ns_newmsg>, sect=1, name=#<Buffer>, type=1, rr_class=1, ttl=600, rdlen=4, rdata=#<Buffer>) /Users/adam/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 1663 column 6 (position 37192)
#03 #<ServerResponse>.writeOnce(buf=#<Buffer>, bufsiz=65535) /Users/adam/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2526 column 9 (position 56322)
#04 #<ServerResponse>.sendTo(socket=#<DnsServer>, port=64943, host=127.0.0.1) /Users/adam/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2540 column 16 (position 56623)
#05 #<ServerResponse>.send() /Users/adam/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2605 column 7 (position 58077)
#06 #<DnsServer>.handleRequest(req=#<ServerRequest>, res=#<ServerResponse>) /Users/adam/Library/Application Support/Pow/Versions/0.3.2/lib/dns_server.js line 37 column 18 (position 1656)
#07 #<DnsServer>.handleRequest(#<ServerRequest>, #<ServerResponse>) /Users/adam/Library/Application Support/Pow/Versions/0.3.2/lib/dns_server.js line 3 column 63 (position 162)
#08 #<DnsServer>.emit(type=request, #<ServerRequest>, #<ServerResponse>) events.js line 67 column 17 (position 2560)
#09 #<DnsServer>.messageListener(msg=#<Buffer>, rinfo=#<Object>) /Users/adam/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2640 column 8 (position 58884)
@abachman, much appreciated. Back to the drawing board.
I upgraded a couple days ago and just got hit with this again.
#00 dn_find(src=#<Buffer>, domain=0, msg=#<Buffer>, dnptrs=#<Array>, ndnptr=0, lastdnptr=2) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 1109 column 4 (position 25436)
#01 ns_name_pack(src=#<Buffer>, srcn=0, dst=#<Buffer>, dstn=28, dstsiz=65507, dnptrs=#<Array>, lastdnptr=25) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 832 column 8 (position 20326)
#02 ns_newmsg_rr(handle=#<ns_newmsg>, sect=1, name=#<Buffer>, type=1, rr_class=1, ttl=600, rdlen=4, rdata=#<Buffer>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 1663 column 6 (position 37192)
#03 #<ServerResponse>.writeOnce(buf=#<Buffer>, bufsiz=65535) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2526 column 9 (position 56322)
#04 #<ServerResponse>.sendTo(socket=#<DnsServer>, port=58657, host=127.0.0.1) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2540 column 16 (position 56623)
#05 #<ServerResponse>.send() /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2605 column 7 (position 58077)
#06 #<DnsServer>.handleRequest(req=#<ServerRequest>, res=#<ServerResponse>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/lib/dns_server.js line 37 column 18 (position 1656)
#07 #<DnsServer>.handleRequest(#<ServerRequest>, #<ServerResponse>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/lib/dns_server.js line 3 column 63 (position 162)
#08 #<DnsServer>.emit(type=request, #<ServerRequest>, #<ServerResponse>) events.js line 67 column 17 (position 2560)
#09 #<DnsServer>.messageListener(msg=#<Buffer>, rinfo=#<Object>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2640 column 8 (position 58884)
I'm having this issue as well; glad we're getting close to solving it.
Hit with this again today. This backtrack may be redundant with yesterday's but here goes:
#00 dn_find(src=#<Buffer>, domain=0, msg=#<Buffer>, dnptrs=#<Array>, ndnptr=0, lastdnptr=2) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 1109 column 4 (position 25436)
#01 ns_name_pack(src=#<Buffer>, srcn=0, dst=#<Buffer>, dstn=22, dstsiz=65513, dnptrs=#<Array>, lastdnptr=25) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 832 column 8 (position 20326)
#02 ns_newmsg_rr(handle=#<ns_newmsg>, sect=1, name=#<Buffer>, type=1, rr_class=1, ttl=600, rdlen=4, rdata=#<Buffer>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 1663 column 6 (position 37192)
#03 #<ServerResponse>.writeOnce(buf=#<Buffer>, bufsiz=65535) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2526 column 9 (position 56322)
#04 #<ServerResponse>.sendTo(socket=#<DnsServer>, port=56835, host=127.0.0.1) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2540 column 16 (position 56623)
#05 #<ServerResponse>.send() /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2605 column 7 (position 58077)
#06 #<DnsServer>.handleRequest(req=#<ServerRequest>, res=#<ServerResponse>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/lib/dns_server.js line 37 column 18 (position 1656)
#07 #<DnsServer>.handleRequest(#<ServerRequest>, #<ServerResponse>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/lib/dns_server.js line 3 column 63 (position 162)
#08 #<DnsServer>.emit(type=request, #<ServerRequest>, #<ServerResponse>) events.js line 67 column 17 (position 2560)
#09 #<DnsServer>.messageListener(msg=#<Buffer>, rinfo=#<Object>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2640 column 8 (position 58884)
Third time this week:
#00 dn_find(src=#<Buffer>, domain=0, msg=#<Buffer>, dnptrs=#<Array>, ndnptr=0, lastdnptr=2) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 1109 column 4 (position 25436)
#01 ns_name_pack(src=#<Buffer>, srcn=0, dst=#<Buffer>, dstn=22, dstsiz=65513, dnptrs=#<Array>, lastdnptr=25) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 832 column 8 (position 20326)
#02 ns_newmsg_rr(handle=#<ns_newmsg>, sect=1, name=#<Buffer>, type=1, rr_class=1, ttl=600, rdlen=4, rdata=#<Buffer>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 1663 column 6 (position 37192)
#03 #<ServerResponse>.writeOnce(buf=#<Buffer>, bufsiz=65535) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2526 column 9 (position 56322)
#04 #<ServerResponse>.sendTo(socket=#<DnsServer>, port=57066, host=127.0.0.1) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2540 column 16 (position 56623)
#05 #<ServerResponse>.send() /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2605 column 7 (position 58077)
#06 #<DnsServer>.handleRequest(req=#<ServerRequest>, res=#<ServerResponse>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/lib/dns_server.js line 37 column 18 (position 1656)
#07 #<DnsServer>.handleRequest(#<ServerRequest>, #<ServerResponse>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/lib/dns_server.js line 3 column 63 (position 162)
#08 #<DnsServer>.emit(type=request, #<ServerRequest>, #<ServerResponse>) events.js line 67 column 17 (position 2560)
#09 #<DnsServer>.messageListener(msg=#<Buffer>, rinfo=#<Object>) /Users/chris/Library/Application Support/Pow/Versions/0.3.2/node_modules/ndns/lib/ndns.js line 2640 column 8 (position 58884)
I see this too. I'm unable to kill the process successfully with USR1 though, and haven't been able to get a backtrace. Will continue to try. I've seen this with 0.3.1 and 0.3.2, am on MacOS X Lion, on latest MacBook Air, using Pow with a Rails 3.0.4 app.
This occurred for me this morning. Using firefox primarily for a single test site. Killing the process didn't seem to resolve it though. It restarted back at 99% and stayed there. I did not notice any network or power-related trigger, it just started on its own. :/
This happens 2 - 5 times a day for me now. Almost to the point of switching back to webrick or passenger.
This is happening to me often as well, maybe 1-2 times per day, I'm on 0.3.1. I'll update to 0.3.2 and get a trace if/when it happens again.
Yeah, same for me it's happens almost once a day... painful :( (OS X 10.7.1, Ruby 1.9.2 (rvm), Pow 0.3.2)
I've pushed out a special prerelease build of Pow for those who continue to be plagued by this issue. In this build, ndns has been swapped out with the much simpler dnsserver.js library, which will hopefully solve the problem once and for all.
To install, run this command:
curl get.pow.cx | VERSION=0.4.0-pre sh
and report back here if you continue to have trouble. We'll roll this into the final 0.4.0 release if all goes well.
... and the app dos not work.