Closed dvv closed 13 years ago
That is when colorful output is not so good ;) Lemme recatch
info master started
info worker 0 spawned
info worker 1 spawned
info worker 0 connected
Error: EINVAL, Invalid argument
at IOWatcher.callback (net.js:878:24)
error worker 0 uncaught exception EINVAL, Invalid argument
warning worker 0 died
info worker 0 spawned
info worker 1 connected
info listening for connections
info worker 0 connected
info shutting down
warning kill(SIGKILL)
warning worker 0 died
warning worker 1 died
info shutdown complete
debug - exit
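Color codes like the ones that garbled the first paste can be stripped before sharing a log. A minimal sketch, assuming the colors are standard ANSI SGR escape sequences (ESC, `[`, digits and semicolons, `m`); `stripAnsi` is a hypothetical helper name, not something cluster provides:

```javascript
// Hypothetical helper; assumes standard ANSI SGR color sequences.
function stripAnsi(text) {
  // \x1b is the ESC byte; the rest matches e.g. "[90m", "[0m", "[1;31m".
  return text.replace(/\x1b\[[0-9;]*m/g, '');
}

// Sample line built by hand to resemble a colored log entry.
const colored = '\x1b[33mwarning\x1b[0m \x1b[90m- worker 0 died\x1b[0m';
console.log(stripAnsi(colored));
```

Piping the log through a filter like this before pasting keeps the level names and messages and drops only the escape sequences.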
I am also experiencing this issue. It seems that when the workers are spawned, one of them dies initially but then survives when it respawns and everything continues to operate normally. It also intermittently works without errors, but usually, there is an EINVAL, Invalid Argument error for at least one of the workers, causing it to respawn.
I'm on x64 Ubuntu 9.04 LTS with 2 cores using node v0.4.2 and cluster 0.3.3
almost makes me think that "listening" has improperly been emitted on Ubuntu, although I am not sure if this would be node, libev etc. I bet if you do a setTimeout() delaying the accept() calls it would work, if so I think this might be a node bug
Which line should I try wrapping in setTimeout()?
any ideas on this one ?
I've suspended usage for now, though I tend to treat the bug as a minor one: eventually, all required workers do get started :)
+1
Linux 2.6.31-302-ec2 #7-Ubuntu SMP Tue Oct 13 19:55:22 UTC 2009 x86_64 GNU/Linux
Current node from github (4a9f2de956)
Error: EINVAL, Invalid argument
at IOWatcher.callback (net.js:903:24)
experiencing same issue here (Ubuntu LTS, 32bit, 1 core.)
Having the same problems on Ubuntu 10.04 LTS (lucid) using node 0.4.2 and cluster 0.4.2 - any help would be appreciated.
the same problem with node 0.4.4, cluster 0.5.2 Linux 2.6.26-2-amd64 #1 SMP x86_64 GNU/Linux
Error: EINVAL, Invalid argument at IOWatcher.callback (net.js:905:24)
Same problem on Ubuntu 10.04.1
it seems the problem is that the file descriptor is not valid, this is the line in net.js that triggers the exception: var peerInfo = accept(self.fd);
Sure. This means accept is called before the fd has arrived at the worker over the IPC connection.
this is not in any way related to ubuntu. the testsuite also fails in arch linux.
good to know, just coincidence I guess
same problem on:
Linux dev2 2.6.18-194.3.1.el5.028stab069.6xen #1 SMP Wed May 26 18:35:38 MSD 2010 x86_64 x86_64 x86_64 GNU/Linux
cluster v0.5.5
So it's not limited to ubuntu after all.
Same issue here on debian 5.0
wish I could reproduce this at least once, might have to fire up a VM, gah
it's pretty easy to get those environments running with vagrant
yeah i know im just lazy :D
@dvv in lib/worker.js, around line 87, replacing the self.stdin.on('fd', ...) line with this made the problem go away for me:

    var self = this;
    self.stdin.on('fd', function(fd) {
      setTimeout(function() {
        self.server.listenFD(fd);
      }, 1000);
    });

So yeah, @visionmedia, your assessment sounds right.
Sure. I wonder whether nextTick() will do, or whether a real timeout is needed?
@mnutt hmm interesting, I took a look at net.js and saw this (from listenFD):

    Server.prototype._startWatcher = function() {
      this.watcher.set(this.fd, true, false);
      this.watcher.start();
      this.emit('listening');
    };

the emit() is immediate, so it's obviously expecting the syscalls to have occurred already, but since I spawn the workers before "listening" this makes perfect sense now, although kinda strange that it's fine on osx
we could try just spawning the workers after listening, should still be fine
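The ordering idea can be sketched in isolation. This is not cluster's code: `FakeMaster`, `spawnWorker`, and the `log` array are hypothetical names, and a plain `EventEmitter` stands in for the real server. It only shows workers being spawned from the 'listening' handler instead of before it.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a master process; not cluster's real API.
class FakeMaster extends EventEmitter {
  constructor(workerCount) {
    super();
    this.workerCount = workerCount;
    this.log = [];
    // Spawn workers only once the server reports that it is listening.
    this.once('listening', () => {
      for (let id = 0; id < this.workerCount; id++) {
        this.spawnWorker(id);
      }
    });
  }

  spawnWorker(id) {
    // A real master would fork a process and hand it the fd here.
    this.log.push('worker ' + id + ' spawned');
  }

  start() {
    this.log.push('master started');
    // A real master would bind the socket here, before emitting.
    this.log.push('listening for connections');
    this.emit('listening');
  }
}

const master = new FakeMaster(2);
master.start();
console.log(master.log.join('\n'));
```

Because `emit()` calls its listeners synchronously, the spawn entries always land after the "listening for connections" entry in this sketch.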
We experienced the same issue this weekend on our staging server. Additionally, the cluster was being monitored by god, which caused a never-ending respawn cycle. The circumstances in which this issue occurred are not clear to me though.
@lackac yeah we need a better way to detect this, currently I have a threshold on boot of 60 seconds (in NODE_ENV=production), within which cluster will consider it a cyclic restart since it should not be restarting that soon, and will delay the restart for another 60 seconds to at least provide some sort of control instead of recursively spawning as fast as possible
that being said I made some tweaks to cluster's internals that should fix this EINVAL problem so the next release might help
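The boot threshold described above can be sketched as a small decision function. This is an illustration only, not cluster's implementation: `restartDelay` and both constants are hypothetical names, and measuring from the time the worker was spawned is an assumption.

```javascript
// Hypothetical sketch; values assumed from the 60-second figures above.
const BOOT_THRESHOLD_MS = 60 * 1000;
const RESTART_DELAY_MS = 60 * 1000;

// Returns how many milliseconds to wait before respawning a dead worker.
function restartDelay(spawnedAt, diedAt) {
  const uptime = diedAt - spawnedAt;
  // Dying within the threshold is treated as a cyclic restart: back off.
  return uptime < BOOT_THRESHOLD_MS ? RESTART_DELAY_MS : 0;
}

console.log(restartDelay(0, 5000));   // worker died 5 s after spawn
console.log(restartDelay(0, 120000)); // worker died 2 min after spawn
```

A worker that dies 5 seconds after spawning gets a 60000 ms delay here, while one that ran for two minutes is respawned immediately (delay 0).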
+1 for next release to fix this
releasing riiiight now
@visionmedia we upgraded on our server to 0.6.1 after you released it and haven't seen this issue come up since. Thanks for the fix.
@lackac awesome sounds good, I will close this for now
I started using cluster.exception and mailbox is flooded with EINVALs. It wasn't happening before today, I was on 0.6.3. Started for seemingly no reason. I upgraded to 0.6.5 and still getting exceptions.
I'm on node 0.4.9, Ubuntu 11.04.
The error is:
Error: EINVAL, Invalid argument
at IOWatcher.callback (net.js:916:24)
damn! I found some more bugs so I'm releasing another patch-level soon
Updated to 0.6.8, so far no emails from exception, I'm cautiously optimistic :) Gonna report back by the end of the day or if I get those emails before that.
haha ok great, sounds good
Looks like EINVAL issue is solved. Or at least something else changed and it doesn't trigger, but I didn't really do anything. So, thanks! :)
interesting, that's good though :D
The simplest server test.js: Below is typical debug output when starting workers. What can be the reason?
info - master started
info - worker 0 spawned
info - worker 1 spawned
info - worker 2 spawned
info - worker 3 spawned
info - worker 0 connected
info - worker 3 connected
Error: EINVAL, Invalid argument
at IOWatcher.callback (net.js:878:24)
error - worker 0 uncaught exception EINVAL, Invalid argument
Error: EINVAL, Invalid argument
at IOWatcher.callback (net.js:878:24)
error - worker 3 uncaught exception EINVAL, Invalid argument
warning - worker 0 died
info - worker 0 spawned
warning - worker 3 died
info - worker 3 spawned
info - worker 1 connected
info - worker 2 connected
info - listening for connections
Error: EINVAL, Invalid argument
at IOWatcher.callback (net.js:878:24)
error - worker 2 uncaught exception EINVAL, Invalid argument
Error: EINVAL, Invalid argument
at IOWatcher.callback (net.js:878:24)
error - worker 1 uncaught exception EINVAL, Invalid argument
warning - worker 2 died
info - worker 2 spawned
warning - worker 1 died
info - worker 1 spawned
info - worker 3 connected
info - worker 0 connected
info - worker 1 connected
info - worker 2 connected
info - shutting down
warning - kill(SIGKILL)
info - shutdown complete
warning - worker 3 died
debug - exit