hraban closed 2 years ago
I thought it was a race condition between the processes closing, their output being consumed, and the event loop exiting, but that may have been a red herring. When you reduce the number of subprocesses to 100, the weird behaviour goes away:
# Spawning 1000 processes:
$ sbcl --noinform --disable-debugger --load ~/.sbclrc --script examples/spawn.lisp | wc -l
243
# Spawning 100 processes:
$ sbcl --noinform --disable-debugger --load ~/.sbclrc --script examples/spawn.lisp | wc -l
100
Compare to this Node.js example:
const child_process = require('child_process');

// Spawn one child that sleeps for a random time, then echoes that time.
function spawn_one() {
  return new Promise((ok, bad) => {
    let r = Math.random() * 10;
    const c = child_process.spawn('bash', ['-c', `sleep ${r}; echo ${r}`]);
    c.stdout.on('data', d => console.log(d.toString().trim()));
    // 'close' fires after the process has ended and its stdio has closed.
    c.on('close', code => { code === 0 ? ok() : bad(code); });
  });
}

Promise.all(new Array(1000).fill(null).map(spawn_one));
Always returns:
$ node test.js | wc -l
1000
There may be a limit on concurrent spawns from libuv which is silently ignored by cl-async, and maybe node.js works around that by handling that libuv limit? Just a guess.
What do you think?
It seems to have to do with the soft limit on open file descriptors:
$ ulimit -S -n
256
$ echo 'require("child_process").exec("ulimit -S -n", (e, o, err) => { console.log(o.trim()); }), null' | node
1048575
Cf. https://github.com/nodejs/node/issues/40052.
For the purposes of this PR, I have lowered the number of child processes to 100, so it should always work on systems with reasonable limits.
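An alternative to lowering the total would be to cap how many children are alive at once, so the parent never holds more pipe descriptors than the soft limit allows. Below is a minimal Node.js sketch of that idea, not something from this PR: `runOne` and `runLimited` are hypothetical helper names, and the right `limit` for a given system is an assumption you would have to derive from `ulimit -S -n`.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: run one child and resolve with its trimmed stdout
// once the process has ended and its stdio streams have closed.
function runOne(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let out = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { out += chunk; });
    child.on('error', reject); // spawn failure, e.g. ENOENT
    child.on('close', (code) => {
      if (code === 0) resolve(out.trim());
      else reject(new Error(`child exited with code ${code}`));
    });
  });
}

// Hypothetical helper: `jobs` is an array of functions returning promises.
// At most `limit` of them are in flight at any time.
async function runLimited(jobs, limit) {
  const results = new Array(jobs.length);
  let next = 0;
  async function worker() {
    // Each worker pulls the next unclaimed job until none are left.
    while (next < jobs.length) {
      const i = next++;
      results[i] = await jobs[i]();
    }
  }
  const workers = Array.from({ length: Math.min(limit, jobs.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

For example, `runLimited(jobs, 50)` with 1000 jobs would still run all 1000 children, but only 50 at a time. Results come back in job order, not completion order.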
I will open a separate issue to discuss the failure mode: cl-async hitting a low ulimit seems fine, but silently ignoring that error is not ideal. I'm not clear on whether libuv passes the error to cl-async, or what exactly is going wrong. TBD.
Thanks for finding this and for your analysis. I'm kind of out of the lisp game these days, but if you find a way to surface the file descriptor error/condition, I'll happily merge that in. I think failing silently would be considered a bug.
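For comparison, this is roughly how a failed spawn becomes visible on the Node.js side. It is only a sketch of Node's behaviour, not of cl-async: whether the libuv return code reaches cl-async at all is still the open question here. `spawnChecked` is a hypothetical name. Node reports some spawn failures (such as ENOENT or EMFILE) through the child's `'error'` event and throws others synchronously, so the sketch handles both.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: resolve with the exit code, or reject with the spawn
// error instead of silently dropping the child.
function spawnChecked(command, args) {
  return new Promise((resolve, reject) => {
    let child;
    try {
      child = spawn(command, args);
    } catch (err) {
      reject(err); // some spawn failures are thrown synchronously
      return;
    }
    // Other failures (e.g. ENOENT, EMFILE) arrive as an 'error' event.
    child.on('error', reject);
    // After a failed spawn the stdio streams may not exist, so guard them.
    if (child.stdout) child.stdout.resume();
    if (child.stderr) child.stderr.resume();
    child.on('close', (code) => resolve(code));
  });
}
```

The caller then sees a rejected promise whose `code` property names the failure, which is the kind of signal a cl-async user would presumably want as a Lisp condition.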
Hi, I would like to ask for help with this PR; see the question in the code comment: how do you guarantee that a stdout data handler is called on all data before the event loop is cleaned up, even if the spawned process exits?
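As a point of reference, Node.js separates these two moments: `'exit'` can fire while the child's stdio streams are still open, whereas `'close'` fires only after the process has ended and its stdio streams have closed. The sketch below (Node.js, not cl-async; `collectLines` is a hypothetical name) treats the child as finished only on `'close'`, so every `'data'` chunk has been delivered by then. A cl-async fix would presumably need the equivalent: wait for both the exit callback and end-of-file on the stdout pipe before tearing down.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: resolve with the exit code and every stdout line,
// but only once the stdout pipe has been fully drained.
function collectLines(command, args) {
  return new Promise((resolve, reject) => {
    // stdin is not needed; stderr goes straight to the parent's stderr.
    const child = spawn(command, args, { stdio: ['ignore', 'pipe', 'inherit'] });
    let buffered = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { buffered += chunk; });
    child.on('error', reject);
    // 'close' (unlike 'exit') is emitted after the stdio streams have closed.
    child.on('close', (code) => {
      const lines = buffered.split('\n').filter((line) => line.length > 0);
      resolve({ code, lines });
    });
  });
}
```

Resolving on `'exit'` instead would reintroduce the race from the question, since output still sitting in the pipe could be dropped.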
Thank you