orthecreedence / cl-async

Asynchronous IO library for Common Lisp.
MIT License
275 stars 40 forks source link

docs: new example: spawn.lisp #192

Closed hraban closed 2 years ago

hraban commented 2 years ago

Hi, I would like to ask for help with this PR, see the question in the code comment: how do you guarantee a stdout data handler is called on all data, before the event loop is cleaned up, even if the spawned process exits?

Thank you

hraban commented 2 years ago

I thought it was a race condition between process close, consuming their output, and the event loop exiting, but that may have been a red herring. When you reduce the number of subprocesses to 100 you don't get this weird behaviour:

# Spawning 1000 processes:
$ sbcl --noinform --disable-debugger --load ~/.sbclrc --script examples/spawn.lisp | wc -l
243
# Spawning 100 processes:
$ sbcl --noinform --disable-debugger --load ~/.sbclrc --script examples/spawn.lisp | wc -l
100

Compare to node.js example

const child_process = require ('child_process');

function spawn_one() {
    return new Promise((ok, bad) => {
        let r = Math.random()*10;
        const c = child_process.spawn('bash', ['-c', `sleep ${r}; echo ${r}`]);
        c.stdout.on('data', d=>console.log(d.toString().trim()));
        c.on('close' , code => {code === 0 ? ok() : bad(code)})
    });
}
Promise.all(new Array(1000).fill(null).map(spawn_one));

Always returns:

$ node test.js | wc -l
1000

There may be a limit on concurrent spawns from libuv which is silently ignored by cl-async, and maybe node.js works around that by handling that libuv limit? Just a guess.

What do you think?

hraban commented 2 years ago

It seems to have to do with the soft limit on open file descriptors:

$ ulimit -S -n
256
$ echo 'require("child_process").exec("ulimit -S -n", (e , o , err) => { console.log(o.trim()); } ), null' | node
1048575

Cf "https://github.com/nodejs/node/issues/40052".

For the purposes of this PR , I have lowered the number of child processes to 100, so it should always work on systems with reasonable limits.

I will open a separate issue to discuss the failure mode (cl-async hitting a low ulimit seems fine, but silently ignoring that error is not ideal. I'm not clear on whether libuv passes the error to cl-async, or what exactly is going wrong. tbd)

orthecreedence commented 2 years ago

Thanks for finding this and for your analysis. I'm kind of out of the lisp game these days, but if you find a way to surface the file descriptor error/condition, I'll happily merge that in. I think failing silently would be considered a bug.