nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
107.67k stars 29.63k forks source link

Add a cleaned up `child_processes` API that cleans the API up for child process spawning, like how `fs/promises` cleaned up `fs` #54799

Open dead-claudia opened 2 months ago

dead-claudia commented 2 months ago

What is the problem this feature will solve?

child_process is, in my experience, one of the most commonly wrapped APIs by far. It's especially common to wrap it in promise environments.

It all stems from a few major issues:

  1. Child processes have several one-time events, and none of them map cleanly to just one promise or even one promise sequence.
  2. It's non-trivial to create a wrapper that interops with streams more seamlessly.
  3. It's inherently error-prone to use directly, especially when it comes to shell script invocation on Windows.

What is the feature you are proposing to solve the problem?

I'm thinking of the following API, in child_process/promises:

Options and arguments:

The return value is a Promise that settles when the child terminates.

Additional classes in stream:

And in process:

Things I'm intentionally leaving out:

For a summary in the form of TypeScript definitions:

declare module "child_process/promises" {
    export interface WaitError extends Error {
        exitCode: number | undefined
        signalCode: string | undefined
    }

    type SignalSource =
        | AbortSignal
        | AsyncIterable<void>

    type SignalMap = {[Signal in NodeJS.Signals]?: SignalSource}

    type FdSource =
        | "close"
        | "null"
        | MessagePort
        | number
        | import("node:net").Socket
        | import("node:fs").FileHandle
        | Readable | Writable

    interface FdMap {
        [fd: number]: FdSource
        inherit?: boolean
    }

    interface Options {
        detached?: boolean
        cwd?: string
        env?: Record<string, string>
        argv0?: string
        uid?: number
        gid?: number
        signal: SignalMap
        ref?: boolean
        pathLookup?: boolean
        execPath?: string
        execArgv?: string
        fds: FdMap
    }

    interface ProcessHandle extends Promise<void> {
        readonly spawned: Promise<number>
    }

    export function exec(command: string | URL, options: Options): ProcessHandle
    export function exec(command: string | URL, args?: string[], options?: Options): ProcessHandle
    export function system(command: string, options: Options): ProcessHandle
    export function system(command: string, args?: string[], options?: Options): ProcessHandle
    export function fork(command: string | URL, options: Options): ProcessHandle
    export function fork(command: string | URL, args?: string[], options?: Options): ProcessHandle
}

declare module "stream" {
    declare class BufferReader extends Readable {
        constructor(source: BufferSource)
        constructor(source: string, encoding?: NodeJS.BufferEncoding)
        readonly bytesRead: number
    }

    declare class BufferWriter extends Writable {
        constructor(target: BufferSource)
        constructor(maxBytes: numbers)
        readonly bytesWritten: number
        consume(): Buffer
        toString(encoding?: NodeJS.BufferEncoding): string
    }

    interface Duplex {
        reader(): Readable
        writer(): Writable
    }
}

declare module "process" {
    export type InspectFDResult =
        | {kind: "file", readable: boolean, writable: boolean}
        | {kind: "socket", readable: true, writable: true, type: "stream-client" | "stream-server" | "dgram"}
        | {kind: "tty", readable: boolean, writable: boolean, rows: number, columns: number}
        | {kind: "ipc", readable: false, writable: false}
        | {kind: "unknown", readable: false, writable: false}

    export function ipc(fd?: number): MessagePort
    export function inspectFD(fd?: number): Promise<InspectFDResult>
}

What alternatives have you considered?

I considered:

benjamingr commented 2 months ago

What if we make the existing ChildProcess more ergonomic instead?

const child = spawn('ls', ['-lh', '/usr']); // reasonable, creates a child process
// Now I want to read its stdout and wait for it to close, this is verbose:
const output = Buffer.concat(await child.stdout.toArray()).toString();
// What if instead we could do:
const output2 = await child.text();
// Or as a one liner
console.log(await spawn('ls', ['-lh', '/usr']).text());

?

dead-claudia commented 2 months ago

What if we make the existing ChildProcess more ergonomic instead?

const child = spawn('ls', ['-lh', '/usr']); // reasonable, creates a child process

// Now I want to read its stdout and wait for it to close, this is verbose:
const output = Buffer.concat(await child.stdout.toArray()).toString();
// What if instead we could do:
const output2 = await child.text();

// Or as a one liner
console.log(await spawn('ls', ['-lh', '/usr']).text());

?

@benjamingr I also considered that (and forgot to list it) and I even tried that route first. However, I ultimately found it to be untenable:

  1. The current child_process API doesn't correctly escape stuff on Windows, and fixing Windows argument escaping would likely break cross-spawn. Thus, to avoid ecosystem fragmentation, a separate method is needed.
  2. The current semantics for options.stdio would require reserving several encoding keywords if you were to add that automatic string decoding. And sharing a namespace for child FD types and encoding strings just feels wrong.
  3. Part of the point of having new factories is to provide better defaults. For instance, always spawning without the shell window in Windows or at least doing that by default.
  4. And while this is a minor point, adding .wait() to every invocation in shell script-like contexts, possibly tens of times in just a single file, is incredibly inconvenient. One of my goals with this was to avoid needing to wrap calls in most cases, at the cost of slightly complicating advanced IPC cases.

Will note that the http.request/XMLHttpRequest to fetch shift was also an inspiration here.

aduh95 commented 2 months ago

This was also discussed back in https://github.com/nodejs/node/issues/38823

Also relevant: subprocess.readLines() proposal in https://github.com/nodejs/node/pull/45774

dead-claudia commented 2 months ago

Updated the proposal to make a number of revisions and simplifications. It's still similar, but different enough to merit a re-read.

One of my goals was to allow fork and system to be easily written as simple wrappers of exec. For example, here's system:

export function system(script, args, opts) {
    if (!Array.isArray(args)) {
        opts ??= args
        args = []
    }

    let {execPath, execArgv} = opts ?? {}

    if (process.platform === "win32") {
        execPath ??= process.env.COMSPEC || "cmd.exe"
        execArgv ??= ["/d", "/s", "/c"]
    } else {
        execPath ??= "sh"
        execArgv ??= ["-c"]
    }

    return exec(
        execPath,
        [...execArgv, ...args],
        {...opts, pathLookup: true},
    )
}

export function fork(script, args, opts) {
    if (!Array.isArray(args)) {
        opts ??= args
        args = []
    }

    let {execPath, execArgv, env, fds} = opts ?? {}

    execPath ??= "node"
    execArgv ??= []

    const normalized = {...fds}
    const ports = []

    for (const fd of Object.keys(normalized)) {
        if (!/^\d+$/.test(normalized)) continue
        const source = normalized[fd]
        if (!(source instanceof MessagePort)) continue
        if (isTransferred(source)) throw new Error("...")
        ports.push({fd, source})
    }

    let channelFds = ""

    for (const {fd, source} of ports) {
        // `convertToStream` is of course non-trivial
        normalized[fd] = convertToStream(port)
        channelFds += "," + fd
    }

    return exec(
        execPath,
        [...execArgv, ...args],
        {
            ...opts,
            pathLookup: true,
            env: channelFds ? {...env, NODE_CHANNEL_FD: channelFds.slice(1)} : env,
        }
    )
}