nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Support load balancing HTTP requests across worker threads #53074

Open mcollina opened 3 months ago

mcollina commented 3 months ago

I think we can support load balancing HTTP requests across worker threads by leveraging the same mechanism that allows cluster to do it. If a libuv handle can cross processes, it can definitely cross threads.

I've tried to do it outside of core in https://github.com/mcollina/tbal, but unfortunately it does not work because it requires some mangling of libuv handles that is only possible from within core.

The idea is to allow worker threads to have a libuv IPC channel similar to the one child processes have, so that they can exchange libuv handles through it.

This should be opt-in.
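
For context, this is roughly the existing process-level mechanism that cluster builds on and that the proposal would mirror for workers; a minimal sketch using the documented subprocess.send(message, sendHandle) API (file names are illustrative):

parent.js

const { fork } = require('child_process');
const net = require('net');

const server = net.createServer((socket) => {
  // The parent still accepts some of the connections on the shared handle.
  socket.end('HTTP/1.1 200 OK\r\ncontent-length: 17\r\n\r\nhandled by parent');
});

server.listen(9999, () => {
  // send(message, sendHandle): the listening handle crosses the IPC channel,
  // which is the same mechanism cluster builds on.
  fork(`${__dirname}/child.js`).send('server', server);
});

child.js

const http = require('http');

const app = http.createServer((req, res) => {
  res.end(`handled by pid ${process.pid}`);
});

process.on('message', (msg, server) => {
  if (msg !== 'server') return;
  // Connections accepted via the shared listening handle are fed to the local HTTP server.
  server.on('connection', (socket) => app.emit('connection', socket));
});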

juanarbol commented 3 months ago

Interesting. Could you please be a bit more specific with the implementation? Currently, moving sockets between workers is not possible.

theanarkh commented 3 months ago

Maybe we can achieve this by sharing the fd 🤔 (or by listening on the port in the main thread and distributing the fd to the child threads?).

server.js

const { Worker, isMainThread, parentPort, threadId } = require('worker_threads');
const os = require('os');
const net = require('net');
const http = require('http');

if (isMainThread) {
    // Create the listening TCP handle once in the main thread (internal net API).
    const handle = net._createServerHandle('127.0.0.1', 9999);
    if (typeof handle === 'number') {
        // _createServerHandle returns an error code (a number) on failure.
        console.error('exit', handle);
        process.exit();
    }
    // Spawn one worker per CPU and hand each of them the listening fd.
    for (let i = 0; i < os.cpus().length; i++) {
        new Worker(__filename).postMessage(handle.fd);
    }
} else {
    parentPort.on('message', (fd) => {
        console.log(`worker ${threadId}`);
        // Each worker creates its own HTTP server on top of the shared listening fd.
        http.createServer((_, res) => {
            res.end(`${threadId}`);
        }).listen({ fd });
    });
}

client.js

const http = require('http')
for (let i = 0; i < 10; i++) {
    http.get("http://127.0.0.1:9999", res => {
        let chunk
        res.on('data', (data) => {
            chunk = chunk ? Buffer.concat([chunk, data]) : data;
        });
        res.on('end', () => {
            console.log(chunk.toString())
        })
    })
}
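
If this works as intended, running server.js and then client.js should print a mix of thread ids: every worker listens on the same underlying file descriptor, so the kernel decides which listener accepts each connection, which also means the balancing is whatever the kernel gives you rather than true round-robin.
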
juanarbol commented 3 months ago

The thing is: that handle you create on the main thread is a socket (under the hood), and that socket/handle will be managed by your main event loop; you can't transfer that handle to another event loop.

theanarkh commented 3 months ago

All threads share fds within a process, so the main thread just needs to send the fd to the other worker threads; the worker threads then create a socket based on that fd.

mcollina commented 3 months ago

Sending the fd is possible, but we must tell the server in the main process that it should not respond to that fd, or do anything with it.

Plus the tricky part is managing the "shutdown" of the server.

flakey5 commented 3 months ago

If this is something y'all would be interested in pursuing, I'd be willing to take a swing at it 👀

mcollina commented 3 months ago

cc @nodejs/workers

I think this would be great.

gireeshpunathil commented 3 months ago

IIRC, events that libuv manages have extended / unused attributes. One idea could be to use one of those attributes to attach the owner-thread info. The loops of all the threads would poll for the event, but only the one whose thread id matches the thread-info field would process the callback; the rest (including the main thread) would ignore the event.

juanarbol commented 3 months ago

I think I wasn't considering something upfront.

https://github.com/libuv/libuv/blob/d2d92b74a8327daf9652a9732454937f3529bd30/src/unix/stream.c#L539

Every time uv_accept is called (at least on Unix), the first thing it does is check that the client's loop is the same as the server's loop.

Is this idea more like a reverse proxy with spawned workers in an IP range, where those workers write into the reverse proxy's socket? I'm very confused by this one.

Are the workers just responsible for creating the buffers and writing into the "main" process's client sockets? If that is the case, I would consider this possible to achieve, IMHO.

juanarbol commented 3 months ago

It may be possible to pass something like an option for "HTTP handlers" to the workers; they would compute your request, but the streams would remain up front. Not quite sure if that is what you would like to have.

flakey5 commented 1 month ago

Apologies for the delay, got a bit busy. Here's what I'm thinking based on y'all's comments and my (beginner) understanding of libuv.

For opting in, we add some workerThreads property to the options for createServer (for HTTP 1 and 2). This will tell it how many worker threads to create. I'd like for there to be a way to just use all of the available threads, but I feel like workerThreads: true might be a little weird syntax-wise.
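
Purely as a hypothetical sketch of the shape (a workerThreads option does not exist in Node.js today, and os.availableParallelism() is just one way to say "use everything"):

// Hypothetical sketch only: the `workerThreads` option does not exist in Node.js today.
const http = require('http');
const os = require('os');

const server = http.createServer(
  // Proposed opt-in: spawn a pool of worker threads and balance requests across them.
  { workerThreads: os.availableParallelism() },
  (req, res) => {
    res.end('handled by one of the worker threads');
  },
);

server.listen(9999);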

Two options that work around the issue that @juanarbol pointed out here with the loops needing to match up:

  1. Share the server stream across the different threads
    • I have no idea how possible this actually is, but if it is, it should have the intended result
    • This might be sketchy or slow depending on what needs to be done to make it thread safe
  2. What @juanarbol suggested, where the main thread handles the streams while the worker threads compute the request (see the sketch after this list)
    • I'd imagine the stream would hang until a timeout or an event from the worker thread (data or end)
    • Also not sure if this would provide the intended result, although it should still take a lot of load off the main thread ofc
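
A rough sketch of what option 2 could look like (illustrative file names; request bodies, backpressure, and error handling are all hand-waved): the main thread keeps the HTTP server and every socket, and only plain request/response data crosses to the workers over postMessage.

main.js

// main.js (illustrative) – the main thread owns every socket/stream; workers only compute responses
const http = require('http');
const os = require('os');
const { Worker } = require('worker_threads');

const workers = [];
for (let i = 0; i < os.availableParallelism(); i++) {
  workers.push(new Worker(`${__dirname}/worker.js`));
}

let next = 0;
let seq = 0;
const pending = new Map(); // request id -> pending ServerResponse

for (const worker of workers) {
  worker.on('message', ({ id, status, body }) => {
    const res = pending.get(id);
    pending.delete(id);
    res.writeHead(status).end(body);
  });
}

http.createServer((req, res) => {
  const id = seq++;
  pending.set(id, res);
  // Only plain data crosses the thread boundary; the socket never leaves the main loop.
  workers[next].postMessage({ id, method: req.method, url: req.url });
  next = (next + 1) % workers.length;
}).listen(9999);

worker.js

// worker.js (illustrative) – computes a response for each request description it receives
const { parentPort, threadId } = require('worker_threads');

parentPort.on('message', ({ id, method, url }) => {
  // Whatever expensive application work there is would happen here.
  parentPort.postMessage({ id, status: 200, body: `${method} ${url} handled by thread ${threadId}` });
});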

I think the first is worth a shot, but the second is the most doable.

Wdyt? Also, what criteria do we want to consider for deciding which thread to use?

jasnell commented 1 month ago

For 1... it is not actually possible or safe to share a stream across multiple threads. Nor is it actually possible to transfer ownership of a libuv stream handle from one thread / event loop to another. What you can do is pass the open file descriptor from one thread to another and perform all of the i/o on that fd on the destination thread. I think that's what is being suggested here. I would maintain that the main thread would best handle the actual listening and accepting of incoming requests but then use a round-robin strategy to transfer opened fd's to the pool of handling worker threads. The actual libuv stream would be attached to the fd by the destination thread.
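
Very roughly, the shape being described might look like this (illustrative file names; socket._handle.fd is an internal, Unix-only property, and the clean detach/hand-off of the handle is exactly the part that would need support inside core):

main.js

// main.js (illustrative) – accept in the main thread, hand the raw connection fd to a worker.
const net = require('net');
const os = require('os');
const { Worker } = require('worker_threads');

const workers = [];
for (let i = 0; i < os.availableParallelism(); i++) {
  workers.push(new Worker(`${__dirname}/worker.js`));
}

let next = 0;
net.createServer({ pauseOnConnect: true }, (socket) => {
  // Round-robin: give the accepted connection's fd to the next worker and
  // stop touching the socket on this thread. A real implementation would need
  // core support to cleanly detach the handle instead of leaking the JS wrapper.
  workers[next].postMessage(socket._handle.fd);
  next = (next + 1) % workers.length;
}).listen(9999);

worker.js

// worker.js (illustrative) – rebuild a socket from the fd and feed it to a local HTTP server
const http = require('http');
const net = require('net');
const { parentPort, threadId } = require('worker_threads');

const app = http.createServer((req, res) => res.end(`thread ${threadId}`));

parentPort.on('message', (fd) => {
  // All i/o on this fd now happens on this worker's event loop.
  const socket = new net.Socket({ fd, readable: true, writable: true });
  app.emit('connection', socket);
});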

The other approach is to continue having the main thread perform all of the actual i/o on the stream, deferring work to the worker thread via MessagePort or some other more efficient internal message passing mechanism. I think this is equivalent to what @juanarbol is suggesting. However, this approach would likely not be nearly efficient enough. I experimented with this model early on in the quic work and still may end up going with this kind of approach in the future for that, but I think for TCP sockets the first option (passing the fd to the worker thread) is probably the better approach.

> ... what criteria ... for deciding which thread to use?

I'd say round robin is the best approach. That is, maintain a pool of N workers; each request is dispatched to the next free thread in the list. When a thread is done processing a request or is waiting on async i/o to complete, it is returned to the pool to allow a new request to be picked up. If we run out of threads, requests get queued up for the next available. At any given time, each worker thread could be working on multiple concurrent requests.
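
In code, that dispatch policy is roughly a free list plus a queue (a sketch with illustrative names):

// Sketch of the dispatch policy described above: a free list of workers plus a queue of jobs.
class WorkerPool {
  constructor(workers) {
    this.free = [...workers]; // workers with spare capacity right now
    this.queue = [];          // jobs waiting for a worker to free up
  }

  dispatch(job) {
    const worker = this.free.shift();
    if (worker) {
      worker.postMessage(job);
    } else {
      this.queue.push(job); // out of workers: queue for the next available one
    }
  }

  // Called when a worker signals it can take more work
  // (finished a request, or parked on async i/o).
  release(worker) {
    const job = this.queue.shift();
    if (job) {
      worker.postMessage(job);
    } else {
      this.free.push(worker);
    }
  }
}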

mcollina commented 1 month ago

> For 1... it is not actually possible or safe to share a stream across multiple threads. Nor is it actually possible to transfer ownership of a libuv stream handle from one thread / event loop to another. What you can do is pass the open file descriptor from one thread to another and perform all of the i/o on that fd on the destination thread. I think that's what is being suggested here. I would maintain that the main thread would best handle the actual listening and accepting of incoming requests but then use a round-robin strategy to transfer opened fd's to the pool of handling worker threads. The actual libuv stream would be attached to the fd by the destination thread.

What you can transfer is the libuv handle, across an IPC channel. This mechanism is what makes cluster possible.

benjamingr commented 1 month ago

FWIW, the initial plan for what became worker_threads, back in 2015, was to eventually replace cluster's internals with worker_threads. Petka did early experiments that showed promise.

> For 1... it is not actually possible or safe to share a stream across multiple threads. Nor is it actually possible to transfer ownership of a libuv stream handle from one thread / event loop to another. What you can do is pass the open file descriptor from one thread to another and perform all of the i/o on that fd on the destination thread. I think that's what is being suggested here. I would maintain that the main thread would best handle the actual listening and accepting of incoming requests but then use a round-robin strategy to transfer opened fd's to the pool of handling worker threads. The actual libuv stream would be attached to the fd by the destination thread.

Yes, exactly that, but with workers - you and Matteo are agreeing.

bnoordhuis commented 2 weeks ago

> What you can transfer is the libuv handle, across an IPC channel. This mechanism is what makes cluster possible.

That's correct. There's an open (but stalled and in need of adoption) pull request to make handle transfer in the same process easier: https://github.com/libuv/libuv/pull/3018