restatedev / sdk-typescript

Restate SDK for JavaScript/Typescript
MIT License

NodeJS Cluster support #431

Open mupperton opened 2 months ago

mupperton commented 2 months ago

I regularly make use of the NodeJS cluster module in "normal" API services; since JS is single-threaded, we want to make use of all available parallelism on servers that have multiple cores/threads
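For context, a minimal sketch of that pattern (the `serve` callback is a placeholder for whatever starts the HTTP listener, e.g. a Restate endpoint's listen call; it is not the actual SDK API): the primary process forks one worker per core, and each worker starts its own listener on the shared port.

```typescript
import cluster from "node:cluster";
import os from "node:os";

// How many workers to spawn: one per available core.
// (os.availableParallelism landed in Node 18.14; fall back to cpus().length.)
function workerCount(): number {
  return typeof os.availableParallelism === "function"
    ? os.availableParallelism()
    : os.cpus().length;
}

// `serve` stands in for the call that starts the service listener
// (hypothetical here, e.g. a Restate endpoint's listen()).
function startCluster(serve: () => void): void {
  if (cluster.isPrimary) {
    for (let i = 0; i < workerCount(); i++) {
      cluster.fork(); // each fork re-runs this module as a worker
    }
  } else {
    // Workers share the listening socket; the OS/primary distributes
    // *connections* (not individual requests) among them.
    serve();
  }
}
```

Note the comment on the last branch: the distribution unit is the TCP connection, which is exactly why this interacts badly with HTTP/2 as described below.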

However, this does not appear to work as expected for a restate-registered service

My observation is that requests from the restate server to the NodeJS service appear to have a "sticky session", or use some kind of keep-alive: requests in a short time span always go to the same worker process. From basic testing, it takes roughly ~90 seconds with no requests before another worker process is used instead, and of course that one then becomes the sticky worker until another ~90 seconds have passed

However this ultimately defeats the point of the cluster module: it is designed to improve concurrency, yet currently all concurrent requests are handled by the same worker

Likely this is a side effect of HTTP2 being used?

I haven't tried this with another runtime like Bun

igalshilman commented 1 month ago

I'm not familiar with the cluster module for Node, I'll take a look!

> Likely this is a side effect of HTTP2 being used?

This could be the case: a single TCP connection is established, and invocations are then multiplexed within it as h2 streams.

Meanwhile I'd like to propose some alternatives:

More pods

Alternatively, if you are running on bare metal, consider deploying more NodeJS processes with Nginx/Caddy as a reverse proxy in front of them (all in the same box, reverse proxying to localhost)
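As a rough sketch of the reverse-proxy alternative with Caddy (the ports and the h2c details are assumptions, not verified against Restate's requirements), a Caddyfile could look like:

```
{
    # Allow Caddy to accept cleartext HTTP/2 (h2c) from the Restate server
    servers {
        protocols h1 h2c
    }
}

# Spread incoming connections across two local Node processes,
# speaking h2c to the backends as well (ports are hypothetical)
http://:9080 {
    reverse_proxy h2c://127.0.0.1:9081 h2c://127.0.0.1:9082
}
```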

igalshilman commented 1 month ago

One additional thought:

igalshilman commented 1 month ago

Can confirm that this is indeed due to HTTP2: the cluster module load balances per physical TCP connection, while HTTP2 keeps a single TCP connection and multiplexes the streams on it.
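A small self-contained demo of that behavior (plain node:http2, no Restate involved): five concurrent requests from one client produce five streams but only one TCP session on the server, so connection-level load balancing like cluster's only ever sees a single connection.

```typescript
import http2 from "node:http2";

let sessions = 0; // incremented once per TCP connection (h2 session)
let finished = 0; // completed request streams

const server = http2.createServer((_req, res) => res.end("ok"));
server.on("session", () => sessions++);

server.listen(0, () => {
  const { port } = server.address() as { port: number };
  const client = http2.connect(`http://127.0.0.1:${port}`);
  for (let i = 0; i < 5; i++) {
    // Each request opens a new h2 stream on the SAME connection.
    const stream = client.request({ ":path": "/" });
    stream.resume(); // discard the response body
    stream.on("end", () => {
      if (++finished === 5) {
        // prints "streams: 5, tcp sessions: 1"
        console.log(`streams: ${finished}, tcp sessions: ${sessions}`);
        client.close();
        server.close();
      }
    });
    stream.end();
  }
});
```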

I've tried to look at ways to deal with this, and it seems they would require (pretty complicated) application-side load balancing. Let me know if the alternative approaches are enough.

mupperton commented 1 month ago

Thanks @igalshilman - my use case is not really CPU bound; I mainly want a single Node service with many handlers to have better parallelism for handling multiple requests concurrently, since Node is single-threaded, so the Worker API would probably perform worse

We can try multiple pods and verify our network load balancing is working

mupperton commented 4 weeks ago

Our pod scaling and load balancing is working as expected

I'll leave it up to you whether it's worth keeping this issue open if there is a chance you may consider supporting this in the future; otherwise feel free to close