nathanschwarz / meteor-cluster

Worker pool for Meteor using the Node.js native `cluster` module
MIT License

`MasterCluster` worker persistence tooling #9

Closed wallind closed 2 years ago

wallind commented 3 years ago

Curious to hear your thoughts on adding an option I think I need for my use case. Alternatively, perhaps you can educate me as to why what I'm asking for is stupid 🙃.

I'd like to add a Boolean option, `persistWorkers`, to the second parameter of `MasterCluster`, which would make the system eagerly start all associated workers immediately and not shut them down while waiting for jobs. In my use case, I'm trying to thread an intensive, event-driven computation over a large number of Objects (I can give more detail if need be).
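
Roughly what I have in mind, as a sketch (the import path, task map, and `crunch` helper are placeholders; `persistWorkers` is the proposed flag):

```js
import { MasterCluster } from 'meteor/nschwarz:cluster' // placeholder import path

const crunch = (data) => data // stand-in for the real computation

const taskMap = {
  HEAVY_JOB: (job) => crunch(job.data) // placeholder task handler
}

// `persistWorkers: true` is the proposed flag: start every worker
// immediately and keep them alive between jobs instead of on demand.
const cluster = new MasterCluster(taskMap, { persistWorkers: true })
```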

I noticed that not having the workers already started introduces a fairly static 3-4 second delay before the system can process the data. This is undesirable, as I'd like most requests to be processed in under 10s.

I am prepared to make a PR for this as I already have a locally built copy of my patch running in my project.

Additionally, aside from my request, I just want to say that this package appears to be doing exactly what I need it to: speeding up the processing of the task by a factor of N (N being the number of CPUs I can dedicate). Really cannot thank you enough for building this 🙏

wallind commented 3 years ago

For the record, all I did here to test that persisting the workers solved my problem was manipulate this one line. In a formal PR, though, as I mentioned, I'd add it as a param. [screenshot]

nathanschwarz commented 3 years ago

Hey there! This is actually a feature that I wanted to implement (just never had the time). Though if I may, a better option would be an Object such as:

```js
{
  eagerLoad: Int,
  keepAlive: 'default' || 'always' || time_in_ms
}
```

`eagerLoad` would be the number of CPUs you want to launch at startup.

- `keepAlive: 'default'` => shut down when no tasks are in the queue (the current implementation)
- `keepAlive: 'always'` => never shut down
- `keepAlive: time_in_ms` => if no job is found, wait the indicated time before trying again; at the end of the cooldown, if no job is found, shut down

The default config would be `{ eagerLoad: 0, keepAlive: 'default' }`.
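
Roughly, the worker-side loop under these semantics could look like this (illustrative only: `queue`, `pull`, `isEmpty`, and the poll interval are not the current internals):

```js
const POLL_INTERVAL = 500 // ms, arbitrary
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Illustrative worker loop for the proposed keepAlive semantics.
async function workerLoop(queue, keepAlive) {
  for (;;) {
    const job = await queue.pull() // hypothetical queue API
    if (job) {
      await job.run()
      continue
    }
    if (keepAlive === 'always') {
      await sleep(POLL_INTERVAL) // never shut down, just poll again
    } else if (typeof keepAlive === 'number') {
      await sleep(keepAlive) // cooldown before one last check
      if (await queue.isEmpty()) return // still no job: shut down
    } else {
      return // 'default': shut down as soon as the queue is empty
    }
  }
}
```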

To have even more fine-grained control, we could have an array of those config objects to handle multiple core "layouts" (which could be modified at runtime if need be).

We could split the work into two PRs (we'll handle the config array in the second). If you're willing to do it, that would be great 👍 😊.

wallind commented 3 years ago

> Hey there! This is actually a feature that I wanted to implement (just never had the time). Though if I may, a better option would be an Object such as:
>
> ```js
> {
>   eagerLoad: Int,
>   keepAlive: 'default' || 'always' || time_in_ms
> }
> ```
>
> `eagerLoad` would be the number of CPUs you want to launch at startup.
>
> - `keepAlive: 'default'` => shut down when no tasks are in the queue (the current implementation)
> - `keepAlive: 'always'` => never shut down
> - `keepAlive: time_in_ms` => if no job is found, wait the indicated time before trying again; at the end of the cooldown, if no job is found, shut down
>
> The default config would be `{ eagerLoad: 0, keepAlive: 'default' }`.
>
> To have even more fine-grained control, we could have an array of those config objects to handle multiple core "layouts" (which could be modified at runtime if need be).
>
> We could split the work into two PRs (we'll handle the config array in the second). If you're willing to do it, that would be great 👍 😊.

I'm tracking with the behavior for `keepAlive`, and I'll get started on that right away.

For `eagerLoad`, part of me thinks the eager loading should just be a side effect of having `keepAlive: 'always'`. I'm not averse to doing the work; I just think it's good to keep things simple when possible. I have a hard time imagining a scenario where you would want workers to eagerly load but not stay permanently on.

If I'm thinking about it right, the processes don't really take up much resource-wise when they aren't doing anything. And if you properly guard imports for the workers with `Cluster.isMaster` like you recommend, there's really no downside to eagerly starting them, so it would be pretty in line with `keepAlive: 'always'`: you're already committing to having them always on.
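
The guard I mean is the usual pattern, sketched here with Node's native flag (since the package builds on the native `cluster` module); the module names are made up:

```js
import cluster from 'cluster'

if (cluster.isMaster) {
  // Heavy server-only imports that workers never need; eagerly
  // started workers skip these, so they stay cheap while idle.
  require('./server/publications')    // made-up module
  require('./server/rest-endpoints')  // made-up module
}
```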

Very open to your thoughts.

nathanschwarz commented 3 years ago

Yes, you're right, the keepAlive field will be enough 👍. Thanks 🙏