patricker opened this issue 3 years ago
The safest approach would be to start a number of child processes from your root process. You can do this with the standard Process class. This is also the approach least likely to run into any strange errors.
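For illustration, that might look something like this ("Worker.dll" is a hypothetical stand-in for however you run a single listener):

using System.Diagnostics;

// Launch N copies of a worker executable; each child process runs
// its own TaskClient.Listen() loop, fully isolated from the others.
var children = new List<Process>();
for (int i = 0; i < 5; i++)
{
    var child = Process.Start(new ProcessStartInfo
    {
        FileName = "dotnet",
        Arguments = "Worker.dll",
        UseShellExecute = false
    });
    children.Add(child);
}

// Keep the root process alive until every child exits.
foreach (var child in children)
    child.WaitForExit();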
The simplest approach, and the one I recommend testing out before going all out, is just to wrap a bunch of TaskClient.Listen() calls in Task.Run() and await all of them. Something like this:
await Task.WhenAll(new [] {
    Task.Run(async () => {
        var taskClient = new TaskClient(TaskQueue.Redis(redisConnectionString));
        await taskClient.Listen();
    }),
    Task.Run(async () => {
        var taskClient = new TaskClient(TaskQueue.Redis(redisConnectionString));
        await taskClient.Listen();
    })
});
With the second approach, make sure you don't share a TaskClient between threads.
It's also similar to what we do in the tests, except that there the tasks are executed off the queue directly in multiple threads: https://github.com/brthor/Gofer.NET/blob/master/Gofer.NET.Tests/GivenATaskSchedulerInAnotherThread.cs#L51
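In rough terms that pattern looks like this (a sketch, assuming the TaskQueue.ExecuteNext() call shown in the README; a real worker loop would want some backoff when the queue is empty):

var workers = Enumerable.Range(0, 4).Select(_ => Task.Run(async () =>
{
    // Each worker gets its own queue instance and pulls tasks off it directly.
    var taskQueue = TaskQueue.Redis(redisConnectionString);
    while (true)
    {
        await taskQueue.ExecuteNext();
    }
})).ToArray();

// Like Listen(), these loops never return on their own.
await Task.WhenAll(workers);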
I was playing with the code and went in a completely different direction: I added an override to Listen so you can define how many threads to start. It partially works... sometimes all the threads run, but other times it acts like there's a lock somewhere.
Not a real PR, just a convenient place to see the diff: https://github.com/brthor/Gofer.NET/pull/51/files
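In rough terms the change does something like this (a simplified sketch, not the exact diff; ListenLoop is a stand-in for the internal listen loop):

public async Task Listen(int threadCount)
{
    // Start threadCount concurrent listen loops over the same client state.
    var listeners = Enumerable.Range(0, threadCount)
        .Select(_ => Task.Run(() => ListenLoop()))
        .ToArray();

    await Task.WhenAll(listeners);
}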
In testing your example, it looks like it's really the TaskQueue that is somehow the thread block. If I use your example but share a single TaskQueue across all the TaskClient instances, it won't run more than one job at a time. I have to give each TaskClient its own TaskQueue:
var tasks = new List<Task>();
for (int i = 0; i < 5; i++)
{
    tasks.Add(Task.Run(async () =>
    {
        var taskClient = new TaskClient(TaskQueue.Redis("10.1.1.1:6379"));
        await taskClient.Listen();
    }));
}
Task.WaitAll(tasks.ToArray());
The Redis connection held by the TaskQueue will likely be a bottleneck if it's shared.
Regarding your PR #51, having a user-defined number of task listeners in the TaskClient is interesting since it avoids the need to run multiple scheduler threads.
IIRC there are some concerns with the default number of threads available to the standard TaskScheduler (the thread pool).
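If that becomes an issue, one workaround (a sketch, not something the library does for you) is to put each long-lived listener on a dedicated thread instead of a pool thread:

var listener = Task.Factory.StartNew(() =>
{
    var taskClient = new TaskClient(TaskQueue.Redis(redisConnectionString));
    // LongRunning asks the scheduler for a dedicated thread, so a
    // listener that never returns doesn't tie up a shared pool thread.
    taskClient.Listen().GetAwaiter().GetResult();
}, TaskCreationOptions.LongRunning);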
I'm going to think on it some more.
Yeah, the PR is just some test code. Take your time to think about the right approach :) For now, your approach is working.
I've been using the library as a distributed job processor across more than a dozen VMs, but scaling has not been the best experience.
Right now I have a Linux VM that runs a dotnet process as a systemd service. That process represents a single executor. I templatized the service so I can spin up multiple copies and run 20 instances of the dotnet service per server. I can then replicate this across more servers as needed.
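For reference, the template unit is roughly this (names and paths changed, so treat it as a sketch), saved as something like job-worker@.service:

[Unit]
Description=Job worker instance %i
After=network-online.target

[Service]
ExecStart=/usr/bin/dotnet /opt/worker/Worker.dll
Restart=always

[Install]
WantedBy=multi-user.target

Spinning up the 20 copies on a server is then just systemctl enable --now job-worker@{1..20}.service.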
But I find that I don't do a good job of estimating how much CPU my jobs actually use, and adding/removing systemd services is a bit of a nuisance across more than a dozen VMs. I'd prefer to define the number of executors in code and run a single process with multiple job executors.
So a single VM, with a single dotnet service, that contains multiple job listeners in the process.
Any ideas on how to implement this? I was thinking maybe a thread pool, but I wasn't sure what your experience had been.