dbpolito closed this 1 month ago
👍🏻 this is nice, as this way we don't get bloated with too many workers, but we can still make our queues more granular for straight visibility into each of them.
What is the main difference between this and `'balance' => false`?
Well, `off` won't auto-scale, right? It will basically stick with the configured process count or max processes... This will scale based on the jobs, but each worker will handle all queues, so it will scale up to x processes, with each worker running queue=a,b,c.
Gotcha. I do wonder if we really need a new strategy for this, if `off` behaves similarly but just runs at maxProcesses. You have to build your server to handle your queue worker load at maxProcesses anyway, so why not just let it run at max all the time using the `off` strategy?
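For concreteness, the contrast being discussed might look like this in `config/horizon.php` (a sketch only; the supervisor and queue names are placeholders, and `single` is the strategy name as proposed in this PR):

```php
// Hypothetical supervisors illustrating the two strategies under discussion.
'supervisor-off' => [
    'connection' => 'redis',
    'queue' => ['a', 'b', 'c'],   // priority order
    'balance' => false,           // "off": no scaling, always runs at maxProcesses
    'maxProcesses' => 10,
],

'supervisor-single' => [
    'connection' => 'redis',
    'queue' => ['a', 'b', 'c'],   // same priority order in every worker
    'balance' => 'single',        // proposed: scale the pool with queue load
    'minProcesses' => 1,          // idle baseline
    'maxProcesses' => 10,         // ceiling under load
],
```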
Well, while this is true, there are several benefits to not spending memory running workers unnecessarily.
This also brings queue priority, since jobs are processed in the order the queues are listed.
I think the added complexity vs. functionality trade-off is worth it for most people.
I'm already using this strategy on my project, and switching to it alone reduced memory usage by 30% in my case... Of course this will vary depending on how your queues are designed.
Another thing that would help here, which was discussed in other issues, is starting workers at `minProcesses` instead of `maxProcesses`, to avoid an unnecessary memory peak.
Should this just be the default behavior of `balance => false`? I don't think the name `single` really conveys much.
Well, that is what I expected it to be, TBH... I agree, but isn't it a breaking change? It won't break anything, but it will change the behavior for everyone who uses it.
I'm totally OK with that, or with coming up with a different name...
I'm not sure it's a breaking change. We never actually documented the scaling behavior of `balance => false`, and it shouldn't "break" your application, as it's mainly an internal optimization for Horizon.
Well, that makes sense... I will try to update the PR later today, or tomorrow at the latest.
Thanks! Just mark as ready for review when you want me to take another look.
@taylorotwell updated 👍
Thanks
It might have been wise to rename the PR, because now the changelog on the tag mentions a new `single` balance strategy that doesn't exist:
I renamed the PR, but I don't think that will update the release.
I updated all references. Thanks
A bit late to the party here, but this seems to have broken my use of Horizon with `balance => false`.
I have a fairly large application which sometimes handles upwards of 200k jobs per hour on a single VPS. After updating Horizon to >= 5.26.0 it suddenly stopped processing jobs (or processed them extremely slowly). Downgrading to 5.25.0 brought everything back to normal.
This is my setup:
```php
'environments' => [
    'production' => [
        'supervisor-live' => [
            'connection' => 'redis-live-battles',
            'queue' => ['live'],
            'balance' => 'false',
            'minProcesses' => 1,
            'maxProcesses' => 5,
            'balanceMaxShift' => 15,
            'balanceCooldown' => 1,
            'tries' => 3,
            'timeout' => 80,
        ],
        'supervisor-default' => [
            'connection' => 'redis',
            'queue' => ['urgent', 'high', 'default', 'low'],
            'balance' => 'false',
            'minProcesses' => 1,
            'maxProcesses' => 50,
            'balanceMaxShift' => 15,
            'balanceCooldown' => 1,
            'tries' => 3,
            'timeout' => 80,
        ],
    ],
],
```
Unfortunately I am not familiar enough with the internals of Horizon to debug this myself, but I've narrowed it down to this PR.
This is what my horizon dashboard looks like after deploying >= 5.26.0. Basically jobs just keep piling up without being processed across the different queues.
@vilhelmjosander Well, I also have a fairly big project and haven't noticed any difference in performance, processing ~15k jobs as usual...
Did you check the job throughput? Does it differ between versions?
TBH I can't see a reason for a performance impact; the only behavioral difference is that it now scales... Before, it would always sit at 50 processes in your case, whereas now it scales from 1 to 50... but the way it processes jobs is exactly the same.
Looking at your screenshot, it seems it scaled properly and is using 50 processes...
I would look at the job throughput / runtime... maybe the jobs themselves are different.
Just to add another comment: this affected our Horizon workers because we had an old config file where only `processes` was specified, instead of `minProcesses` and `maxProcesses`. Here is our config:
```php
'supervisor-1' => [
    'connection' => 'redis',
    'queue' => ['critical'],
    'balance' => false,
    'processes' => 6,
    'tries' => 10,
    'sleep' => 2,
    'timeout' => 3500,
    'memory' => 1500,
    'maxJobs' => 1000,
],
```
Horizon treats `processes` as `maxProcesses`, and since we didn't set a `minProcesses` value, it defaulted to 1.
So basically this enabled scaling for all those workers, which previously didn't scale.
It didn't break anything for us; it just took us a couple of weeks to notice that our queues were processing jobs a bit slower.
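If that reading is correct, one fix under that assumption (untested sketch, based on the config quoted in this comment) would be to pin the pool size by setting `minProcesses` and `maxProcesses` explicitly:

```php
'supervisor-1' => [
    'connection' => 'redis',
    'queue' => ['critical'],
    'balance' => false,
    // 'processes' => 6 alone is now read as maxProcesses, with minProcesses
    // defaulting to 1. Setting both keys to 6 restores the fixed,
    // non-scaling pool of six workers.
    'minProcesses' => 6,
    'maxProcesses' => 6,
    'tries' => 10,
    'timeout' => 3500,
],
```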
Just jumping in here. I tried setting `'balance' => false` for my supervisors in my Horizon config, but I'm not completely sure what the benefit would be. Does this technically mean it processes the queues synchronously, each queue in order first? What are the benefits?
This introduces a new balance strategy: `single`.
It still scales similarly to `auto`, but instead of having one worker per queue, it keeps all queues in the same process, following the order specified, similar to what you would get using `queue:work --queue=a,b,c`.
TBH this was the behavior I expected when I started using Horizon a long time ago...
I'm using it in my project and thought it would be useful for others...
I can open a PR on the docs in case this gets accepted / merged.
Maybe the name isn't that clear; I'm totally open to changing it.