Closed dbpolito closed 6 years ago
Just deployed 1.2.2 to our production environment at work. Will report back what we find over the next few days.
@taylorotwell After 1.2.2, everything works perfectly. No more orphan jobs or old code.
@tomschlick you see anything? @marianvlad thanks!
@taylorotwell just checked our logs from last week and we saw two instances of orphans:
Sending TERM Signal To Process: 9750
Observed Orphan: 8781
Observed Orphan: 8862
Observed Orphan: 10459
Sending TERM Signal To Process: 10498
Observed Orphan: 8781
Observed Orphan: 8862
Observed Orphan: 12571
Weirdly enough, two of the orphaned processes show up with the same process IDs in both instances, even though the deploys took place 15 minutes apart. Horizon appeared to terminate them and restart correctly, so I'm not sure how that's possible 🤷♂️
Just a note: the horizon:purge command doesn't work as expected, so if you run it and get a single rogue process, ignore it; it's not an indicator. Only if you get multiple processes does it mean there are actual orphans.
So in this case, 8781 and 8862 are the actual orphans. However, it seems that even the purge command didn't kill them, which could mean they're really stuck on a long-running job and haven't reached the next loop iteration yet.
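To make the reasoning above concrete, here is a minimal Python sketch that picks out the PIDs reported as orphans in more than one purge run. It assumes each "Sending TERM Signal To Process" line marks the start of a new run, which is a guess based on the log excerpt above and not something Horizon documents. The function name `recurring_orphans` is hypothetical.

```python
import re
from collections import Counter

# Log excerpt copied from the comment above.
LOG = """\
Sending TERM Signal To Process: 9750
Observed Orphan: 8781
Observed Orphan: 8862
Observed Orphan: 10459
Sending TERM Signal To Process: 10498
Observed Orphan: 8781
Observed Orphan: 8862
Observed Orphan: 12571
"""


def recurring_orphans(log_text):
    """Return PIDs that were observed as orphans in more than one run.

    Assumption: a new run starts at each "Sending TERM Signal" line.
    """
    runs = []
    for line in log_text.splitlines():
        if line.startswith("Sending TERM Signal To Process:"):
            runs.append(set())
            continue
        match = re.match(r"Observed Orphan: (\d+)", line)
        if match:
            if not runs:
                # Orphan lines before any TERM line still belong to a run.
                runs.append(set())
            runs[-1].add(int(match.group(1)))
    # Count in how many distinct runs each PID appeared.
    counts = Counter(pid for run in runs for pid in run)
    return sorted(pid for pid, seen in counts.items() if seen > 1)


print(recurring_orphans(LOG))  # prints [8781, 8862]
```

PIDs that appear only once (10459 and 12571 here) match the "single rogue process" pattern described above and are filtered out.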
What's your timeout value?
Timeout value is 1800 for most of our workers.
I'm not able to reproduce this issue anymore, so it looks fixed to me ❤️
Since I created the ticket and it seems to be fixed, I'm closing this one. We can open new tickets and mention this one if necessary.
We are encountering strange issues. Sometimes one queue gets stuck and does not process any jobs. Even when supervisor is stopped, there are still horizon:work processes in the process list. horizon:purge does not help either, as it does not find any.
Here is our config; the problem is only with the default queue:
>>> config('horizon')
=> [
     "use" => "queue",
     "prefix" => "horizon:",
     "waits" => [
       "redis:default" => 300,
     ],
     "trim" => [
       "recent" => 60,
       "failed" => 10080,
     ],
     "environments" => [
       "production" => [
         "supervisor-1" => [
           "connection" => "redis",
           "queue" => [
             "default",
           ],
           "balance" => "simple",
           "processes" => 10,
           "tries" => 0,
         ],
         "supervisor-2" => [
           "connection" => "redis",
           "queue" => [
             "sms",
             "phone_data",
           ],
           "balance" => "auto",
           "processes" => 10,
           "tries" => 0,
         ],
       ],
       "local" => [
         "supervisor-1" => [
           "connection" => "redis",
           "queue" => [
             "default",
           ],
           "balance" => "simple",
           "processes" => 3,
           "tries" => 3,
         ],
       ],
     ],
   ]
I'm running Horizon on a recent Forge machine as documented: the daemon runs php artisan horizon, and on deployments I run php artisan horizon:terminate. But from time to time I need to manually run php artisan horizon:purge. This is the output I just got, hours after the last release:
I can confirm they are orphans by running htop in tree mode (press F5): I see these processes as root processes, not inside the master php artisan horizon process. Also, every time I run purge, it ALWAYS wrongly sees 1 process as an orphan:
I haven't found a pattern yet why this is happening.
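The htop check described above can be approximated in a script. The following Python sketch takes a process table as (pid, ppid, command) tuples, for example parsed from `ps -eo pid,ppid,args`, and flags horizon:work processes that have no master php artisan horizon process among their ancestors. The function names and the sample rows are hypothetical, and matching processes by command-line text is an assumption, not how Horizon itself tracks its workers.

```python
def is_master(command):
    """True for the master process, whose command line ends in 'artisan horizon'."""
    return command.strip().endswith("artisan horizon")


def find_orphaned_workers(processes):
    """Return sorted PIDs of horizon:work processes with no master ancestor.

    processes: iterable of (pid, ppid, command) tuples.
    """
    table = {pid: (ppid, command) for pid, ppid, command in processes}
    orphans = []
    for pid, (ppid, command) in table.items():
        if "horizon:work" not in command:
            continue
        # Walk up the parent chain looking for the master process.
        seen = set()
        current = ppid
        under_master = False
        while current in table and current not in seen:
            seen.add(current)
            parent_ppid, parent_command = table[current]
            if is_master(parent_command):
                under_master = True
                break
            current = parent_ppid
        if not under_master:
            orphans.append(pid)
    return sorted(orphans)


# Hypothetical process table: worker 120 sits under the master,
# worker 8781 has been re-parented to PID 1.
sample = [
    (1, 0, "/sbin/init"),
    (100, 1, "php artisan horizon"),
    (110, 100, "php artisan horizon:supervisor supervisor-1 redis"),
    (120, 110, "php artisan horizon:work redis --queue=default"),
    (8781, 1, "php artisan horizon:work redis --queue=default"),
]

print(find_orphaned_workers(sample))  # prints [8781]
```

Walking the full ancestor chain, not just the direct parent, keeps workers that sit under an intermediate supervisor process from being flagged.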