laravel / horizon

Dashboard and code-driven configuration for Laravel queues.
https://laravel.com/docs/horizon
MIT License

Supervisor and related queues disappear from Horizon and appear again shortly. The displayed time format is also unusual. #1108

Closed VeselinKichukov closed 2 years ago

VeselinKichukov commented 2 years ago

Description:

Hello, we recently noticed strange behavior in Horizon. We couldn't find anything about it online, but you may have seen it before. We are running Horizon on 2 servers on AWS S2. When our queues are loaded more heavily than usual, for example with 10,000 jobs, the supervisor responsible for the queue disappears from the dashboard for some time, and everything related to it, such as its queues, disappears as well. For example, if the queue "priority-40" in the image below has 10,000 jobs to run, everything inside the red rectangle disappears; after a minute or so it reappears for a few seconds, and the cycle repeats. It is almost as if the supervisor is terminated for a while and then restarted.

[Screenshot: Screen Shot 2021-12-23 at 3 27 20 PM]

Another oddity is the time format shown in the second image below.

[Screenshot: Screen Shot 2021-12-22 at 3 29 54 PM]

Steps To Reproduce:

Run 10,000 or 20,000 jobs with a configuration similar to the one below. These are our configuration files:

queue.php:

'redis' => [
    'driver' => 'redis',
    'connection' => 'default',
    'queue' => env('REDIS_QUEUE', 'default'),
    'retry_after' => 400,
    'block_for' => null,
],

horizon.php:

    /*
    |--------------------------------------------------------------------------
    | Horizon Domain
    |--------------------------------------------------------------------------
    |
    | This is the subdomain where Horizon will be accessible from. If this
    | setting is null, Horizon will reside under the same domain as the
    | application. Otherwise, this value will serve as the subdomain.
    |
    */

    'domain' => null,

    /*
    |--------------------------------------------------------------------------
    | Horizon Path
    |--------------------------------------------------------------------------
    |
    | This is the URI path where Horizon will be accessible from. Feel free
    | to change this path to anything you like. Note that the URI will not
    | affect the paths of its internal API that aren't exposed to users.
    |
    */

    'path' => 'horizon',

    /*
    |--------------------------------------------------------------------------
    | Horizon Redis Connection
    |--------------------------------------------------------------------------
    |
    | This is the name of the Redis connection where Horizon will store the
    | meta information required for it to function. It includes the list
    | of supervisors, failed jobs, job metrics, and other information.
    |
    */

    'use' => 'default',

    /*
    |--------------------------------------------------------------------------
    | Horizon Redis Prefix
    |--------------------------------------------------------------------------
    |
    | This prefix will be used when storing all Horizon data in Redis. You
    | may modify the prefix when you are running multiple installations
    | of Horizon on the same server so that they don't have problems.
    |
    */

    'prefix' => env(
        'HORIZON_PREFIX',
        Str::slug(env('APP_NAME', 'laravel'), '_') . '_horizon:'
    ),

    /*
    |--------------------------------------------------------------------------
    | Horizon Route Middleware
    |--------------------------------------------------------------------------
    |
    | These middleware will get attached onto each Horizon route, giving you
    | the chance to add your own middleware to this list or change any of
    | the existing middleware. Or, you can simply stick with this list.
    |
    */

    'middleware' => ['web', 'auth.basic'],

    /*
    |--------------------------------------------------------------------------
    | Queue Wait Time Thresholds
    |--------------------------------------------------------------------------
    |
    | This option allows you to configure when the LongWaitDetected event
    | will be fired. Every connection / queue combination may have its
    | own, unique threshold (in seconds) before this event is fired.
    |
    */

    'waits' => [
        'redis:' . env('QUEUE_PRIORITY_LIST', 'default') => 60,
    ],

    /*
    |--------------------------------------------------------------------------
    | Job Trimming Times
    |--------------------------------------------------------------------------
    |
    | Here you can configure for how long (in minutes) you desire Horizon to
    | persist the recent and failed jobs. Typically, recent jobs are kept
    | for one hour while all failed jobs are stored for an entire week.
    |
    */

    'trim' => [
        'recent' => 60,
        'pending' => 60,
        'completed' => 60,
        'recent_failed' => 10080,
        'failed' => 10080,
        'monitored' => 10080,
    ],

    /*
    |--------------------------------------------------------------------------
    | Metrics
    |--------------------------------------------------------------------------
    |
    | Here you can configure how many snapshots should be kept to display in
    | the metrics graph. This will get used in combination with Horizon's
    | `horizon:snapshot` schedule to define how long to retain metrics.
    |
    */

    'metrics' => [
        'trim_snapshots' => [
            'job' => 24,
            'queue' => 24,
        ],
    ],

    /*
    |--------------------------------------------------------------------------
    | Fast Termination
    |--------------------------------------------------------------------------
    |
    | When this option is enabled, Horizon's "terminate" command will not
    | wait on all of the workers to terminate unless the --wait option
    | is provided. Fast termination can shorten deployment delay by
    | allowing a new instance of Horizon to start while the last
    | instance will continue to terminate each of its workers.
    |
    */

    'fast_termination' => false,

    /*
    |--------------------------------------------------------------------------
    | Memory Limit (MB)
    |--------------------------------------------------------------------------
    |
    | This value describes the maximum amount of memory the Horizon master
    | supervisor may consume before it is terminated and restarted. For
    | configuring these limits on your workers, see the next section.
    |
    */

    'memory_limit' => 128,

    /*
    |--------------------------------------------------------------------------
    | Queue Worker Configuration
    |--------------------------------------------------------------------------
    |
    | Here you may define the queue worker settings used by your application
    | in all environments. These supervisors and settings handle all your
    | queued jobs and will be provisioned by Horizon during deployment.
    |
    */

    'defaults' => [
        'supervisor-high' => [
            'connection' => 'redis',
            'queue' => env('QUEUE_PRIORITY_LIST_HIGH', 'default'),
            'balance' => 'auto',
            'memory' => 128,
            'tries' => 3,
            'sleep' => 3,
            'timeout' => 300,
            'minProcesses' => 3,
            'maxProcesses' => 51,
            'balanceMaxShift' => 2,
            'balanceCooldown' => 1,
        ],
        'supervisor-normal' => [
            'connection' => 'redis',
            'queue' => env('QUEUE_PRIORITY_LIST_NORMAL', 'default'),
            'balance' => 'auto',
            'memory' => 128,
            'tries' => 3,
            'sleep' => 3,
            'timeout' => 300,
            'minProcesses' => 1,
            'maxProcesses' => 15,
            'balanceMaxShift' => 1,
            'balanceCooldown' => 1,
        ],
    ],

    'environments' => [
        'production' => [
            'supervisor-high' => [],
            'supervisor-normal' => [],
        ],

        'local' => [
            'supervisor-high' => [
                'maxProcesses' => 8,
                'tries' => 1,
            ],
            'supervisor-normal' => [
                'tries' => 1,
            ],
        ],
        'uat' => [
            'supervisor-high' => [
                'maxProcesses' => 8,
            ],
            'supervisor-normal' => [],
        ],
        'testing' => [
            'supervisor-high' => [],
            'supervisor-normal' => [],
        ],
        'acceptance' => [
            'supervisor-high' => [],
            'supervisor-normal' => [],
        ],
    ],
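
To make the configuration above easier to read, here is a minimal standalone sketch of how the `defaults` and `environments` sections combine into the settings each supervisor effectively runs with. It assumes a shallow merge in which environment values override the defaults. The helper `effectiveSupervisors` is hypothetical and not part of Horizon, and only a few keys are copied from the files above. The sketch also compares each worker `timeout` with `retry_after` from queue.php, since the Laravel queue documentation recommends keeping the timeout several seconds shorter than `retry_after`. It is a reading aid, not a diagnosis of the disappearing supervisors.

```php
<?php

// Value copied from queue.php above.
$retryAfter = 400;

// Subset of horizon.php 'defaults' (only the keys used in this sketch).
$defaults = [
    'supervisor-high' => ['timeout' => 300, 'tries' => 3, 'minProcesses' => 3, 'maxProcesses' => 51],
    'supervisor-normal' => ['timeout' => 300, 'tries' => 3, 'minProcesses' => 1, 'maxProcesses' => 15],
];

// Subset of horizon.php 'environments'.
$environments = [
    'production' => ['supervisor-high' => [], 'supervisor-normal' => []],
    'local' => [
        'supervisor-high' => ['maxProcesses' => 8, 'tries' => 1],
        'supervisor-normal' => ['tries' => 1],
    ],
];

// Hypothetical helper (not a Horizon API): shallow-merge the defaults with
// one environment's overrides; the environment values take precedence.
function effectiveSupervisors(array $defaults, array $overrides): array
{
    $result = [];
    foreach ($overrides as $name => $options) {
        $result[$name] = array_merge($defaults[$name] ?? [], $options);
    }

    return $result;
}

foreach (['production', 'local'] as $env) {
    foreach (effectiveSupervisors($defaults, $environments[$env]) as $name => $options) {
        // The worker timeout should stay below retry_after.
        $check = $options['timeout'] < $retryAfter ? 'below retry_after' : 'NOT below retry_after';
        printf(
            "%s/%s: maxProcesses=%d tries=%d timeout=%d (%s)\n",
            $env,
            $name,
            $options['maxProcesses'],
            $options['tries'],
            $options['timeout'],
            $check
        );
    }
}
```

Under these assumptions, the production supervisors run with the defaults unchanged, while the local `supervisor-high` is capped at 8 processes with a single try.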

driesvints commented 2 years ago

Does the issue persist if you update to the latest Horizon version and run php artisan horizon:publish?

driesvints commented 2 years ago

Closing this issue because it's inactive, already solved, old, or not relevant anymore. Feel free to open up a new issue if you're still experiencing this.

SohrabZ commented 9 months ago

@VeselinKichukov We are also experiencing the same issue. Have you found a solution to this?