laravel / horizon

Dashboard and code-driven configuration for Laravel queues.
https://laravel.com/docs/horizon
MIT License

Jobs stuck in pending state when using multiple horizons #1287

Closed tklie closed 1 year ago

tklie commented 1 year ago

Horizon Version

5.16.1

Laravel Version

10.13.5

PHP Version

8.2

Redis Driver

Predis

Redis Version

4.0.9

Database Driver & Version

No response

Description

I am running multiple Laravel applications connected to the same Redis instance, each using its own Horizon. I am currently facing an issue where jobs randomly get stuck in the pending state (according to the dashboard) without ever being processed.

This only happens when the applications are configured to use the same Redis DB for their queues. As far as I can tell, this seems to be caused by the keys for the queues not being prefixed with the HORIZON_PREFIX. Thus, all Horizon instances are trying to migrate jobs from delayed or reserved to notify at the exact same time, seemingly causing race conditions / deadlocks - especially if some of the applications have queues with the same name.

The Laravel\Horizon\RedisQueue extends the Illuminate\Queue\Queue, but does not override the getQueue method. Overriding it would be, at first glance, an obvious way to fix this:

public function getQueue($queue)
{
    return config('horizon.prefix').'queues:'.($queue ?: $this->default);
}

Before submitting a PR, I wanted to ask if there is a specific reason this has not been done.
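
To make the collision concrete, here is a minimal standalone sketch (no Laravel required). The function names currentQueueKey and proposedQueueKey are hypothetical and exist only for this illustration. The first mirrors the 'queues:' key format seen in the Redis CLI output below, the second mirrors the override proposed above, with the prefix passed in explicitly instead of being read from config('horizon.prefix').

```php
<?php

// Key format as observed in the queue DB: no Horizon prefix is involved.
function currentQueueKey(?string $queue, string $default = 'default'): string
{
    return 'queues:' . ($queue ?: $default);
}

// Hypothetical variant matching the proposed override: the Horizon prefix
// is prepended, so each application gets its own key namespace.
function proposedQueueKey(string $horizonPrefix, ?string $queue, string $default = 'default'): string
{
    return $horizonPrefix . 'queues:' . ($queue ?: $default);
}

// Prefixes taken from the horizon.php files of application 1 and 2.
$prefixes = ['app_1_horizon:', 'app_2_horizon:'];

foreach ($prefixes as $prefix) {
    printf(
        "%-16s current=%s proposed=%s\n",
        $prefix,
        currentQueueKey('mail'),
        proposedQueueKey($prefix, 'mail')
    );
}
```

With the current format, both applications resolve the 'mail' queue to the same key (queues:mail), while the prefixed variant yields a distinct key per application. Whether prefixing alone would be enough to make shared databases safe is not something this sketch can show.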

Steps To Reproduce

  1. Configure the applications as shown in the Configuration section below
  2. Set env variables in all apps
    REDIS_HORIZON_META_DB=1
    REDIS_HORIZON_QUEUE_DB=2
  3. Start dispatching many jobs in the first application like so (incrementing the id each time; some of the jobs implement ShouldBeUniqueUntilProcessing; the delay is not required for the issue to occur, it also happens with non-delayed jobs)
    dispatch(new TestJob(id: 0))
    ->onConnection('horizon-redis')
    ->onQueue('mail')
    ->delay(now()->addSeconds(5));
  4. Inspect the pending jobs section of the Horizon dashboard and you will see jobs starting to pile up that never get processed.

Redis CLI

Inspecting Redis via the CLI shows that anything written to the Meta DB gets properly prefixed with the HORIZON_PREFIX, while anything in the Queue DB does not. I think this might be causing the issue; correct me if I'm wrong.

SELECT 0
KEYS *
1) "laravel_unique_job:App\\Jobs\\TestJob0"

SELECT 1
KEYS *
1) "app_1_horizon:pending_jobs"
2) "app_1_horizon:recent_jobs"
3) "app_2_horizon:pending_jobs"
4) "app_2_horizon:recent_jobs"
5) "app_1_horizon:a7b3d4c1-9cff-41a0-9a65-14f05d2ee450"
6) many more jobs...

SELECT 2
KEYS *
1) "queues:mail:delayed"

Temporary solution

Setting the queue connection of each application to a different database solves the issue:

# Application 1
REDIS_HORIZON_META_DB=1
REDIS_HORIZON_QUEUE_DB=2

# Application 2
REDIS_HORIZON_META_DB=1
REDIS_HORIZON_QUEUE_DB=3

# Application 3
REDIS_HORIZON_META_DB=1
REDIS_HORIZON_QUEUE_DB=4

# and so on ...
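
As a small aid for keeping this workaround intact across many applications, the following sketch flags queue DB numbers that are assigned to more than one application. It is plain PHP and not part of Horizon; the function name findSharedQueueDbs and the application names are hypothetical, and the values stand in for each application's REDIS_HORIZON_QUEUE_DB.

```php
<?php

/**
 * Returns the queue DB numbers that more than one application uses.
 *
 * @param array<string, int> $queueDbByApp application name => REDIS_HORIZON_QUEUE_DB
 * @return array<int, string[]> DB number => names of the applications sharing it
 */
function findSharedQueueDbs(array $queueDbByApp): array
{
    // Group application names by the DB number they point at.
    $appsByDb = [];
    foreach ($queueDbByApp as $app => $db) {
        $appsByDb[$db][] = $app;
    }

    // Keep only DBs used by two or more applications.
    return array_filter($appsByDb, fn (array $apps): bool => count($apps) > 1);
}

$shared = findSharedQueueDbs(['app-1' => 2, 'app-2' => 3, 'app-3' => 3]);

foreach ($shared as $db => $apps) {
    echo "Queue DB {$db} is shared by: " . implode(', ', $apps) . PHP_EOL;
}
// prints "Queue DB 3 is shared by: app-2, app-3"
```

Note that a stock Redis configuration exposes 16 databases (0 to 15), so this workaround only scales to a limited number of applications unless that setting is raised.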

Configuration

// database.php (shared across all applications)

return [
    // ... database configuration

    'redis' => [
        'cluster' => env('REDIS_CLUSTER', false),
        'client'  => env('REDIS_CLIENT', 'predis'),

        'default' => [
            'host'     => env('REDIS_HOST', 'localhost'),
            'password' => env('REDIS_PASSWORD', null),
            'port'     => env('REDIS_PORT', 6379),
            'database' => 0,
        ],

        'horizon-meta' => [
            'host'     => env('REDIS_HOST', '127.0.0.1'),
            'password' => env('REDIS_PASSWORD', null),
            'port'     => env('REDIS_PORT', 6379),
            'database' => env('REDIS_HORIZON_META_DB', 1),
        ],

        'horizon-queue' => [
            'host'     => env('REDIS_HOST', '127.0.0.1'),
            'password' => env('REDIS_PASSWORD', null),
            'port'     => env('REDIS_PORT', 6379),
            'database' => env('REDIS_HORIZON_QUEUE_DB', 2),
        ],
    ]
];
// queue.php (shared across all applications)

return [
    'default' => env('QUEUE_DRIVER', 'redis'),

    'connections' => [
        // ... non-redis configuration

        'redis' => [
            'driver'      => 'redis',
            'connection'  => 'default',
            'queue'       => 'default',
            'retry_after' => 90,
        ],

        'horizon-redis' => [
            'driver'      => 'redis',
            'connection'  => 'horizon-queue',
            'queue'       => 'default',
            'retry_after' => 90,
        ],
    ]
];
// horizon.php (application 1)

return [
    // ...

    'use' => 'horizon-meta',

    'prefix' => 'app_1_horizon:',

    // ...

    'defaults' => [
        'supervisor-1' => [
            'connection'          => 'horizon-redis',
            'queue'               => ['app-1-queue', 'mail'],
            // ...
        ],
    ],
];
// horizon.php (application 2)

return [
    // ...

    'use' => 'horizon-meta',

    'prefix' => 'app_2_horizon:',

    // ...

    'defaults' => [
        'supervisor-1' => [
            'connection'          => 'horizon-redis',
            'queue'               => ['app-2-queue', 'mail'],
            // ...
        ],
    ],
];

Same for the other 3 applications.

driesvints commented 1 year ago

This only happens when the applications are configured to use the same Redis DB for their queues.

You can't use the same Redis DB for different Horizon setups; this is not supported, sorry. Please see https://laravel.com/docs/10.x/horizon#configuration
