Closed fannyfan414 closed 1 year ago
Maybe problem here (vendor/laravel/horizon/src/RedisQueue.php)
migrateExpiredJobs with redis return empty array, but dragonfly return false
public function migrateExpiredJobs($from, $to)
{
return $this->getConnection()->eval(
LuaScripts::migrateExpiredJobs(), 3, $from, $to, $to.':notify', $this->currentTime(), $this->migrationBatchSize
);
}
migrateExpiredJobs
method fron LuaScripts
:
public static function migrateExpiredJobs()
{
return <<<'LUA'
-- Get all of the jobs with an expired "score"...
local val = redis.call('zrangebyscore', KEYS[1], '-inf', ARGV[1], 'limit', 0, ARGV[2])
-- If we have values in the array, we will remove them from the first queue
-- and add them onto the destination queue in chunks of 100, which moves
-- all of the appropriate jobs onto the destination queue very safely.
if(next(val) ~= nil) then
redis.call('zremrangebyrank', KEYS[1], 0, #val - 1)
for i = 1, #val, 100 do
redis.call('rpush', KEYS[2], unpack(val, i, math.min(i+99, #val)))
-- Push a notification for every job that was migrated...
for j = i, math.min(i+99, #val) do
redis.call('rpush', KEYS[3], 1)
end
end
end
return val
LUA;
}
Сonfirm that the error is in the return type, but perhaps somewhere else...
Redis returns an empty array
, dragonfly returns false
. I added a condition locally in the php core of the library, if the value is false, then change it to array and it seems to work ..
Thank you for debugging this!
@dranikpg , please take a look when you have time.
@romange @dranikpg thanks for fixing!
Try to test image ghcr.io/dragonflydb/dragonfly:alpha-ubuntu
and found new bug.
How to reproduce:
Using docker-compose.yml from first message try to run docker run --rm --net=app_test_network ghcr.io/six-ddc/plow http://app_laravel:8000/api/test -c 10 -d 100s
After 2-10 secs everything freezes and dragonfly crashed. Logs from 2 tests dragonfly-crash-logs-2.txt dragonfly-crash-logs-1.txt
While debugging, some note: Sometimes dragonfly is not crashed, but all workers from app is freezed for a long time (over a minute) and then return error
2023-01-23 17:30:46 [2023-01-23 14:30:45] production.ERROR: read error on connection to redis_master:6379 {"exception":"[object] (RedisException(code: 0): read error on connection to redis_master:6379 at /var/www/html/vendor/laravel/framework/src/Illuminate/Redis/Connections/Connection.php:116)
2023-01-23 17:30:46 [stacktrace]
2023-01-23 17:30:46 #0 /var/www/html/vendor/laravel/framework/src/Illuminate/Redis/Connections/Connection.php(116): Redis->lLen('queues:slow')
2023-01-23 17:30:46 #1 /var/www/html/vendor/laravel/framework/src/Illuminate/Redis/Connections/PhpRedisConnection.php(532): Illuminate\\Redis\\Connections\\Connection->command('llen', Array)
2023-01-23 17:30:46 #2 /var/www/html/vendor/laravel/framework/src/Illuminate/Redis/Connections/Connection.php(216): Illuminate\\Redis\\Connections\\PhpRedisConnection->command('llen', Array)
2023-01-23 17:30:46 #3 /var/www/html/vendor/laravel/framework/src/Illuminate/Redis/Connections/PhpRedisConnection.php(565): Illuminate\\Redis\\Connections\\Connection->__call('llen', Array)
2023-01-23 17:30:46 #4 /var/www/html/vendor/laravel/horizon/src/RedisQueue.php(32): Illuminate\\Redis\\Connections\\PhpRedisConnection->__call('llen', Array)
2023-01-23 17:30:46 #5 /var/www/html/vendor/laravel/horizon/src/AutoScaler.php(80): Laravel\\Horizon\\RedisQueue->readyNow('slow')
2023-01-23 17:30:46 #6 /var/www/html/vendor/laravel/framework/src/Illuminate/Collections/Collection.php(790): Laravel\\Horizon\\AutoScaler->Laravel\\Horizon\\{closure}(Object(Laravel\\Horizon\\ProcessPool), 'slow')
2023-01-23 17:30:46 #7 /var/www/html/vendor/laravel/horizon/src/AutoScaler.php(79): Illuminate\\Support\\Collection->mapWithKeys(Object(Closure))
2023-01-23 17:30:46 #8 /var/www/html/vendor/laravel/horizon/src/AutoScaler.php(49): Laravel\\Horizon\\AutoScaler->timeToClearPerQueue(Object(Laravel\\Horizon\\Supervisor), Object(Illuminate\\Support\\Collection))
2023-01-23 17:30:46 #9 /var/www/html/vendor/laravel/horizon/src/Supervisor.php(349): Laravel\\Horizon\\AutoScaler->scale(Object(Laravel\\Horizon\\Supervisor))
2023-01-23 17:30:46 #10 /var/www/html/vendor/laravel/horizon/src/Supervisor.php(296): Laravel\\Horizon\\Supervisor->autoScale()
2023-01-23 17:30:46 #11 /var/www/html/vendor/laravel/horizon/src/Supervisor.php(260): Laravel\\Horizon\\Supervisor->loop()
2023-01-23 17:30:46 #12 /var/www/html/vendor/laravel/horizon/src/Console/SupervisorCommand.php(100): Laravel\\Horizon\\Supervisor->monitor()
2023-01-23 17:30:46 #13 /var/www/html/vendor/laravel/horizon/src/Console/SupervisorCommand.php(75): Laravel\\Horizon\\Console\\SupervisorCommand->start(Object(Laravel\\Horizon\\Supervisor))
2023-01-23 17:30:46 #14 /var/www/html/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(36): Laravel\\Horizon\\Console\\SupervisorCommand->handle(Object(Laravel\\Horizon\\SupervisorFactory))
2023-01-23 17:30:46 #15 /var/www/html/vendor/laravel/framework/src/Illuminate/Container/Util.php(41): Illuminate\\Container\\BoundMethod::Illuminate\\Container\\{closure}()
2023-01-23 17:30:46 #16 /var/www/html/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(93): Illuminate\\Container\\Util::unwrapIfClosure(Object(Closure))
2023-01-23 17:30:46 #17 /var/www/html/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(35): Illuminate\\Container\\BoundMethod::callBoundMethod(Object(Illuminate\\Foundation\\Application), Array, Object(Closure))
2023-01-23 17:30:46 #18 /var/www/html/vendor/laravel/framework/src/Illuminate/Container/Container.php(663): Illuminate\\Container\\BoundMethod::call(Object(Illuminate\\Foundation\\Application), Array, Array, NULL)
2023-01-23 17:30:46 #19 /var/www/html/vendor/laravel/framework/src/Illuminate/Console/Command.php(182): Illuminate\\Container\\Container->call(Array)
2023-01-23 17:30:46 #20 /var/www/html/vendor/symfony/console/Command/Command.php(312): Illuminate\\Console\\Command->execute(Object(Symfony\\Component\\Console\\Input\\ArgvInput), Object(Illuminate\\Console\\OutputStyle))
2023-01-23 17:30:46 #21 /var/www/html/vendor/laravel/framework/src/Illuminate/Console/Command.php(151): Symfony\\Component\\Console\\Command\\Command->run(Object(Symfony\\Component\\Console\\Input\\ArgvInput), Object(Illuminate\\Console\\OutputStyle))
2023-01-23 17:30:46 #22 /var/www/html/vendor/symfony/console/Application.php(1022): Illuminate\\Console\\Command->run(Object(Symfony\\Component\\Console\\Input\\ArgvInput), Object(Symfony\\Component\\Console\\Output\\ConsoleOutput))
2023-01-23 17:30:46 #23 /var/www/html/vendor/symfony/console/Application.php(314): Symfony\\Component\\Console\\Application->doRunCommand(Object(Laravel\\Horizon\\Console\\SupervisorCommand), Object(Symfony\\Component\\Console\\Input\\ArgvInput), Object(Symfony\\Component\\Console\\Output\\ConsoleOutput))
2023-01-23 17:30:46 #24 /var/www/html/vendor/symfony/console/Application.php(168): Symfony\\Component\\Console\\Application->doRun(Object(Symfony\\Component\\Console\\Input\\ArgvInput), Object(Symfony\\Component\\Console\\Output\\ConsoleOutput))
2023-01-23 17:30:46 #25 /var/www/html/vendor/laravel/framework/src/Illuminate/Console/Application.php(102): Symfony\\Component\\Console\\Application->run(Object(Symfony\\Component\\Console\\Input\\ArgvInput), Object(Symfony\\Component\\Console\\Output\\ConsoleOutput))
2023-01-23 17:30:46 #26 /var/www/html/vendor/laravel/framework/src/Illuminate/Foundation/Console/Kernel.php(155): Illuminate\\Console\\Application->run(Object(Symfony\\Component\\Console\\Input\\ArgvInput), Object(Symfony\\Component\\Console\\Output\\ConsoleOutput))
2023-01-23 17:30:46 #27 /var/www/html/artisan(35): Illuminate\\Foundation\\Console\\Kernel->handle(Object(Symfony\\Component\\Console\\Input\\ArgvInput), Object(Symfony\\Component\\Console\\Output\\ConsoleOutput))
2023-01-23 17:30:46 #28 {main}
2023-01-23 17:30:46 "}
Debug point where app got deadlock https://github.com/laravel/framework/blob/9.x/src/Illuminate/Queue/RedisQueue.php#L153
2023-01-23 18:38:23 [2023-01-23 15:38:22] production.INFO: push
2023-01-23 18:38:23 [2023-01-23 15:38:22] production.INFO: createPayloadArray
2023-01-23 18:38:23 [2023-01-23 15:38:22] production.INFO: enqueueUsing
2023-01-23 18:38:23 [2023-01-23 15:38:22] production.INFO: pushRaw
2023-01-23 18:38:23 [2023-01-23 15:38:22] production.INFO: pushRaw start
2023-01-23 18:38:23 [2023-01-23 15:38:22] production.INFO: queue: default
2023-01-23 18:38:23 [2023-01-23 15:38:22] production.INFO: payload: {"uuid":"8ce3e32e-562d-4700-8f6e-bc93b6252246","displayName":"App\\Jobs\\TestJob","job":"Illuminate\\Queue\\CallQueuedHandler@call","maxTries":null,"maxExceptions":null,"failOnTimeout":false,"backoff":null,"timeout":null,"retryUntil":null,"data":{"commandName":"App\\Jobs\\TestJob","command":"O:16:\"App\\Jobs\\TestJob\":2:{s:9:\"cacheUuid\";s:36:\"aaf538de-a7c3-46c2-90d4-5f1ead579f67\";s:5:\"queue\";s:7:\"default\";}"},"id":"8ce3e32e-562d-4700-8f6e-bc93b6252246","attempts":0,"type":"job","tags":[],"pushedAt":"1674488302.7366"}
30 second delay and then got another output
2023-01-23 18:38:54 [2023-01-23 15:38:53] production.INFO: push
2023-01-23 18:38:54 [2023-01-23 15:38:53] production.INFO: createPayloadArray
2023-01-23 18:38:54 [2023-01-23 15:38:53] production.INFO: enqueueUsing
2023-01-23 18:38:54 [2023-01-23 15:38:53] production.INFO: pushRaw
2023-01-23 18:38:54 [2023-01-23 15:38:53] production.INFO: pushRaw start
2023-01-23 18:38:54 [2023-01-23 15:38:53] production.INFO: queue: default
2023-01-23 18:38:54 [2023-01-23 15:38:53] production.INFO: {"uuid":"d952eb75-5b82-42dd-b033-6730e2b8297c","displayName":"App\\Jobs\\TestJob","job":"Illuminate\\Queue\\CallQueuedHandler@call","maxTries":null,"maxExceptions":null,"failOnTimeout":false,"backoff":null,"timeout":null,"retryUntil":null,"data":{"commandName":"App\\Jobs\\TestJob","command":"O:16:\"App\\Jobs\\TestJob\":2:{s:9:\"cacheUuid\";s:36:\"d84bfeca-dbb4-4b23-8bce-7cc989198b2d\";s:5:\"queue\";s:7:\"default\";}"},"id":"d952eb75-5b82-42dd-b033-6730e2b8297c","attempts":0,"type":"job","tags":[],"pushedAt":"1674488333.7774"}
Below a simplified docker compose to run against dev dragonfly on host (outside of docker compose):
with docker compose -f ./docker-compose-lavarel.yml up
on my machine it crashes after a few seconds:
F20230123 18:01:10.395542 1857614 transaction.cc:1113] Check failed: trans == this
*** Check failure stack trace: ***
@ 0x55a94ed8ab3d google::LogMessage::Fail()
@ 0x55a94ed8aa84 google::LogMessage::SendToLog()
@ 0x55a94ed8a27a google::LogMessage::Flush()
@ 0x55a94ed8df5c google::LogMessageFatal::~LogMessageFatal()
@ 0x55a94e7fb7ac dfly::Transaction::UnlockMultiShardCb()
version: '3.8'
networks:
app_test_network:
name: app_test_network
services:
app_laravel:
build:
context: src
dockerfile: Dockerfile
restart: always
image: fannyfan/laravel-redis-dragonfly:app_v1
command: ["./init-octane.sh"]
ports:
- 8000:8000
environment:
- HORIZON_FAST_WORKERS_COUNT=5
- HORIZON_SLOW_WORKERS_COUNT=5
- OCTANE_WORKERS_COUNT=5
- OCTANE_TASK_WORKERS_COUNT=1
- OCTANE_PORT=8000
- OCTANE_HOST=app
- REDIS_PASSWORD=
- REDIS_HOST=host.docker.internal
- REDIS_PORT=6379
networks:
- app_test_network
extra_hosts:
- 'host.docker.internal:host-gateway'
deploy:
mode: replicated
replicas: 1
placement:
max_replicas_per_node: 1
app_laravel_horizon:
image: fannyfan/laravel-redis-dragonfly:app_v1
command: ["./init-horizon.sh"]
restart: always
environment:
- HORIZON_FAST_WORKERS_COUNT=5
- HORIZON_SLOW_WORKERS_COUNT=5
- OCTANE_WORKERS_COUNT=5
- OCTANE_TASK_WORKERS_COUNT=1
- OCTANE_PORT=8000
- OCTANE_HOST=app
- REDIS_HOST=host.docker.internal
- REDIS_PASSWORD=
- REDIS_PORT=6379
networks:
- app_test_network
extra_hosts:
- 'host.docker.internal:host-gateway'
deploy:
mode: replicated
replicas: 2
placement:
max_replicas_per_node: 1
ok, I succeed to reproduce it in a unit test. below is the log of EvalBug713
run (branch Bug713).
@dranikpg feel free to continue investigating tomorrow.
I20230123 20:59:37.582479 1876070 proactor_pool.cc:66] Running 4 io threads
I20230123 20:59:37.588579 1876070 server_family.cc:378] Data directory is "/home/roman/project/dragonfly/build-dbg"
I20230123 20:59:37.588810 1876070 test_utils.cc:159] Starting EvalBug713
I20230123 20:59:37.589547 1876071 transaction.cc:247] InitByArgs EVAL@0/3 (1072) laravel_database_queues:slow
I20230123 20:59:37.590436 1876072 transaction.cc:907] Lock granted 1 for trans EVAL@1/3 (1072)
I20230123 20:59:37.590440 1876073 transaction.cc:907] Lock granted 1 for trans EVAL@1/3 (1072)
I20230123 20:59:37.590500 1876072 transaction.cc:938] Insert into tx-queue, sid(1) EVAL@1/3 (1072), qlen 1
I20230123 20:59:37.590507 1876073 transaction.cc:938] Insert into tx-queue, sid(2) EVAL@1/3 (1072), qlen 1
I20230123 20:59:37.590456 1876071 transaction.cc:907] Lock granted 1 for trans EVAL@1/3 (1072)
I20230123 20:59:37.590549 1876071 transaction.cc:938] Insert into tx-queue, sid(0) EVAL@1/3 (1072), qlen 1
I20230123 20:59:37.590631 1876071 transaction.cc:484] Scheduled EVAL@1/3 (1072) OutOfOrder: 0 num_shards: 3
I20230123 20:59:37.590915 1876071 transaction.cc:247] InitByArgs ZRANGEBYSCORE@1/1 (1072) laravel_database_queues:slow:reserved
I20230123 20:59:37.591018 1876071 transaction.cc:689] ExecuteAsync ZRANGEBYSCORE@1/1 (1072)
I20230123 20:59:37.591082 1876071 transaction.cc:604] ScheduleSingleHop before Wait ZRANGEBYSCORE@1/1 (1072) 1
I20230123 20:59:37.591130 1876072 transaction.cc:734] PollExecCb ZRANGEBYSCORE@1/1 (1072) sid(1) 1, should_poll: 1
I20230123 20:59:37.591183 1876072 transaction.cc:315] RunInShard: ZRANGEBYSCORE@1/1 (1072) sid:1
I20230123 20:59:37.591243 1876072 transaction.cc:749] ptr_release ZRANGEBYSCORE@1/1 (1072) 0
I20230123 20:59:37.591279 1876071 transaction.cc:606] ScheduleSingleHop after Wait ZRANGEBYSCORE@1/1 (1072)
I20230123 20:59:37.591548 1876071 transaction.cc:615] UnlockMulti ZRANGEBYSCORE@1/1 (1072)
I20230123 20:59:37.591701 1876073 transaction.cc:1108] unlockmulti: TxPopFront ZRANGEBYSCORE@1/1 (1072)
I20230123 20:59:37.591701 1876071 transaction.cc:1108] unlockmulti: TxPopFront ZRANGEBYSCORE@1/1 (1072)
I20230123 20:59:37.591804 1876071 transaction.cc:644] UnlockMultiEnd ZRANGEBYSCORE@1/1 (1072)
I20230123 20:59:37.592043 1876071 transaction.cc:58] Transaction ZRANGEBYSCORE@1/1) destroyed
I20230123 20:59:37.592589 1876072 transaction.cc:247] InitByArgs EVAL@0/3 (29) laravel_database_queues:slow
I20230123 20:59:37.592690 1876071 transaction.cc:247] InitByArgs EVAL@0/2 (1072) laravel_database_queues:default:delayed
I20230123 20:59:37.592923 1876073 transaction.cc:907] Lock granted 1 for trans EVAL@2/2 (1072)
I20230123 20:59:37.592962 1876073 transaction.cc:938] Insert into tx-queue, sid(2) EVAL@2/2 (1072), qlen 1
I20230123 20:59:37.592927 1876071 transaction.cc:907] Lock granted 1 for trans EVAL@2/2 (1072)
I20230123 20:59:37.592972 1876072 transaction.cc:907] Lock granted 1 for trans EVAL@3/3 (29)
I20230123 20:59:37.592999 1876073 transaction.cc:907] Lock granted 1 for trans EVAL@3/3 (29)
I20230123 20:59:37.593005 1876071 transaction.cc:938] Insert into tx-queue, sid(0) EVAL@2/2 (1072), qlen 1
I20230123 20:59:37.593022 1876073 transaction.cc:938] Insert into tx-queue, sid(2) EVAL@3/3 (29), qlen 2
I20230123 20:59:37.593008 1876072 transaction.cc:938] Insert into tx-queue, sid(1) EVAL@3/3 (29), qlen 1
I20230123 20:59:37.593102 1876071 transaction.cc:907] Lock granted 1 for trans EVAL@3/3 (29)
I20230123 20:59:37.593148 1876071 transaction.cc:938] Insert into tx-queue, sid(0) EVAL@3/3 (29), qlen 2
I20230123 20:59:37.593206 1876072 transaction.cc:484] Scheduled EVAL@3/3 (29) OutOfOrder: 0 num_shards: 3
I20230123 20:59:37.593219 1876071 transaction.cc:484] Scheduled EVAL@2/2 (1072) OutOfOrder: 0 num_shards: 2
I20230123 20:59:37.593312 1876072 transaction.cc:247] InitByArgs ZRANGEBYSCORE@3/1 (29) laravel_database_queues:slow:reserved
I20230123 20:59:37.593356 1876072 transaction.cc:689] ExecuteAsync ZRANGEBYSCORE@3/1 (29)
I20230123 20:59:37.593374 1876071 transaction.cc:247] InitByArgs ZRANGEBYSCORE@2/1 (1072) laravel_database_queues:default:delayed
I20230123 20:59:37.593426 1876072 transaction.cc:604] ScheduleSingleHop before Wait ZRANGEBYSCORE@3/1 (29) 1
I20230123 20:59:37.593447 1876071 transaction.cc:689] ExecuteAsync ZRANGEBYSCORE@2/1 (1072)
I20230123 20:59:37.593459 1876072 transaction.cc:734] PollExecCb ZRANGEBYSCORE@3/1 (29) sid(1) 1, should_poll: 1
I20230123 20:59:37.593495 1876072 transaction.cc:315] RunInShard: ZRANGEBYSCORE@3/1 (29) sid:1
I20230123 20:59:37.593499 1876071 transaction.cc:604] ScheduleSingleHop before Wait ZRANGEBYSCORE@2/1 (1072) 1
I20230123 20:59:37.593534 1876072 transaction.cc:749] ptr_release ZRANGEBYSCORE@3/1 (29) 0
I20230123 20:59:37.593561 1876072 transaction.cc:606] ScheduleSingleHop after Wait ZRANGEBYSCORE@3/1 (29)
I20230123 20:59:37.593551 1876071 transaction.cc:734] PollExecCb ZRANGEBYSCORE@2/1 (1072) sid(0) 1, should_poll: 1
I20230123 20:59:37.593591 1876071 transaction.cc:315] RunInShard: ZRANGEBYSCORE@2/1 (1072) sid:0
I20230123 20:59:37.593626 1876072 transaction.cc:615] UnlockMulti ZRANGEBYSCORE@3/1 (29)
I20230123 20:59:37.593662 1876071 transaction.cc:749] ptr_release ZRANGEBYSCORE@2/1 (1072) 0
I20230123 20:59:37.593716 1876073 transaction.cc:1108] unlockmulti: TxPopFront ZRANGEBYSCORE@3/1 (29)
I20230123 20:59:37.593726 1876071 transaction.cc:1108] unlockmulti: TxPopFront ZRANGEBYSCORE@3/1 (29)
F20230123 20:59:37.593748 1876073 transaction.cc:1113] Check failed: trans == this Front: ZRANGEBYSCORE@2/1 (1072)
@fannyfan414
I've run plow on the latest changes and it seems to fully pass now. Thanks a lot for your insights
@dranikpg thanks! Is this fix in 'ghcr.io/dragonflydb/dragonfly:alpha-ubuntu'?
Not yet!
On Tue, Jan 24, 2023 at 2:52 PM fannyfan414 @.***> wrote:
@fannyfan414 https://github.com/fannyfan414 thanks! Is this fix in 'ghcr.io/dragonflydb/dragonfly:alpha-ubuntu'?
— Reply to this email directly, view it on GitHub https://github.com/dragonflydb/dragonfly/issues/713#issuecomment-1401897683, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA4BFCCB7DNK7WGPMX3CEPDWT7F7NANCNFSM6AAAAAAUBUIEAI . You are receiving this because you modified the open/close state.Message ID: @.***>
Describe the bug Hey! I created a simple application with laravel horizon and try to use it with dragonflydb as queue-driver. Currently it's working with redis 6.2.6 or keydb 6.3.1.
I have 2 containers -
app_laravel
andapp_laravel_horizon
. First is simple api, with GET route/api/test
that pushTestJob
job to queue and thenapp_laravel_horizon
workers process this job.To Reproduce
docker network create app_test_network
app_laravel
,app_laravel_horizon
, andredis_master
with passwordredistestpassword
docker compose -f ./docker-compose.yml up
Expected behavior
http://127.0.0.1:8000/horizon
and check that horizon status is Active.http://127.0.0.1:8000/api/test
or run commanddocker run --rm --net=app_test_network ghcr.io/six-ddc/plow http://app_laravel:8000/api/test -c 1 -d 1s
http://127.0.0.1:8000/horizon/jobs/completed
and see that jobs are completed.redis-cli -a redistestpassword
andzrange laravel_horizon:completed_jobs 0 -1
to see completed jobs.dragonfly
all jobs for some reasong hang up inpending
state and never started..Screenshots Using
redis
:Using
dragonflydb
:Environment (please complete the following information):