Closed jeremyowensboggs closed 2 years ago
Somehow you got a job into Faktory which has no retry attribute at all. Both PUSH and PUSHB add retry if it's not there, so I have no idea how this could happen.
Is there a way we can clear the job out?
Yes. If you start Redis by pointing it to the datafile, you can fire up redis-cli. One of the entries in the working
zset is the bad job. You'll want to ZREM the entry which does not have a retry attribute.
redis-server faktory-redis.conf --path /path/to/faktory/db
Here's the faktory-redis.conf:
Make sure you remove all entries without retry.
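As a rough illustration of the cleanup described above, the following Python sketch picks out the zset members whose job payload has no usable retry value. It is a sketch under assumptions, not Faktory's own tooling: the helper name `entries_missing_retry` is hypothetical, and the entry layout (job fields either at the top level or nested under a "job" key) is assumed rather than taken from this thread.

```python
import json


def entries_missing_retry(members):
    """Return the raw zset members whose job has a missing or null retry.

    Assumption: each member is a JSON document, with the job fields either
    at the top level or nested under a "job" key.
    """
    bad = []
    for raw in members:
        try:
            entry = json.loads(raw)
        except (TypeError, ValueError):
            # Not valid JSON: flag it so it can be inspected by hand.
            bad.append(raw)
            continue
        if not isinstance(entry, dict):
            bad.append(raw)
            continue
        # Fall back to the entry itself when there is no nested "job" key.
        job = entry.get("job", entry)
        # A missing key and an explicit null are both reported;
        # numeric values such as -1 are left alone.
        if not isinstance(job, dict) or job.get("retry") is None:
            bad.append(raw)
    return bad
```

With a Redis client such as redis-py, the members could be read with `zrange("working", 0, -1)` and each flagged member removed with `zrem("working", member)`. The key name "working" follows the description above and should be checked against the actual datafile before removing anything.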
Not seeing any with no retry, but we have a few with a retry of -1. Would a retry of -1 cause this?
Never mind, there is one with a null retry: "retry":null
The job with the null retry is created as the result of an on_success batch callback. However, it has run every 20 minutes since Monday of last week and has succeeded quite a few times in the past week without this problem occurring.
Yes, that's a bug that has been fixed but not released. I will release 1.6.2 this week. In the meantime, try to explicitly set "retry" if possible in your client code where you define the callback.
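One way to apply that workaround is to normalize the payloads in client code before they are sent. The sketch below is illustrative only: the function names are hypothetical, the default of 25 retries is an assumption about Faktory's default, and the "success" and "complete" keys are an assumed shape for a batch definition, so check all three against your client library.

```python
DEFAULT_RETRY = 25  # assumption: stand-in for the server-side default


def with_explicit_retry(job, default=DEFAULT_RETRY):
    """Return a copy of a job payload whose "retry" is never missing or null."""
    fixed = dict(job)
    if fixed.get("retry") is None:
        fixed["retry"] = default
    return fixed


def batch_with_explicit_retry(batch, default=DEFAULT_RETRY):
    """Apply with_explicit_retry to the callback jobs of a batch payload.

    Assumption: callbacks are job payloads stored under "success" and
    "complete"; keys that are absent are left absent.
    """
    fixed = dict(batch)
    for key in ("success", "complete"):
        if isinstance(fixed.get(key), dict):
            fixed[key] = with_explicit_retry(fixed[key], default)
    return fixed
```

An explicit value such as -1 is left untouched; only a missing or null retry is replaced.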
Faktory Enterprise 1.6.1 linux/amd64 © 2022 Contributed Systems LLC.
I 2022-10-03T14:41:59.701Z Licensed to Pepsico, max 100 connections
I 2022-10-03T14:41:59.701Z Initializing redis storage at /var/lib/faktory/db, socket /var/lib/faktory/db/redis.sock
I 2022-10-03T14:41:59.714Z Web server now listening at :7420
I 2022-10-03T14:41:59.715Z Sending statsd metrics to 10.7.200.90:8125 with namespace simple-machine
I 2022-10-03T14:41:59.717Z PID 1 listening at :7419, press Ctrl-C to stop
I 2022-10-03T14:42:00.715Z Dead processed 2 jobs
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x6c6d71]
goroutine 18 [running]:
github.com/contribsys/faktory/manager.(*manager).processFailure(0xc00013e0c0, {0xc00002e228, 0x18}, 0xb388e0)
/Users/mperham/src/github.com/contribsys/faktory/manager/retry.go:120 +0x251
github.com/contribsys/faktory/manager.(*manager).ReapExpiredJobs.func1({0xc0005a6300, 0x2f3, 0x300})
/Users/mperham/src/github.com/contribsys/faktory/manager/working.go:222 +0x3cf
github.com/contribsys/faktory/storage.(*redisSorted).RemoveBefore(0xc000116150, {0xc000266080?, 0xb405c0?}, 0xa, 0xc0005a2060)
/Users/mperham/src/github.com/contribsys/faktory/storage/sorted_redis.go:288 +0x31a
github.com/contribsys/faktory/manager.(*manager).ReapExpiredJobs(0xc00013e0c0, {0x0?, 0xc00032ce08?, 0xb405c0?})
/Users/mperham/src/github.com/contribsys/faktory/manager/working.go:193 +0x135
github.com/contribsys/faktory/server.(*reservationReaper).Execute(0xc000117d88)
/Users/mperham/src/github.com/contribsys/faktory/server/tasks.go:20 +0x45
github.com/contribsys/faktory/server.(*taskRunner).cycle(0xc0001183c0)
/Users/mperham/src/github.com/contribsys/faktory/server/task_runner.go:99 +0x1e5
github.com/contribsys/faktory/server.(*taskRunner).Run.func1()
/Users/mperham/src/github.com/contribsys/faktory/server/task_runner.go:65 +0xa7
created by github.com/contribsys/faktory/server.(*taskRunner).Run
/Users/mperham/src/github.com/contribsys/faktory/server/task_runner.go:58 +0x72
Are you using an old version? Yes
Have you checked the changelogs to see if your issue has been fixed in a later version? N/A
https://github.com/contribsys/faktory/blob/master/Changes.md
https://github.com/contribsys/faktory/blob/master/Pro-Changes.md
https://github.com/contribsys/faktory/blob/master/Ent-Changes.md