Hi all,
I'm currently on BullMQ Pro 7.6.2 with ioredis 5.4.1.
Issue
I have configured a queue with a leaky-bucket limiter (i.e. 1 job per X ms). However, I am running into a problem where:
- I have many jobs in the queue, and jobs are dequeued at the correct rate limit.
- A "Missing lock for job X" error occurs.
- After the error, the queue rate limit is no longer obeyed, and all remaining jobs in the queue are dequeued as fast as possible.
As far as I can tell, this happens whenever a job takes longer than the limiter duration (in my case, my jobs sometimes make an HTTP request to a slow REST API).
Example
- I have 10 jobs in my queue, and the worker limiter is set to 1 job per 500 ms.
- Job 2 makes an HTTP request that takes 1 second.
- A "Missing lock for job 2" error occurs.
- Jobs 3, 4, 5, ..., 10 are all immediately dequeued and processed.
Would this be a bug? It is correct for job 3 to be dequeued immediately (since job 2 took longer than the rate limit duration), but jobs 4, 5, ..., 10 should not be dequeued until their subsequent rate limit windows.
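For concreteness, here is a small model of the schedule I expected the leaky-bucket limiter to produce. This is only my reading of the documented behaviour, not BullMQ's actual implementation; `expectedStartTimes` is a made-up helper, and it assumes a single worker with concurrency 1:

```typescript
// Model of the expected leaky-bucket schedule: at most one job may start
// per `duration` ms window, and a job that runs past its window simply
// delays the next start. (My reading of the docs, not BullMQ's code.)
function expectedStartTimes(processingMs: number[], duration: number): number[] {
  const starts: number[] = []
  let nextAllowed = 0 // earliest start time permitted by the limiter
  let busyUntil = 0 // when the single worker becomes free again
  for (const p of processingMs) {
    const start = Math.max(nextAllowed, busyUntil)
    starts.push(start)
    nextAllowed = start + duration
    busyUntil = start + p
  }
  return starts
}

// 5 jobs, limiter of 1 job per 500 ms, job 2 takes 1000 ms:
console.log(expectedStartTimes([0, 0, 1000, 0, 0], 500))
// → [ 0, 500, 1000, 2000, 2500 ]
```

Note how job 3 starts at 2000 ms (immediately after the slow job 2 finishes, as expected), but job 4 still waits for the next window at 2500 ms, rather than starting right away.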
Minimal repro
import { WorkerPro, QueuePro } from '@taskforcesh/bullmq-pro'
import IORedis from 'ioredis'

interface JobData {
  idx: number
}

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms))

const connection = new IORedis({
  host: 'localhost',
  port: 6379,
  maxRetriesPerRequest: null,
})

const queue = new QueuePro<JobData>('{test-queue}', {
  connection,
})

new WorkerPro<JobData>(
  '{test-queue}',
  async (job) => {
    console.log(
      `[${new Date().toTimeString()}] Starting job ${job.data.idx}...`,
    )
    // Hypothesis: the problem occurs if at least one job has a processing
    // time greater than the limiter duration. This simulates a slow HTTP
    // request using setTimeout.
    if (job.data.idx === 2) {
      await sleep(4000)
    }
    console.log(
      `[${new Date().toTimeString()}] Finished job ${job.data.idx}...`,
    )
  },
  {
    connection,
    limiter: {
      max: 1,
      duration: 3000, // 3 seconds for easy verification
    },
  },
)

for (let jobIdx = 0; jobIdx < 10; jobIdx++) {
  queue.add(`job-${jobIdx}`, {
    idx: jobIdx,
  })
}
This is the output:
[10:44:07 GMT+0800 (Singapore Standard Time)] Starting job 0...
[10:44:07 GMT+0800 (Singapore Standard Time)] Finished job 0...
[10:44:10 GMT+0800 (Singapore Standard Time)] Starting job 1...
[10:44:10 GMT+0800 (Singapore Standard Time)] Finished job 1...
[10:44:13 GMT+0800 (Singapore Standard Time)] Starting job 2...
[10:44:17 GMT+0800 (Singapore Standard Time)] Finished job 2...
Error: Missing lock for job 3. moveToFinished
at ScriptsPro.finishedErrors (/Users/kuanweeloong/bullmq-repro/node_modules/bullmq/src/classes/scripts.ts:459:16)
at ScriptsPro.finishedErrors (/Users/kuanweeloong/bullmq-repro/node_modules/@taskforcesh/bullmq-pro/src/classes/scripts-pro.ts:701:22)
at JobPro.moveToFailed (/Users/kuanweeloong/bullmq-repro/node_modules/bullmq/src/classes/job.ts:730:26)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at async handleFailed (/Users/kuanweeloong/bullmq-repro/node_modules/bullmq/src/classes/worker.ts:754:11)
at async WorkerPro.retryIfFailed (/Users/kuanweeloong/bullmq-repro/node_modules/bullmq/src/classes/worker.ts:977:16)
[10:44:17 GMT+0800 (Singapore Standard Time)] Starting job 4...
[10:44:17 GMT+0800 (Singapore Standard Time)] Finished job 4...
[10:44:17 GMT+0800 (Singapore Standard Time)] Starting job 5...
[10:44:17 GMT+0800 (Singapore Standard Time)] Finished job 5...
[10:44:17 GMT+0800 (Singapore Standard Time)] Starting job 6...
[10:44:17 GMT+0800 (Singapore Standard Time)] Finished job 6...
[10:44:17 GMT+0800 (Singapore Standard Time)] Starting job 7...
[10:44:17 GMT+0800 (Singapore Standard Time)] Finished job 7...
[10:44:17 GMT+0800 (Singapore Standard Time)] Starting job 8...
[10:44:17 GMT+0800 (Singapore Standard Time)] Finished job 8...
[10:44:17 GMT+0800 (Singapore Standard Time)] Starting job 9...
[10:44:17 GMT+0800 (Singapore Standard Time)] Finished job 9...
[10:45:07 GMT+0800 (Singapore Standard Time)] Starting job 3...
[10:45:07 GMT+0800 (Singapore Standard Time)] Finished job 3...
As you can see, jobs 4 to 9 are all dequeued immediately, but I expected job 5 to be dequeued at 10:44:20, job 6 to be dequeued at 10:44:23, etc.
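One thing that may be worth ruling out is plain lock expiry. Below is a hypothetical variant of the repro's worker options that raises `lockDuration` (a documented worker option, default 30000 ms) well above the slowest job. It is likely not the root cause here, since the 4 s job is already well under the default 30 s lock, but it narrows the search:

```typescript
// Hypothetical variant of the repro's worker options, just to rule out
// lock expiry: lockDuration (default 30000 ms) is raised well above the
// slowest expected job. Probably not the root cause here, since the 4 s
// job is already well under the default lock, but it narrows the search.
const workerOptions = {
  connection,
  lockDuration: 60000,
  limiter: {
    max: 1, // same leaky-bucket limiter as the repro
    duration: 3000,
  },
}
```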