Open dbas-dn opened 2 months ago
any reaction?
waitUntilFinished is not a recommended API to use in production. It does not scale and is not a proper way to design an architecture based on queues: https://blog.taskforce.sh/do-not-wait-for-your-jobs-to-complete/
@manast I am facing the same issue. I am developing a realtime AI and image conversion tool. The user uses Server-Sent Events to push an image to my server, and I push it to a queue for processing with a third-party API (which has a QPS limit) that converts it; the result is returned to the user once the job has finished.
I think waitUntilFinished is very useful when using Server-Sent Events.
@daimalou it may be useful, but it is not the proper way to use queues. As I see it, you would be better off just spawning a NodeJS worker thread, running the job, and waiting for its completion than using a queue.
@daimalou btw, SSE is used for sending data from the server to the client, so I am not sure what you mean by using it to send images to the server :/. In any case, whether you use SSE or WebSockets does not matter: you can easily notify the client when a job has completed without relying on waitUntilFinished. You may need to redesign your solution a bit, but that's the proper way to do it and to have a scalable, issue-free system that runs stably for a long time.
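One way to read the suggestion above is sketched here: instead of awaiting each job, keep a registry of job IDs that clients are waiting on and let a single completion listener route results to them. `CompletionRouter` and its methods are hypothetical names made up for this illustration, not part of BullMQ. The only BullMQ behavior assumed is that a `QueueEvents` instance emits a `completed` event carrying `jobId` and `returnvalue`.

```typescript
// Hypothetical helper (not a BullMQ API): routes job-completion events to
// whoever is waiting on that job, for example an open SSE response.
type CompletedEvent = { jobId: string; returnvalue: unknown };
type Listener = (result: unknown) => void;

class CompletionRouter {
  private pending = new Map<string, Listener>();

  // Register interest in a job. The returned function cancels the wait,
  // e.g. when the client disconnects before the job finishes.
  register(jobId: string, listener: Listener): () => void {
    this.pending.set(jobId, listener);
    return () => {
      this.pending.delete(jobId);
    };
  }

  // Feed every "completed" event here. Returns true if someone was waiting.
  handleCompleted(event: CompletedEvent): boolean {
    const listener = this.pending.get(event.jobId);
    if (!listener) return false;
    this.pending.delete(event.jobId);
    listener(event.returnvalue);
    return true;
  }

  // Number of jobs that still have a waiting client.
  pendingCount(): number {
    return this.pending.size;
  }
}

// Wiring (assumption, not executed here): one shared QueueEvents instance
// per queue would forward its events to the router:
//   queueEvents.on('completed', (e) => router.handleCompleted(e));
```

With this shape, the request handler would add the job, call `register(job.id, ...)` with a callback that writes to the SSE stream, and return immediately; no per-request `waitUntilFinished` call is involved.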
@manast yes, my description was a bit unclear. I use https://github.com/Azure/fetch-event-source, which can POST some data to the server and have the server respond with data using SSE. Thank you. I think what you said is correct. I need to write more of the logic myself.
Version
v5.12.12
Platform
NodeJS
What happened?
Environment:
Kubernetes: k3s with 3 nodes
Redis: Sentinel configuration with 3 nodes
Description:
Our system is designed to create a large number of queues. We noticed that while it's possible to pass an already established Redis connection for creating Queue, Job, and Worker instances, the QueueEvents class duplicates the passed connection using duplicate().
During our load tests, we ran between 100 and 600 queues, and each queue created jobs that returned results. We observed that under low load, waitUntilFinished worked as expected, returning the job result. However, under high load, waitUntilFinished failed to return, hanging indefinitely until the TTL expired.
To debug this, I implemented a parallel polling mechanism that ran the scripts.isFinished script, which indicated that the job had indeed reached the completed status and I could retrieve its result, even though waitUntilFinished remained stuck.
Additionally, the excessive number of connections created by QueueEvents due to the duplicate() method led to us exceeding the maximum number of Redis connections, especially under high concurrency.
Workaround:
As a temporary workaround, I replaced waitUntilFinished with periodic execution of scripts.isFinished to check job completion.
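A minimal sketch of what such a polling workaround could look like, written generically so it does not depend on BullMQ internals. `pollUntilFinished`, `CheckResult`, and the `check` callback are hypothetical names for this illustration. The issue used `scripts.isFinished`; here the caller supplies whatever check it prefers.

```typescript
// Result of one completion check: either not done yet, or done with a value.
type CheckResult<T> = { done: false } | { done: true; value: T };

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `check` every `intervalMs` until it reports completion, and rejects
// once the next wait would run past `timeoutMs`.
async function pollUntilFinished<T>(
  check: () => Promise<CheckResult<T>>,
  intervalMs: number,
  timeoutMs: number,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const result = await check();
    if (result.done) return result.value;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`job did not finish within ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}

// Possible wiring (assumption, not executed here): with BullMQ, `check`
// could reload the job and inspect its state, for example
//   const job = await queue.getJob(jobId);
//   const state = await job?.getState();
//   return state === 'completed'
//     ? { done: true, value: job!.returnvalue }
//     : { done: false };
```

Polling trades latency and extra Redis round trips for not needing a dedicated blocking connection per waiter, so the interval is worth tuning against the number of concurrent jobs.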
Issue Summary:
waitUntilFinished does not return in high-load scenarios, even when the job is completed according to scripts.isFinished.
QueueEvents creates redundant connections by duplicating the Redis connection, which leads to exceeding the maximum number of connections under high concurrency.
How to reproduce.
Relevant log output