Open lukepolo opened 3 months ago
But are these issues with IORedis or specific to BullMQ? I ask because we rely heavily on IORedis's ability to reconnect automatically. Recently I found an issue where the blocking BZPOPMIN command could hang forever after a disconnect, but other than that it should just work.
I’ve found that both IORedis and BullMQ have issues reconnecting. My test script shows IORedis has an issue where I have to force a reconnect, but BullMQ has issues if Redis is completely restarted.
I’ll post my testing process later today, as there are 3 different types of failures.
Another note: BullMQ does reconnect, but jobs get stuck in waiting forever.
There was a strange issue with IORedis (https://github.com/redis/ioredis/issues/1888), but I made a fix in BullMQ to work around it and since then I cannot reproduce it anymore.
Is your feature request related to a problem? Please describe. Sometimes Redis has different sorts of disconnects that may result in BullMQ not reconnecting. This matters to me mostly because my setup requires that a job always gets added and run.
For instance, on a DNS resolution failure I have to set up a check that resumes the queue after the disconnect.
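To make that kind of check concrete, here is a minimal sketch, not my production code. It assumes the connection emits `close` and `ready` events (ioredis does emit events with those names) and that the thing being resumed has a `resume()` method. The `ConnectionEvents` and `Resumable` interfaces and the `resumeOnReconnect` helper are hypothetical names declared locally so the snippet does not depend on any library.

```typescript
// Hypothetical minimal shapes, declared locally for illustration.
interface ConnectionEvents {
  on(event: "close" | "ready", listener: () => void): unknown;
}

interface Resumable {
  resume(): void | Promise<void>;
}

// Resume `target` whenever the connection becomes ready again after a drop.
// The very first "ready" (initial connect) is ignored.
function resumeOnReconnect(conn: ConnectionEvents, target: Resumable): void {
  let sawDisconnect = false;

  conn.on("close", () => {
    sawDisconnect = true;
  });

  conn.on("ready", () => {
    if (!sawDisconnect) return; // initial connect, nothing to resume
    sawDisconnect = false;
    // resume() may be sync or async; log failures instead of throwing
    // from inside an event listener.
    Promise.resolve(target.resume()).catch((err) => {
      console.error("resume after reconnect failed", err);
    });
  });
}
```

Whether resuming is enough to get jobs moving again likely depends on the failure type, so treat this as a starting point rather than a fix.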
Other times Redis has crashed, and if you're running it in-memory only, there will be no queue afterwards. To fix that we needed to set up a ping-pong system to force a reconnect to Redis. BullMQ seems to pick that up better and continues to work after the reconnect.
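A rough sketch of what such a ping-pong check could look like, under some assumptions: the client exposes a promise-returning `ping()` (ioredis has one), and the caller supplies a `forceReconnect` callback, since how to force a reconnect depends on the client. `Pingable`, `withTimeout` and `checkAndReconnect` are hypothetical names for illustration only.

```typescript
// Hypothetical minimal client shape, declared locally for illustration.
interface Pingable {
  ping(): Promise<string>;
}

// Reject if `p` does not settle within `ms` milliseconds.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms} ms`)),
      ms,
    );
    p.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Ping once. If the ping fails or hangs past `timeoutMs`, call the
// caller-supplied `forceReconnect` and report that a reconnect was forced.
// Returns true when the connection answered in time, false otherwise.
async function checkAndReconnect(
  client: Pingable,
  timeoutMs: number,
  forceReconnect: () => Promise<void>,
): Promise<boolean> {
  try {
    await withTimeout(client.ping(), timeoutMs);
    return true;
  } catch {
    await forceReconnect();
    return false;
  }
}
```

In practice this would run on an interval, for example `setInterval(() => void checkAndReconnect(client, 2000, reconnect), 5000)`, where the interval and timeout values are placeholders. The timeout matters because a ping on a dead connection can hang rather than fail.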
Describe the solution you'd like: It would be nice if BullMQ were able to detect these cases in the same way and handle the reconnects / resumes of the worker, producer, etc.
Additional context: Most of my code is closed source, but I'm willing to share some code if needed.
Feel free to close the issue, but I wanted to warn people that the BullMQ reconnect may or may not work depending on their configuration and the kind of error.
Also note, we run our Redis in a Kubernetes cluster, so our DNS resolution can fail (annoying), or can even change between different IPs.
Here is example code to handle the disconnects caused by Redis crashing: