Open doramar97 opened 1 year ago
We found that the while loop at https://github.com/mike-marcacci/node-redlock/blob/main/src/index.ts#L436 turns into an infinite loop and crashes the server. We also found a memory leak. Interestingly, it happens when parallel requests are in flight. Somehow redlock can't cope with that and gets stuck in the infinite loop.
This issue appears to be happening to us as well in our AWS Lambda executions. We had an API request that took longer than expected during peak times and, as a result, outlasted the specified redlock lock duration. When we went to extend the lock after the API request, the Lambda mysteriously imploded with an "UnknownApplicationError" that was not caught by our error handling block. It looks like this issue is what was happening.
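One possible workaround for the case above is to check how much validity the lock has left before trying to extend it, and to treat an already expired lock as lost in your own error path. This is only a minimal sketch: the `LockLike` shape, its `expiration` field (assumed to be epoch milliseconds), and the helper names are hypothetical and are not taken from the library. It does not fix the loop itself; it only avoids asking for an extension once the lock has lapsed, assuming that is what triggers the crash.

```typescript
// Hypothetical shape: only the field this sketch needs.
// `expiration` is assumed to be the epoch time (ms) at which the lock lapses.
interface LockLike {
  expiration: number;
}

// Milliseconds of validity left on the lock, never negative.
function remainingMs(lock: LockLike, now: number = Date.now()): number {
  return Math.max(0, lock.expiration - now);
}

// True only if the lock is still held with more than `marginMs` to spare.
// When this returns false, treat the lock as lost instead of extending it.
function canStillExtend(
  lock: LockLike,
  marginMs: number,
  now: number = Date.now()
): boolean {
  return remainingMs(lock, now) > marginMs;
}
```

A caller would run `canStillExtend(lock, 500)` right after the slow API request and only attempt the extension when it returns `true`; otherwise it can fail the invocation explicitly, which keeps the failure inside normal error handling.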
We are using a single-node ElastiCache for Redis in our production environment (engine version 6.2.6, node type cache.t3.micro).
The environment runs on EKS. Consumers are pods on the cluster, and each pod handles one task at a time. (We also use Amazon MQ with RabbitMQ for handling tasks.)
We are using redlock to take a per-customer lock: as long as a task for a specific customer is executing, no other task for that customer can be executed, and any such task goes to another queue that handles delayed messages.
Our issue is with long-running tasks or multiple tasks addressing the same customer. We are getting the following errors, which cause the pod to restart.
We will be happy to provide more context or code. We are also setting
lockDuration: number = 2000
in a function that checks if a block is locked.
We would appreciate any help and guidance on this issue, or on best practices for our use case. Thanks!
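To make the setup described above concrete, here is a minimal sketch of the per-customer routing decision. It uses an in-memory map as a stand-in for Redis, so `CustomerLocks`, `routeTask`, and the `"run"`/`"delay"` values are all hypothetical names and none of this is redlock's API. It is only meant to show how a 2000 ms duration interacts with tasks that run longer than that.

```typescript
type Route = "run" | "delay";

// In-memory stand-in for the lock store (Redis in the real setup).
class CustomerLocks {
  private expiry = new Map<string, number>();

  // Takes the lock and returns true, or returns false if an unexpired
  // lock for this customer already exists.
  tryAcquire(customerId: string, ttlMs: number, now: number): boolean {
    const current = this.expiry.get(customerId);
    if (current !== undefined && current > now) {
      return false;
    }
    this.expiry.set(customerId, now + ttlMs);
    return true;
  }

  // Drops the lock when the task finishes.
  release(customerId: string): void {
    this.expiry.delete(customerId);
  }
}

// Run the task if the customer lock is free, otherwise send it to the
// delayed-messages queue.
function routeTask(
  locks: CustomerLocks,
  customerId: string,
  ttlMs: number,
  now: number
): Route {
  return locks.tryAcquire(customerId, ttlMs, now) ? "run" : "delay";
}
```

With `ttlMs = 2000`, a second task for the same customer arriving 1000 ms after the first is routed to `"delay"`, but one arriving 2500 ms later is routed to `"run"` even if the first task is still executing, because the lock has already expired. For long-running tasks this suggests either a duration longer than the slowest task or extending the lock while the task is in progress.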