Closed FFKL closed 1 year ago
Thanks for the repro repo, it was really helpful to see what's happening under the hood.
Unfortunately, I don't think that it can be fixed on moleculer side. I'll try to explain what happens
main
node connects to Redis and calls the worker
worker
sends the "giant message" via send method https://github.com/moleculerjs/moleculer/blob/b48ec4f3d39951ad433a64c4c36ecdc2af824c24/src/transporters/redis.js#L133-L140worker
node. So on worker
side everything is good.output buffer
limit then Redis will close connection of all subscribers/consumersmain
node reconnects to Redis but it doesn't know anything about the "giant message". Since the message is larger than the output buffer
it won't be sent anywhere. Because of this the main
node gets stuck forever.The main issue is that Redis does not inform the client about the reason for closing the connection. The reason is only logged on the server-side. So even if we "break" the await mainBroker.call("worker.getBigMessage")
we don't know the exact cause. What would be the error that moleculer should throw?
I've searched for the scheduled to be closed ASAP for overcoming of output buffer limits
. Here's what I found https://gist.github.com/amgorb/c38a77599c1f30e853cc9babf596d634 The solution is basically to increase the output buffer size.
@AndreMaz Thank you for the explanation. It took me a few hours to figure out that this is a Redis failure so I decided to open this issue;) I thought it might be possible to fix this on the moleculer
side due to Promise
memory leak. Can we keep track of such pending Promises and show a warning message or configure Transporter
to reject these Promises?
My solution at the moment:
requestTimeout
And the main problem is that in my case it is a heisenbug so Redis has also time limit - a soft limit of 32 megabytes per 10 seconds means that if the client has an output buffer bigger than 32 megabytes for, continuously, 10 seconds, the connection gets closed.
. The "giant message" sometimes can be successfully delivered.
It took me a few hours to figure out that this is a Redis failure so I decided to open this issue;)
Yeah, it's definitely a tricky situation. Without your repro repo it would have been really difficult to find the issue.
Can we keep track of such pending Promises and show a warning message or configure Transporter to reject these Promises?
We could track a pending Promise by adding a timeout but that is already done with requestTimeout
. Do you have any suggestions?
Can you use streams in this case?
@AndreMaz Unfortunately, transporter sends all chunks to Redis one by one. In this case, nothing changes. I've added the example to the repo.
I think, requestTimeout
is not the same as the potential cutAllPendingRequests
option (it does not reject connection instantly on close
event), but this is the only working solution right now.
I'm closing this issue because we don't know, how we can solve it inside Moleculer. If you have an idea or solution please reopen or open a PR.
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Current Behavior
Broker.call
waits forever if Redis transporter reconnects while transmitting a message. Of course, we haverequestTimeout
, but I don't think this is the correct behavior.Expected Behavior
Returned
Promise
must be rejected on connection close.Failure Information
Example here: https://github.com/FFKL/moleculer-redis-freeze
Steps to Reproduce
Please provide detailed steps for reproducing the issue.
subscribe scheduled to be closed ASAP for overcoming of output buffer limits.
messagePromise
will always wait.Reproduce code snippet
Context
Please provide any relevant information about your setup. This is important in case the issue is not reproducible except for under certain conditions.
Failure Logs