This relates to #19 and the conditions haven't changed since. I guess you already have a pretty low connect timeout?
Yes, connect-timeout is currently set to 2 seconds.
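(For context, this is roughly how we set it up; a minimal sketch using the option setters from hircluster.h as I understand them, with hypothetical node addresses:)

```c
#include <stdio.h>
#include <sys/time.h>
#include "hircluster.h"

int main(void)
{
    redisClusterContext *cc = redisClusterContextInit();

    /* Hypothetical node addresses, just for illustration. */
    redisClusterSetOptionAddNodes(cc, "10.0.0.1:6379,10.0.0.2:6379");

    /* The 2-second connect timeout mentioned above. */
    struct timeval connect_timeout = {2, 0};
    redisClusterSetOptionConnectTimeout(cc, connect_timeout);

    if (redisClusterConnect2(cc) != REDIS_OK) {
        fprintf(stderr, "connect error: %s\n", cc->errstr);
        redisClusterFree(cc);
        return 1;
    }

    redisClusterFree(cc);
    return 0;
}
```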
So a blocking connect() call will effectively see a max wait time of the connect_timeout value, right? And if a poll() follows the connect(), I guess there too we will see a wait of up to connect_timeout. So, per Redis server IP, there could be a 2*connect_timeout wait time in an error case.
I'm not following you regarding the following `poll()`? The poll() in hiredis is the one that handles the timeout, I thought.
I guess the number of outstanding callbacks adds on time, depending on max-retry-count. If max-retry-count is higher, then fewer callbacks trigger a config-get, so that might be a remedy..
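(If I'm reading that right, the remedy would be something like the following; I'm assuming redisClusterSetOptionMaxRetry is the setter behind max-retry-count, so check hircluster.h for your version:)

```c
/* Assumption: max-retry-count maps to this hircluster.h setter. A higher
 * value should mean fewer failing callbacks escalate to a config-get. */
redisClusterSetOptionMaxRetry(cc, 10);
```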
> I'm not following you regarding the following `poll()`? The poll() in hiredis is the one that handles the timeout, I thought.
What I meant was this piece of code, which runs after a failed connect() call: https://github.com/redis/hiredis/blob/master/net.c#L266
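Roughly, the shape of that code is this (a simplified sketch of the non-blocking connect-then-poll pattern, not the actual hiredis source):

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/* Wait up to timeout_ms for a non-blocking connect() that returned
 * EINPROGRESS to complete; returns 0 on success, -1 on error/timeout. */
static int wait_connect_ready(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };

    int rc = poll(&pfd, 1, timeout_ms);   /* blocks for up to timeout_ms */
    if (rc == 0) {                        /* timed out */
        errno = ETIMEDOUT;
        return -1;
    }
    if (rc < 0)                           /* poll itself failed */
        return -1;

    /* Socket is writable: check whether the connect actually succeeded. */
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0)
        return -1;

    return 0;
}
```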
Hope I made this clear, @bjosv.
Update: I'm currently trying out a change that makes the reconnection attempts asynchronous. Hopefully this will fix the problem, but I need to try it out a bit more.
Thanks for the update, @bjosv. Making the reconnection attempts async would really help reduce this wait period. I'll wait for the changes and then retry the scenario.
A change covering this issue is now delivered; hope it fixes the problems in your setup as well.
Nice! Thanks @bjosv, I'll pull this in and see how it goes.
Hi,
Encountered a case where processing seems to get stuck inside the hiredis-cluster library and doesn't return to the application for more than 60 seconds. We have a mechanism in place that checks whether a thread is taking too long to report a heartbeat; that heartbeat is getting missed in this case, which results in an exception being thrown.
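(Our heartbeat check is roughly along these lines; a simplified illustration under assumed names, not our actual code:)

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static atomic_long last_heartbeat;        /* epoch seconds of last heartbeat */

/* Worker threads call this periodically; a hiredis-cluster call that
 * blocks for a long time delays the next report. */
static void report_heartbeat(void)
{
    atomic_store(&last_heartbeat, (long)time(NULL));
}

/* Monitor thread: if no heartbeat arrives within 60s, escalate. */
static void *watchdog_main(void *arg)
{
    (void)arg;
    for (;;) {
        sleep(5);
        if (time(NULL) - atomic_load(&last_heartbeat) > 60) {
            fprintf(stderr, "thread missed heartbeat for >60s\n");
            /* the real mechanism throws an exception here */
        }
    }
    return NULL;
}

int main(void)
{
    report_heartbeat();                   /* initialize before monitoring */
    pthread_t tid;
    pthread_create(&tid, NULL, watchdog_main, NULL);

    for (;;) {                            /* worker loop (simplified) */
        /* ... the hiredis-cluster command call would go here ... */
        report_heartbeat();
        sleep(1);
    }
}
```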
This is the backtrace from the flow:
From what I could gather from the backtrace and frames, this is the flow of events:
I assume the analysis above holds good here as well. If so, is there a way or any suggestion to improve this? Do you see another way of doing this so that processing doesn't get stuck inside hiredis-cluster?
@bjosv