Socket timeout violated on undiscoverable hosts

picanumber commented 2 years ago

Describe the bug

When the server host specified to the connection options is a non reachable machine, requests take ~30 seconds to throw an exception.

To Reproduce

Create a redis instance specifying an unreachable machine in the connection options. In my case the specified machine was switched off:

sw::redis::ConnectionOptions opt;

opt.host = "t10@mydomain.net";
opt.socket_timeout = 500ms;

Any request violates the socket timeout

redis.ping(); // This takes about ~30 seconds

Expected behavior

At most socket_timeout should go by before throwing a timeout exception. Ideally the lack of connection would immediately throw a no connection exception, like it happens for discoverable hosts.

Environment:

OS: ubuntu 20.04.2
Compiler: gcc 10
hiredis version: master
redis-plus-plus version: 1.3.3

Additional context

Let me know if there's a related ticket to amend this, in case I can contribute

sewenew commented 2 years ago

There're two timeout options, ConnectionOptions::connect_timeout and ConnectionOptions::socket_timeout. The former is used for connecting to Redis, and the latter is used for sending commands. Since your Redis server is down, redis-plus-plus cannot connect to it, you also need to set ConnectionOptions::connect_timeout.

However, your case is more complicated, because you use domain name (dns), instead of IP. Client needs to resolve the domain name to IP before connecting to server. The getaddrinfo or gethostbyname API, which is used for domain resolve, might take as long as 30 seconds if the dns resolve fails. Also these API are synchronous, and have no way to set a timeout. Check this for details and how to reduce the timeout.

Unfortunately, I don't think there's too much work we can do to completely solve the problem. Also even if there's a work around for it, we'd better fix the problem in hiredis' code base, since the connecting work is done by hiredis, and redis-plus-plus is just a wrapper for hiredis.

Regards

picanumber commented 2 years ago

Thank you for the reply, that's most helpful. Basically I was thinking it could be possible to pay connect_timeout overhead only for the first failed request and the following ones can immediately throw an exception, since the connection pool can be aware if it finished its job. But it's tricky to implement I'll close this ticket and ping you with an MR if I have a suggestion.

sewenew / redis-plus-plus

Socket timeout violated on undiscoverable hosts #322