twitter / twemproxy

A fast, light-weight proxy for memcached and redis
Apache License 2.0
12.15k stars 2.06k forks source link

Don't hang for dns error responding to fragmented requests #597

Closed TysonAndre closed 3 years ago

TysonAndre commented 3 years ago

If hostnames are used instead of ip addresses for all hosts within a pool, and dns lookup fails, then get/multiget will hang indefinitely instead of responding with an error such as SERVER_ERROR Host is down. (I expect a client would detect this and close the connection, but this is not ideal, the client timeout could be a second)

I suspect that's because memcached get is implemented to coalesce responses from multiple backend servers, even when there's only one key, and this is likely a bug specific to handling coalescing when there's no attempt to send the request to a backend server

Because connection attempts are async but dns lookup is asynchronous, there's a non-null server connection for a host that's unavailable but a null server connection for a host that has a dns lookup error, and these end up using different code paths. (e.g. the former will call server_close() which does adjust nfrag_sent)

Fixes #596

Solution

Increment the fragmented messages processed (for the associated request) in the case that no server connection was established for a given key, as well. This will ensure that the request completes and the connection from the client is properly aborted instead of hanging when there is a dns error preventing the creation of a connection.