We've been logging intermittent connection failures for some time, and eventually worked out that they were probably due to local (ephemeral) port exhaustion, which causes connect() to fail with EADDRINUSE. We can easily reproduce the failures in libmemcached under realistic connection rates. On Linux, the error occurs when more than about 28232 connections are made from a single client host to a single server within a 60-second period (the TIME_WAIT expiry).
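For reference, the 28232 figure can be reproduced with a little arithmetic. This is only a sketch: the function names are made up for illustration, and the range 32768 to 60999 is an assumption based on the usual Linux default in /proc/sys/net/ipv4/ip_local_port_range; your hosts may be configured differently.

```c
#include <stdio.h>

/* Number of ephemeral ports in the inclusive range [low, high].
 * The bounds are assumed to come from ip_local_port_range. */
long ephemeral_port_count(long low, long high)
{
    return (high >= low) ? (high - low + 1) : 0;
}

/* Rough upper bound on new connections per second to one server
 * address/port before every local port is stuck in TIME_WAIT. */
double max_sustained_connects_per_second(long ports, long time_wait_seconds)
{
    if (time_wait_seconds <= 0) {
        return 0.0;
    }
    return (double)ports / (double)time_wait_seconds;
}
```

With the assumed defaults, that works out to 28232 ports, or a little over 470 new connections per second sustained for a minute, which is not an unusual rate for a busy client that opens a fresh connection per request.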
In libmemcached's network_connect(), EADDRINUSE falls through to the "default" case, so the caller just gets MEMCACHED_CONNECTION_FAILURE with no other details. It would be nice if more information were available for logging purposes. MEMCACHED_CONNECTION_FAILURE could be split into more specific codes, or memcached_last_error_message() could be documented as a public API and populated with an errno-specific message when connect() fails.
It would be nice if every errno were handled this way, since EACCES, ENETUNREACH and ENOMEM are probably also possible.
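To illustrate the kind of message I have in mind, here is a minimal sketch. It is not libmemcached code: connect_errno_hint() and format_connect_error() are hypothetical helper names, and the hint wording is just a suggestion. It uses strerror(), which is not guaranteed to be thread-safe, so a real patch would probably want strerror_r() or similar.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: a short hint for errno values that connect()
 * can plausibly return. Unknown values get a generic hint. */
const char *connect_errno_hint(int err)
{
    switch (err) {
    case EADDRINUSE:
        return "local address in use, possibly ephemeral port exhaustion";
    case EAGAIN:
        return "no free local ports or routing cache full";
    case EACCES:
        return "permission denied, check firewall or socket permissions";
    case ENETUNREACH:
        return "network unreachable";
    case ENOMEM:
        return "out of memory";
    default:
        return "no additional hint";
    }
}

/* Hypothetical helper: build a log-friendly message for a failed
 * connect(). Returns the value returned by snprintf(). */
int format_connect_error(int err, char *buf, size_t len)
{
    return snprintf(buf, len, "connect() failed: %s (errno %d): %s",
                    strerror(err), err, connect_errno_hint(err));
}
```

Something like this could be called from the "default" case with the saved errno, and the resulting string stored wherever memcached_last_error_message() reads from.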
Also, according to Linux's man connect(2), EAGAIN indicates "no more free local ports or insufficient entries in the routing cache", so you probably shouldn't call poll() on the FD if that happens.
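A rough sketch of the decision I would expect after a non-blocking connect() returns -1 is below. Again, the enum and function names are invented for illustration and this is not how network_connect() is currently written; it only shows EAGAIN being treated as an immediate failure rather than a reason to poll().

```c
#include <errno.h>

/* Hypothetical classification of what to do after connect() returns -1. */
enum connect_action {
    CONNECT_WAIT_POLL,  /* attempt is still in progress: poll() for writability */
    CONNECT_FAIL        /* give up on this socket and report the saved errno */
};

enum connect_action classify_connect_errno(int err)
{
    switch (err) {
    case EINPROGRESS:   /* non-blocking connect has started */
    case EALREADY:      /* a previous attempt on this socket is still pending */
    case EINTR:         /* POSIX: the attempt continues asynchronously */
        return CONNECT_WAIT_POLL;
    case EAGAIN:        /* Linux: no free local ports or routing cache full */
    default:
        return CONNECT_FAIL;
    }
}
```

The point is that EAGAIN from connect() means the attempt never started, so there is nothing for poll() to wait on.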
Imported from Launchpad using lp2gh.