tl;dr: libmemcached connections hang indefinitely when rcv_timeout >= 1 million usec (i.e., >= 1 sec).

Details:

When the :timeout or :recv_timeout options are set to >= 1 sec, strace reveals the following:

The man page for setsockopt says that EDOM is returned for a negative (invalid) timeval:

    SO_RCVTIMEO is an option to set a timeout value for input operations. It
    accepts a struct timeval parameter with the number of seconds and
    microseconds used to limit waits for input operations to complete. In
    the current implementation, this timer is restarted each time additional
    data are received by the protocol, and thus the limit is in effect an
    inactivity timer. If a receive operation has been blocked for this much
    time without receiving additional data, it returns with a short count or
    with the error EWOULDBLOCK if no data were received. The struct timeval
    parameter must represent a positive time interval; otherwise,
    setsockopt() returns with the error EDOM.
Looking at the libmemcached C code, in memcached_connect.c, we noticed that rcv_timeout was set as follows:

    if (ptr->root->rcv_timeout)
    {
      int error;
      struct timeval waittime;

      waittime.tv_sec= 0;
      waittime.tv_usec= ptr->root->rcv_timeout;

      error= setsockopt(ptr->fd, SOL_SOCKET, SO_RCVTIMEO,
                        &waittime, (socklen_t)sizeof(struct timeval));
      WATCHPOINT_ASSERT(error == 0);
    }

which means that the timeval 'waittime' has an invalid value when rcv_timeout >= 1 sec, because the whole timeout lands in tv_usec, which then holds 1,000,000 or more. This is a good example of why you should check the return status of a system call: if you don't, you silently ignore the error :)
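
To make the point about return statuses concrete, here is a minimal sketch that sets the timeout the same way as the snippet above (the whole value in tv_usec) but reports a failure instead of dropping it. The helper name set_rcv_timeout_unsplit is hypothetical and is not part of libmemcached.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper (not libmemcached API): sets SO_RCVTIMEO with the
   whole timeout stored in tv_usec, as the original code does, but checks
   the result. Returns 0 on success or the errno value set by setsockopt(). */
static int set_rcv_timeout_unsplit(int fd, long timeout_usec)
{
  struct timeval waittime;

  waittime.tv_sec = 0;
  waittime.tv_usec = timeout_usec; /* out of range once timeout_usec >= 1000000 */

  if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO,
                 &waittime, (socklen_t)sizeof(waittime)) != 0)
  {
    int saved_errno = errno; /* save before fprintf can overwrite it */
    fprintf(stderr, "setsockopt(SO_RCVTIMEO) failed: %s\n", strerror(saved_errno));
    return saved_errno;
  }

  return 0;
}
```

Going by the man page text quoted earlier, calling this on a connected socket with a value such as 1500000 should return EDOM rather than 0, which makes the failure visible right where the timeout is set.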
Note that this problem does not happen with connect_timeout, because that value is passed to poll(), which expects its timeout in msec:

    while (ptr->fd != -1 &&
           connect(ptr->fd, use->ai_addr, use->ai_addrlen) < 0)
    {
      ptr->cachederrno= errno;
      if (errno == EINPROGRESS || /* nonblocking mode - first return, */
          errno == EALREADY) /* nonblocking mode - subsequent returns */
      {
        struct pollfd fds[1];
        fds[0].fd = ptr->fd;
        fds[0].events = POLLOUT;
        int error= poll(fds, 1, ptr->root->connect_timeout);

        if (error != 1 || fds[0].revents & POLLERR)
        {

The fix for rcv_timeout is:
The fix for snd_timeout is similar:
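
The patches themselves are not reproduced in this text, so the following is only a minimal sketch of what such a fix could look like. It assumes the timeouts are given in microseconds, as described above. The names usec_to_timeval, set_socket_timeout and SKETCH_USEC_PER_SEC are hypothetical and are not taken from libmemcached.

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical constant for this sketch: microseconds in one second. */
#define SKETCH_USEC_PER_SEC 1000000L

/* Split a microsecond timeout into whole seconds plus the remaining
   microseconds, so that tv_usec always stays below one million. */
static struct timeval usec_to_timeval(long timeout_usec)
{
  struct timeval tv;

  tv.tv_sec = timeout_usec / SKETCH_USEC_PER_SEC;
  tv.tv_usec = timeout_usec % SKETCH_USEC_PER_SEC;
  return tv;
}

/* Apply the timeout to a socket. optname is expected to be SO_RCVTIMEO or
   SO_SNDTIMEO. Returns the setsockopt() result (0 on success, -1 with
   errno set on failure) so that the caller can check it. */
static int set_socket_timeout(int fd, int optname, long timeout_usec)
{
  struct timeval waittime = usec_to_timeval(timeout_usec);

  return setsockopt(fd, SOL_SOCKET, optname,
                    &waittime, (socklen_t)sizeof(waittime));
}
```

With this split, a timeout of 1500000 usec becomes tv_sec = 1 and tv_usec = 500000, which is a valid interval. The same helper serves both cases: set_socket_timeout(fd, SO_RCVTIMEO, rcv_timeout) and set_socket_timeout(fd, SO_SNDTIMEO, snd_timeout).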