The root cause of this issue is that try_read_udp() never resets c->rbytes.
The conn_nread processing subtracts c->rlbytes from c->rbytes, so if extra
bytes exist, they are 'left over' in c->rbytes. However, each recvfrom() on the
UDP socket overwrites the whole read buffer, so 'c->rbytes += res' sets
c->rbytes incorrectly.
After a binary 'GET' command fails, the connection state is set to
'conn_new_cmd' in write_bin_error(), then to 'conn_parse_cmd' in
reset_cmd_handler(). While c->rbytes is less than sizeof(c->binary_header) (24
bytes), the server believes there are not enough bytes to hold a binary header,
so processing continues normally. However, while running the attached
test program, c->rbytes grows by 1 each time a request is completed.
Therefore, after 24 requests the server wrongly believes an extra binary
header has arrived, and it finally reads a bad magic byte in try_read_command().
The deadlock is caused by the error-handling code. If the UDP connection is
in the conn_closing state, it never gets closed; only conn_cleanup() is called.
Therefore, packets pending on the UDP socket are never consumed, so
epoll_wait() (inside libevent) keeps reporting input events over and over
again for a closing connection.
To set c->rbytes correctly:
--- ./memcached.ori.c 2010-08-23 22:59:12.000000000 +0800
+++ ./memcached.c 2010-08-23 22:59:17.000000000 +0800
@@ -3223,7 +3223,7 @@
         res -= 8;
         memmove(c->rbuf, c->rbuf + 8, res);
-        c->rbytes += res;
+        c->rbytes = res;
         c->rcurr = c->rbuf;
         return READ_DATA_RECEIVED;
     }
To consume incoming packets when cleanup:
--- ./memcached.ori.c 2010-08-23 22:59:12.000000000 +0800
+++ ./memcached.c 2010-08-23 23:27:38.000000000 +0800
@@ -471,6 +471,9 @@
         sasl_dispose(&c->sasl_conn);
         c->sasl_conn = NULL;
     }
+
+    /* consume incoming packets */
+    recvfrom(c->sfd, NULL, 0, 0, NULL, NULL);
 }
 /*
Original comment by pi3or...@gmail.com
on 23 Aug 2010 at 3:35
The issue was studied and solved with the help of Snitchaser:
http://gitorious.org/snitchaser
Original comment by pi3or...@gmail.com
on 23 Aug 2010 at 3:36
This looks similar to the problem in issue 158, so I'll take a look at this.
Original comment by eric.d.l...@gmail.com
on 11 Nov 2010 at 7:31
Maybe this specific issue is solved by the provided patch (I haven't tried),
but in general the problem remains.
Even after applying the suggested patch, the following packet sent over UDP
hangs memcached's UDP handling:
"\x00\x00\x00\x00\x00\x01\x00\x00\x80\x05\x00\x01\x08\x00\x00\x00\x00\x00\x00\x0a\xef\xbe\xad\xde\x00\x00\x00\x00\x00\x00\x00\x00"
(in fact, any subsequent packet causes 100% CPU usage)
It is a storage_command (in terms of testapp.c), but the cmd (\x05, increment)
is of type 'arithmetic_command' (in terms of testapp.c).
About this packet: all fields are 'correct' except for the type of cmd.
In the attachment is a script that reproduces the livelock.
Original comment by airat.ha...@gmail.com
on 24 Jan 2011 at 5:16
Attachments:
Hello again! I (the author of http://gitorious.org/snitchaser) have been away
for nearly 10 months, and have now come back to continue working on this bug.
Snitchaser has since been split into 2 projects: Snitchaser
(http://code.google.com/p/snitchaser/) for single-threaded programs and
ReBranch (http://code.google.com/p/rebranch/) for multi-threaded programs. I
used ReBranch to analyze this bug and solved it in 2 hours. In fact the root
cause is simple; I should have discovered it 5 months ago.
The root cause is that when memcached puts a UDP connection into the
conn_closing state, the connection never comes back. Unlike TCP
connections, a memcached server has only 1 UDP connection to serve all UDP
requests. Hence, for a connection in an error state, it only 'cleans up' the
connection rather than freeing it. However, memcached never reverts the state
of a cleaned-up connection.
Here we suggest a patch for memcached 1.3.5:
$ diff -u ./memcached.ori.c ./memcached.c
--- ./memcached.ori.c 2011-06-27 22:16:22.401000079 +0800
+++ ./memcached.c 2011-06-27 22:17:26.102000078 +0800
@@ -471,6 +471,11 @@
         sasl_dispose(&c->sasl_conn);
         c->sasl_conn = NULL;
     }
+
+    if (IS_UDP(c->transport)) {
+        recvfrom(c->sfd, NULL, 0, 0, NULL, NULL);
+        conn_set_state(c, conn_new_cmd);
+    }
 }
 /*
@@ -3223,7 +3228,7 @@
         res -= 8;
         memmove(c->rbuf, c->rbuf + 8, res);
-        c->rbytes += res;
+        c->rbytes = res;
         c->rcurr = c->rbuf;
         return READ_DATA_RECEIVED;
     }
For a UDP connection, when closing, the patch consumes the pending packet on
the socket and then resets the state to conn_new_cmd.
Original comment by pi3or...@gmail.com
on 27 Jun 2011 at 2:20
Can you issue a patch against 1.6.0-beta1? Or were you talking about 1.4.5, not
1.3.5?
Original comment by dorma...@rydia.net
on 27 Jun 2011 at 7:07
Ok. 1.6.0-beta1 still suffers from this problem; the principle is similar.
The following patch solves it (I think).
--- ./daemon/memcached.ori.c 2011-06-28 09:43:45.120000072 +0800
+++ ./daemon/memcached.c 2011-06-28 09:57:18.390999971 +0800
@@ -689,13 +689,19 @@
         c->sasl_conn = NULL;
     }
-    c->engine_storage = NULL;
-    c->tap_iterator = NULL;
-    c->thread = NULL;
-    assert(c->next == NULL);
-    c->ascii_cmd = NULL;
-    c->sfd = INVALID_SOCKET;
-    c->tap_nack_mode = false;
+    if (IS_UDP(c->transport)) {
+        recvfrom(c->sfd, NULL, 0, 0, NULL, NULL);
+        conn_set_state(c, conn_new_cmd);
+    } else {
+        c->sfd = INVALID_SOCKET;
+        c->engine_storage = NULL;
+        c->tap_iterator = NULL;
+        c->thread = NULL;
+        assert(c->next == NULL);
+        c->ascii_cmd = NULL;
+        c->tap_nack_mode = false;
+    }
+
 }
 void conn_close(conn *c) {
@@ -4826,7 +4832,7 @@
         res -= 8;
         memmove(c->rbuf, c->rbuf + 8, res);
-        c->rbytes += res;
+        c->rbytes = res;
         c->rcurr = c->rbuf;
         return READ_DATA_RECEIVED;
     }
Original comment by pi3or...@gmail.com
on 28 Jun 2011 at 2:00
I just pulled a patch similar to this into my 1.4.7 tree.
This was confusing me a bit... the extra recvfrom seemed pointless and it'd
hang while running. The test from #158 would spin out of control without the
recvfrom.
But the initial state of a UDP connection is actually conn_read, not
conn_new_cmd. Using the correct state makes it find the next UDP packet and
throw it away properly.
Still iffy on the rbytes bit, but so long as multiple command parsing works
(which I don't think it does now anyway), that should be correct.
Punting to trond in case he wants to port this properly to 1.6.0, or take a
look at least. I'll be pushing my for_147 branch to github at some point.
Original comment by dorma...@rydia.net
on 8 Aug 2011 at 5:44
The recvfrom() call uses 0 as SIZE and NULL as BUFFER. According to the manual,
that call will consume the incoming packet. Without it, epoll_wait() will
always return immediately because the UDP socket remains readable.
-- Function: int recvfrom (int SOCKET, void *BUFFER, size_t SIZE, int
FLAGS, struct sockaddr *ADDR, socklen_t *LENGTH-PTR)
The `recvfrom' function reads one packet from the socket SOCKET
into the buffer BUFFER. The SIZE argument specifies the maximum
number of bytes to be read.
If the packet is longer than SIZE bytes, then you get the first
SIZE bytes of the packet and the rest of the packet is lost.
There's no way to read the rest of the packet. Thus, when you use
a packet protocol, you must always know how long a packet to
expect.
Original comment by pi3or...@gmail.com
on 8 Aug 2011 at 6:06
The incoming packet is consumed in try_read_udp(), so if you get into a
conn_close state after reading one packet, calling an empty recvfrom will
consume the *next*, unseen packet.
Unless there is some case you're talking about which causes the next packet on
the wire to be an error?
I did push my tree though, so take a look at what I ended up with:
https://github.com/dormando/memcached/commit/954f6dddc41c80d2e625bce63744df5556a425bb
... I can't break it with your tests anymore, and it doesn't hang before
responding. If you look at how TCP connections are initialized, you'll see
they're set to conn_new_cmd, but the UDP listeners are initialized to
conn_read. The latter makes sense in the flow.
Either way, can you try the patch and see if you can still break it? :)
Original comment by dorma...@rydia.net
on 8 Aug 2011 at 6:44
I think you are right.
In my first patch (comment 1), the recvfrom() call is critical because in that
situation a UDP socket in the conn_closing state is never recovered, so all
incoming packets have to be dropped. If we instead recover its state to
conn_read, there is no need for the recvfrom().
Original comment by pi3or...@gmail.com
on 8 Aug 2011 at 7:22
I think this was merged up. Closing.
Original comment by dorma...@rydia.net
on 28 Sep 2011 at 4:45
Original issue reported on code.google.com by
marlon%h...@gtempaccount.com
on 12 Nov 2009 at 4:05
Attachments: