Jdesk / memcached

Automatically exported from code.google.com/p/memcached

Single packet DoS on UDP channel #158

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Download attached script
2. Start a memcached server on a UDP port
3. Run "python largeudp.py a <server_ip> <port> 1200"

What is the expected output? What do you see instead?
The server is expected to continue processing UDP traffic after the script has 
completed.

Instead, it no longer responds to any traffic sent via UDP, legitimate or 
otherwise.

Note: while UDP packets are no longer processed, the TCP channel remains 
operational.

What version of the product are you using? On what operating system?
Tested on memcached-1.4.5, libevent-2.0.7-rc. Linux-2.6.24-28

Please provide any additional information below.

The attached script takes four arguments:
arg-1: either "a" or "b", to use the "a"scii or "b"inary protocol
arg-2: target ip
arg-3: target port
arg-4: length of packet to generate. 1200 works nicely for hanging 1.4.5 while 
not exceeding MTUs.
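
The script itself is not reproduced in this thread. As a rough sketch of the kind of datagram involved, the snippet below assumes the script prepends memcached's standard 8-byte UDP frame header to the generated data; that assumption would account for the 1208 bytes reported for a length of 1200 in the run logged later in this thread. The function names and the filler byte are hypothetical, and nothing is sent over the network.

```python
import struct

# memcached UDP frame header: request id, sequence number,
# total datagrams in the message, reserved (four 16-bit big-endian fields).
UDP_HEADER = struct.Struct("!HHHH")


def build_filler_payload(length: int) -> bytes:
    # Hypothetical filler: arbitrary bytes with no "\r\n" terminator.
    # The real script's payload is not shown in the thread.
    return b"A" * length


def build_udp_frame(payload: bytes, request_id: int = 0) -> bytes:
    # Single-datagram message: sequence number 0, one datagram in total.
    return UDP_HEADER.pack(request_id, 0, 1, 0) + payload


frame = build_udp_frame(build_filler_payload(1200))
print(len(frame))  # 1208: 8 header bytes + 1200 payload bytes
```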

Original issue reported on code.google.com by marcolsl...@gmail.com on 5 Oct 2010 at 10:02

GoogleCodeExporter commented 9 years ago
I omitted to mention that once the UDP channel stops responding, the memcached 
process consumes 100% CPU. In this way the bug indirectly affects users of the 
TCP channel as well as other processes on the same machine.

Original comment by marcolsl...@gmail.com on 6 Oct 2010 at 9:50

GoogleCodeExporter commented 9 years ago

Original comment by eric.d.l...@gmail.com on 6 Oct 2010 at 5:28

GoogleCodeExporter commented 9 years ago
Just FYI, I was not able to reproduce this on a Mac using 1.4.5 with libevent 
2.0.7-rc ... will take a look on Linux when I get a chance.

smacky:Downloads elambert$ python ./largeudp.py a localhost 11211 1200
[-] Using ascii protocol. Good choice!
[-] Creating socket
[-] Is memcached responding?  Yes
[-] Great, server responds
[-] Sending large packet with size 1208
[-] Timeout expired waiting for response to large packet
[-] Is memcached responding?  Yes
[E] DoS appears to have failed
smacky:Downloads elambert$ echo "stats" | nc localhost 11211 | egrep "version|libevent"
STAT version 1.4.5_1_gb4936c4
STAT libevent 2.0.7-rc
smacky:Downloads elambert$ uname -a
Darwin smacky.local 10.4.0 Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 2010; root:xnu-1504.7.4~1/RELEASE_I386 i386 i386
smacky:Downloads elambert$ 

Original comment by eric.d.l...@gmail.com on 7 Oct 2010 at 8:58

GoogleCodeExporter commented 9 years ago
Thanks for the feedback, and apologies for supplying a PoC that didn't demo the 
issue sufficiently; that was pretty fail for a PoC. However, I believe the bug 
is still present, as I'll show below. I'm less sure as to the cause: large 
packet sizes trigger the bug sooner, but a packet size of 10 also triggers the 
bug on my side, eventually.

I can trigger the bug on Snow Leopard (Darwin insurrection.local 10.4.0 Darwin 
Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 2010; 
root:xnu-1504.7.4~1/RELEASE_I386 i386) by running the PoC four times (works for 
both ascii and binary protocols) using a packet size of 1200. The same holds on 
Linux: repeating the PoC four times triggers the bug.

Below is an OSX backtrace after the bug had been triggered, and memcached was 
consuming full CPU:
(gdb) bt 
#0  0x00007fff85ab508a in kevent ()
#1  0x00000001000300e3 in kq_dispatch ()
#2  0x000000010002380e in event_base_loop ()
#3  0x0000000100002be7 in main (argc=3, argv=0x7fff5fbff9b0) at memcached.c:4681
(Note: the libevent for the OSX version was 1.4.2. Latest libevent was on Linux)

Not much to go on, I'm afraid.

It may also help to mention that if "-v -v" is enabled then once the bug is 
triggered, no further debug messages are printed to the console for UDP comms. 
TCP traffic still generates the regular debug notices.

Original comment by marcolsl...@gmail.com on 7 Oct 2010 at 11:08

GoogleCodeExporter commented 9 years ago
I think I have somewhat of an idea as to what is happening. The UDP connections 
behave a little differently from TCP. At startup, the server creates a UDP 
"connection handler" for each thread in the server (as controlled by the -t 
option). By default there are 4 threads, so there will be 4 UDP "connection 
handlers".

What appears to be happening is that the message you are sending is not a 
valid, well-formed message, which causes the server to close the "connection 
handler" in such a way that it cannot be used again. So in a default scenario 
with the server running 4 threads, the first time you run your test one 
connection handler is closed, leaving three; the second run closes a second 
handler, leaving two; and so on. If you start the server with only one thread 
(-t 1) you should hit the issue on the first run.
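
To make the arithmetic of that explanation concrete, here is a toy model in Python. It is not memcached code: the class and function names are hypothetical, and it simply assumes that each run of the test permanently disables exactly one UDP handler, as described above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyUdpServer:
    """Toy model only: one UDP handler per worker thread (the -t option)."""
    threads: int = 4
    live_handlers: int = field(init=False)

    def __post_init__(self) -> None:
        self.live_handlers = self.threads

    def receive_malformed(self) -> None:
        # Assumption from the comment above: one malformed message
        # leaves one handler unusable for the rest of the process lifetime.
        if self.live_handlers > 0:
            self.live_handlers -= 1

    def responds_on_udp(self) -> bool:
        return self.live_handlers > 0


def runs_until_unresponsive(threads: int) -> int:
    server = ToyUdpServer(threads)
    runs = 0
    while server.responds_on_udp():
        server.receive_malformed()
        runs += 1
    return runs


print(runs_until_unresponsive(4))  # 4 (default thread count)
print(runs_until_unresponsive(1))  # 1 (server started with -t 1)
```

Under this model the number of runs needed equals the thread count, which matches four runs for a default server and a single run with -t 1.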

I don't have a root cause yet, but the issue does appear to be related to how 
the server cleans up the connection handler when it encounters a problem. Will 
let you know when I've figured this out.

Original comment by eric.d.l...@gmail.com on 8 Oct 2010 at 4:48

GoogleCodeExporter commented 9 years ago

Original comment by eric.d.l...@gmail.com on 8 Oct 2010 at 4:49

GoogleCodeExporter commented 9 years ago
Well spotted, I was indeed running the initial tests with -t 1. At least that 
clears up some minor confusion this side.

Original comment by marcolsl...@gmail.com on 8 Oct 2010 at 9:26

GoogleCodeExporter commented 9 years ago
Possibly the same cause as for:
http://code.google.com/p/memcached/issues/detail?id=106

Original comment by airat.ha...@gmail.com on 13 Dec 2010 at 10:53

GoogleCodeExporter commented 9 years ago
If you end the message with "\r\n", you won't be able to reproduce the bug.
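
A minimal sketch for checking this observation: terminate the test payload before sending it. `terminate_command` is a hypothetical helper, not part of the original script or of memcached.

```python
CRLF = b"\r\n"


def terminate_command(payload: bytes) -> bytes:
    # Append the "\r\n" terminator unless the payload already ends with it.
    return payload if payload.endswith(CRLF) else payload + CRLF


print(terminate_command(b"A" * 10))       # b'AAAAAAAAAA\r\n'
print(terminate_command(b"version\r\n"))  # b'version\r\n'
```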

Original comment by airat.ha...@gmail.com on 14 Dec 2010 at 3:19

GoogleCodeExporter commented 9 years ago
The patch for 1.4.7 related to #106 fixes this guy too.

Going to close out this bug and leave any further discussion on the earlier 
issue.

Original comment by dorma...@rydia.net on 8 Aug 2011 at 5:46