roc230 / spymemcached

Automatically exported from code.google.com/p/spymemcached

Stability problems in async I/O #197

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What version of the product are you using? On what operating system?

spymemcached-2.7, memcached-1.4.5

$ uname -a
Linux tgpdev07 2.6.32-5-amd64 #1 SMP Tue Jun 14 09:42:28 UTC 2011 x86_64 
GNU/Linux
$ java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
$

The system is a Debian 6.0 on a 4 core Intel Xeon box (5140  @ 2.33GHz).

Tell me more...

I'm trying to set 10 million small values using the attached program. The 
client and server run on the same box, and the server has 2 GB RAM (-m 2048). 
Other experiments show that the server is capable of holding this much data.

Log output shows this (see attached log):

2011-08-23 09:40:03.598 INFO net.spy.memcached.MemcachedConnection:  Added {QA 
sa=localhost/127.0.0.1:4711, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, 
toWrite=0, interested=0} to connect queue
2011-08-23 09:40:03.602 INFO net.spy.memcached.MemcachedConnection:  Connection 
state changed for sun.nio.ch.SelectionKeyImpl@2c64f6cd
Setting values ... 0
2011-08-23 09:40:05.093 INFO net.spy.memcached.MemcachedConnection:  
sun.nio.ch.SelectionKeyImpl@2c64f6cd has 1, interested in 5
2011-08-23 09:40:05.094 INFO net.spy.memcached.MemcachedConnection:  
sun.nio.ch.SelectionKeyImpl@2c64f6cd has a ready op, handling IO
[...]

According to memcached's stats command, only 2.5 million entries made it into 
the cache for this run; the exact numbers differ from run to run. On my local 
Ubuntu box I've seen this succeed with the exact same software versions on 
kernel 2.6.38.

A different run shows timeouts (see error-2.txt.gz).

Original issue reported on code.google.com by matthias.j.friedrich on 23 Aug 2011 at 7:56


GoogleCodeExporter commented 9 years ago
Sets are asynchronous, so what's happening is that you're overwhelming the JVM 
with future objects, and the operations time out before they can actually get 
to the server.

For instance if you changed line 28 from:
client.set("key_" + i, 600, value.getBytes());
to:
client.set("key_" + i, 600, value.getBytes()).get();

It'd be slower, but all of the items would get there since you'd be waiting for 
each one to complete before sending another.  I think this is what you thought 
your code was doing.
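
A middle ground between fire-and-forget and waiting on every single set is to 
wait only after every N requests. The sketch below is a minimal illustration 
of that idea, not spymemcached code: FakeAsyncClient is a hypothetical 
stand-in whose set() returns a Future immediately, and the class name, method 
names and window size are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class WindowedLoader {

    // Hypothetical stand-in for an asynchronous client (NOT the spymemcached API):
    // set() queues the write on an I/O thread and returns a Future right away.
    static final class FakeAsyncClient {
        private final ExecutorService io = Executors.newSingleThreadExecutor();
        private final Map<String, byte[]> store = new ConcurrentHashMap<>();

        Future<Boolean> set(String key, byte[] value) {
            return io.submit(() -> {
                store.put(key, value);
                return Boolean.TRUE;
            });
        }

        int size() {
            return store.size();
        }

        void shutdown() {
            io.shutdown();
        }
    }

    // Issues `total` sets but never keeps more than `window` futures outstanding.
    // Returns the number of entries that reached the (fake) store.
    static int load(int total, int window) {
        FakeAsyncClient client = new FakeAsyncClient();
        List<Future<Boolean>> pending = new ArrayList<>(window);
        try {
            for (int i = 0; i < total; i++) {
                pending.add(client.set("key_" + i, "value".getBytes()));
                if (pending.size() >= window) {
                    drain(pending); // wait for the current window to complete
                }
            }
            drain(pending); // wait for the final, possibly partial, window
            return client.size();
        } finally {
            client.shutdown();
        }
    }

    // Blocks until every pending future has completed, then empties the list.
    private static void drain(List<Future<Boolean>> pending) {
        try {
            for (Future<Boolean> f : pending) {
                f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("set failed", e);
        } finally {
            pending.clear();
        }
    }
}
```

With a real client, the window size would presumably need tuning against the 
operation timeout: the larger the window, the longer the last request in it 
waits before it is written.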

There are a few variations on this too, like filling a blocking queue and 
grabbing items from the other side to send to the server, rather than creating 
requests faster than the network IO can deal with them.
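
The blocking-queue variation could look roughly like the following sketch. It 
uses only java.util.concurrent; the ConcurrentHashMap stands in for the 
server, and the class name, the POISON end marker and the queue capacity are 
assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

final class QueueFedLoader {

    // Hypothetical end-of-input marker; it never collides with a "key_N" key.
    private static final String POISON = "\u0000stop";

    // Produces `total` keys through a queue holding at most `capacity` items.
    // Returns the number of entries that reached the (fake) store.
    static int load(int total, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        Map<String, byte[]> store = new ConcurrentHashMap<>(); // stand-in for the server

        // Consumer side: takes keys off the queue and "sends" them one by one.
        Thread sender = new Thread(() -> {
            try {
                for (String key = queue.take(); !key.equals(POISON); key = queue.take()) {
                    // Real code would issue the set against the client here.
                    store.put(key, "value".getBytes());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sender.start();

        try {
            // Producer side: put() blocks while the queue is full, so requests
            // are never created faster than the sender can work them off.
            for (int i = 0; i < total; i++) {
                queue.put("key_" + i);
            }
            queue.put(POISON);
            sender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading", e);
        }
        return store.size();
    }
}
```

The queue capacity bounds how many pending items exist at any time, which is 
what keeps the heap from filling up with requests that have not been sent yet.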

In fact, your test looks a bit like the built in LoaderTest:
https://github.com/dustin/java-memcached-client/blob/master/src/test/manual/net/spy/memcached/test/LoaderTest.java

Original comment by ingen...@gmail.com on 23 Aug 2011 at 5:32

GoogleCodeExporter commented 9 years ago
I don't really understand why this bug is marked as "Invalid".
Currently, objects such as net.spy.memcached.protocol.ascii.StoreOperationImpl 
keep growing in the heap without any limit. As a result, the heap is 
exhausted, and we have big Full GC stop-the-world pauses.

The right behavior would be the following: these objects should be put into a 
queue with a configurable size and a configurable RejectedExecutionHandler 
(CallerRunsPolicy or AbortPolicy).
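
For reference, this is how a bounded queue and a RejectedExecutionHandler 
behave in the JDK's own ThreadPoolExecutor. It is a sketch of the proposed 
semantics only, not of anything spymemcached does; the class name and the 
sizes are assumptions made for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedSubmitDemo {

    // Submits `tasks` trivial jobs to a single worker whose queue holds at most
    // `queueSize` entries. Returns how many jobs actually ran.
    static int run(int tasks, int queueSize) {
        AtomicInteger completed = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize),
                // When the queue is full, the submitting thread runs the job itself,
                // which slows the producer down instead of growing the heap.
                new ThreadPoolExecutor.CallerRunsPolicy());

        for (int i = 0; i < tasks; i++) {
            pool.execute(completed::incrementAndGet);
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return completed.get();
    }
}
```

Swapping in ThreadPoolExecutor.AbortPolicy would make execute() throw a 
RejectedExecutionException when the queue is full, leaving it to the caller to 
retry or drop the request.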

Original comment by anton.mi...@gmail.com on 19 Oct 2011 at 11:40

GoogleCodeExporter commented 9 years ago
I don't understand why the client should be responsible for either throttling 
requests or querying in a stop-and-wait fashion (which performs poorly, 
obviously). That's the library's job, IMHO.

Original comment by matthias.j.friedrich on 19 Oct 2011 at 12:30

GoogleCodeExporter commented 9 years ago
It is invalid because you're using the default queue with the default 
constructor. You can specify a queue for the client to use, which lets you 
control resources as you require.

Original comment by ingen...@gmail.com on 19 Oct 2011 at 2:02