roc230 / spymemcached

Automatically exported from code.google.com/p/spymemcached

Operation queue doesn't get consumed #90

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What version of the product are you using? On what operating system?
2.3.1

Tell me more...
After running a stress test for several hours, I'm seeing my queue fill up
completely, to the point where adding more operations throws "Queue full"
exceptions.

Several hours later the client still appears hung, with the very same read
operation sitting at the top of the input queue.

---------------------------------------------------------------------

@ 2009-09-29 06:08:38,285:

Server Overloaded: RawCacheConnection: Cache Error: {QA 
sa=/10.5.3.236:11211, #Rops=2, #Wops=1, #iq=16384, 
topRop=net.spy.memcached.protocol.binary.GetOperationImpl@35060dc2, 
topWop=net.spy.memcached.protocol.binary.GetOperationImpl@207156c0, 
toWrite=0, interested=4}

***************************************************

java.lang.IllegalStateException: Queue full
    at java.util.AbstractQueue.add(AbstractQueue.java:99)
    at java.util.concurrent.ArrayBlockingQueue.add(ArrayBlockingQueue.java:237)
    at net.spy.memcached.protocol.TCPMemcachedNodeImpl.addOp(TCPMemcachedNodeImpl.java:226)
    at net.spy.memcached.MemcachedConnection.addOperation(MemcachedConnection.java:542)
    at net.spy.memcached.MemcachedConnection.addOperation(MemcachedConnection.java:533)
    at net.spy.memcached.MemcachedClient.addOp(MemcachedClient.java:244)
    at net.spy.memcached.MemcachedClient.asyncStore(MemcachedClient.java:275)
    at net.spy.memcached.MemcachedClient.set(MemcachedClient.java:619)

---------------------------------------------------------------------

@ 2009-09-29 00:03:24,247:

Server Overloaded: RawCacheConnection: Cache Error: {QA 
sa=/10.5.3.236:11211, #Rops=2, #Wops=1, #iq=16384, 
topRop=net.spy.memcached.protocol.binary.GetOperationImpl@35060dc2, 
topWop=net.spy.memcached.protocol.binary.GetOperationImpl@207156c0, 
toWrite=0, interested=4}

***************************************************

java.lang.IllegalStateException: Queue full
    at java.util.AbstractQueue.add(AbstractQueue.java:99)
    at java.util.concurrent.ArrayBlockingQueue.add(ArrayBlockingQueue.java:237)
    at net.spy.memcached.protocol.TCPMemcachedNodeImpl.addOp(TCPMemcachedNodeImpl.java:226)
    at net.spy.memcached.MemcachedConnection.addOperation(MemcachedConnection.java:542)
    at net.spy.memcached.MemcachedConnection.addOperation(MemcachedConnection.java:533)
    at net.spy.memcached.MemcachedClient.addOp(MemcachedClient.java:244)
    at net.spy.memcached.MemcachedClient.asyncStore(MemcachedClient.java:275)
    at net.spy.memcached.MemcachedClient.set(MemcachedClient.java:619)
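
For readers unfamiliar with where "Queue full" comes from: the top frames of the trace are `ArrayBlockingQueue.add`, which throws `IllegalStateException` when a bounded queue is at capacity, whereas `offer` just returns `false`. The following is only a minimal standalone sketch of that JDK behavior. The class name `BoundedQueueSketch`, the helper `addThrowsWhenFull`, and the capacity of 2 are made up for illustration and are not part of spymemcached.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedQueueSketch {

    /** Hypothetical helper: returns true if add() was rejected with IllegalStateException. */
    static boolean addThrowsWhenFull(BlockingQueue<String> queue) {
        try {
            queue.add("overflow");
            return false; // there was still room, so nothing was thrown
        } catch (IllegalStateException e) {
            return true;  // same exception type as in the trace above
        }
    }

    public static void main(String[] args) {
        // Tiny capacity for illustration; the log above reports #iq=16384.
        BlockingQueue<String> inputQueue = new ArrayBlockingQueue<>(2);
        inputQueue.add("op1");
        inputQueue.add("op2");

        // offer() reports failure through its return value instead of throwing.
        System.out.println("offer accepted: " + inputQueue.offer("op3"));
        System.out.println("add threw: " + addThrowsWhenFull(inputQueue));
    }
}
```

If nothing ever drains the queue, every later `add` fails the same way, which matches the repeated exceptions hours apart.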

Original issue reported on code.google.com by lewiszim...@gmail.com on 29 Sep 2009 at 7:45

GoogleCodeExporter commented 9 years ago
Traced down the cause of this behavior, as well as of some other cases where
spymemcached clients basically take themselves out of service:

In net.spy.memcached.protocol.binary.OperationImpl:

    /**
     * Generate an opaque ID.
     */
    static int generateOpaque() {
        int rv=seqNumber.incrementAndGet();
        while(rv < 0) {
            if(seqNumber.compareAndSet(rv, 0)) {
                rv=seqNumber.incrementAndGet();
            }
        }
        return rv;
    }

If the CAS on seqNumber fails, rv is never updated, so the loop keeps spinning
on the stale negative value until seqNumber wraps around at least one more
time. The proposed fix increments seqNumber whether or not the CAS succeeds:
if another thread managed to swap in 0 before we did, we're still good to go.

    /**
     * Generate an opaque ID.
     */
    static int generateOpaque() {
        int rv=seqNumber.incrementAndGet();
        while(rv < 0) {
            seqNumber.compareAndSet(rv, 0);
            rv=seqNumber.incrementAndGet();
        }
        return rv;
    }
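
To try the proposed version outside the client, here is a rough standalone sketch. It assumes `seqNumber` is a static `AtomicInteger`, as the calls above suggest. The class name `OpaqueIdSketch`, the starting value, and the thread and iteration counts are invented for the test harness and do not come from spymemcached.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OpaqueIdSketch {

    // Stand-in for the shared counter in the binary OperationImpl (assumed type).
    static final AtomicInteger seqNumber = new AtomicInteger(0);

    /** Same logic as the proposed fix above. */
    static int generateOpaque() {
        int rv = seqNumber.incrementAndGet();
        while (rv < 0) {
            // Try to reset the counter. If this fails, another thread has
            // already moved it, so we simply take a fresh value either way.
            seqNumber.compareAndSet(rv, 0);
            rv = seqNumber.incrementAndGet();
        }
        return rv;
    }

    public static void main(String[] args) throws InterruptedException {
        // Start just below the wrap point so the threads race across it.
        seqNumber.set(Integer.MAX_VALUE - 1000);

        AtomicInteger negatives = new AtomicInteger();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 10_000; n++) {
                    if (generateOpaque() < 0) {
                        negatives.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("negative IDs returned: " + negatives.get());
    }
}
```

Because `rv` is refreshed on every pass, a thread that loses the CAS race no longer waits for the counter to come back around to its stale value. In the worst case it retries a few times while other threads are also crossing the wrap point.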

Original comment by lewiszim...@gmail.com on 13 Oct 2009 at 6:02

GoogleCodeExporter commented 9 years ago
Hey, thanks a lot for the fix.  That makes a lot of sense.

I committed 4e4b9ff3afba4282fa21d719fe9095be3fc97cc7 as you.

Original comment by dsalli...@gmail.com on 13 Oct 2009 at 8:32