lishunli / spymemcached

Automatically exported from code.google.com/p/spymemcached

hit an IO Thread is not running assertion #36

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What version of the product are you using? On what operating system?
I'm running the 2.1 client on RHEL4. 

Tell me more...

We recently saw this in some of our standard test runs. It isn't clear
how the thread could ever have left the running state.

I suppose we will now need to catch this sort of thing and throw away the
client and replace it. 
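
A minimal sketch of that "catch it and replace the client" workaround, assuming nothing about spymemcached beyond what the stack trace shows. `ReplaceableClient`, its `factory`, and its `disposer` are hypothetical names introduced for illustration; the wrapper is generic so it does not depend on any particular client API. Catching an `AssertionError` is normally discouraged, so this is a stopgap rather than a fix.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical wrapper: replaces the underlying client after an AssertionError. */
final class ReplaceableClient<C> {
    private final Supplier<C> factory;   // assumed to build a fresh, connected client
    private final Consumer<C> disposer;  // assumed to shut the broken client down
    private C current;

    ReplaceableClient(Supplier<C> factory, Consumer<C> disposer) {
        this.factory = factory;
        this.disposer = disposer;
        this.current = factory.get();
    }

    /** Runs op on the current client; on AssertionError, swaps the client and retries once. */
    <R> R call(Function<C, R> op) {
        C client = get();
        try {
            return op.apply(client);
        } catch (AssertionError e) {
            // A second failure on the fresh client propagates to the caller.
            return op.apply(replace(client));
        }
    }

    private synchronized C get() {
        return current;
    }

    /** Replaces the client only if no other thread has already replaced it. */
    private synchronized C replace(C broken) {
        if (current == broken) {
            disposer.accept(broken);
            current = factory.get();
        }
        return current;
    }
}
```

In our case the factory would build a new MemcachedClient and the disposer would shut the old one down; every cache call would then go through `call(...)` instead of touching the client directly.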

Any ideas what's going on?

 Cause0: java.lang.AssertionError: IO Thread is not running.
 Cause0-StackTrace: 
        at net.spy.memcached.MemcachedClient.checkState(MemcachedClient.java:236)
        at net.spy.memcached.MemcachedClient.addOp(MemcachedClient.java:249)
        at net.spy.memcached.MemcachedClient.asyncGet(MemcachedClient.java:658)
        at net.spy.memcached.MemcachedClient.asyncGet(MemcachedClient.java:670)
        at common.cache.ntc.MemcachedClientAdapter.get(MemcachedClientAdapter.java:72)
        at common.cache.ntc.MemcachedCacheClient.get(MemcachedCacheClient.java:138)
...

Original issue reported on code.google.com by kevinoliver on 15 Sep 2008 at 4:46

GoogleCodeExporter commented 9 years ago
I added a Thread.UncaughtExceptionHandler to log which error was leaking out.

Note, this is happening somewhat regularly for us.

java.lang.AssertionError: No read operation
    at net.spy.memcached.MemcachedConnection.handleReads(MemcachedConnection.java:299)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:262)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:182)
    at net.spy.memcached.MemcachedClient.run(MemcachedClient.java:1259)
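
For reference, a handler like the one mentioned above can be sketched roughly as follows. `IoThreadWatch` and `lastFailure` are hypothetical names; the only API used is the standard `Thread.setUncaughtExceptionHandler`. The sketch assumes you can get hold of the IO thread object (the `MemcachedClient.run` frame suggests the client runs its own loop on a thread).

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper that records and logs whatever kills a watched thread. */
final class IoThreadWatch {
    /** Last Throwable that escaped a watched thread's run() method, if any. */
    static final AtomicReference<Throwable> lastFailure = new AtomicReference<>();

    /** Installs the logging handler on the given thread and returns the same thread. */
    static Thread watch(Thread t) {
        t.setUncaughtExceptionHandler((thread, error) -> {
            lastFailure.set(error);
            System.err.println("Thread " + thread.getName() + " died: " + error);
            error.printStackTrace();
        });
        return t;
    }
}
```

The handler runs on the dying thread itself, so it only reports the failure. It does not restart the IO loop.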

Original comment by kevinoliver on 25 Sep 2008 at 7:55

GoogleCodeExporter commented 9 years ago
Thanks for the update. That's quite a strange error. Apparently I read data from
memcached, but nobody wanted it. Might be a good time to shut down the connection
and reinitialize it.

Do you have a test case that can reproduce this?  I'd kind of like to see what's
happening on the wire.

Original comment by dsalli...@gmail.com on 25 Sep 2008 at 8:43

GoogleCodeExporter commented 9 years ago
Well... It seems to happen somewhat regularly when I try to set a value that is
larger than 1MB. We try to catch the "OperationException: SERVER: SERVER_ERROR
object too large for cache" error and put the value into the cache as smaller
chunks.
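
The chunking fallback can be sketched roughly like this. `Chunker`, `split`, and `join` are hypothetical names, not our production code, and the sketch covers only splitting and reassembling the bytes. How the chunks are keyed and stored in the cache is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits a large value into pieces that fit under the item size limit. */
final class Chunker {
    /** Splits data into consecutive chunks of at most chunkSize bytes each. */
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        int off = 0;
        while (off < data.length) {
            int len = Math.min(chunkSize, data.length - off);
            chunks.add(Arrays.copyOfRange(data, off, off + len));
            off += len;
        }
        return chunks;
    }

    /** Reassembles chunks, in order, into the original byte array. */
    static byte[] join(List<byte[]> chunks) {
        int total = 0;
        for (byte[] c : chunks) {
            total += c.length;
        }
        byte[] out = new byte[total];
        int off = 0;
        for (byte[] c : chunks) {
            System.arraycopy(c, 0, out, off, c.length);
            off += c.length;
        }
        return out;
    }
}
```

The chunk size needs to stay below the 1MB limit with some headroom, since the limit applies to the stored item and not only to the payload bytes.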

I don't know if this helps: this client is talking to a memcached server running
locally on the same host. We are using the DefaultConnectionFactory and the
ASCII protocol.

As for a test case, it looks a bit like this (I've edited out a bunch of our code
that sits above your client, but it basically boils down to this). I haven't yet
run into this error locally on my machine, but our test harness hits it fairly
regularly.

        // insert a value greater than 1 MB in size...
        // let's put "random" data in that will not compress down...
        byte[] data = new byte[(1024*1024 * 15) + 111];
        Random random = new Random(1001);
        random.nextBytes(data);

        // verify that the compressed size is greater than 1MB.
        int compressedSize = getCompressedSize(data);
        assertTrue("compressed size should be larger than 1MB: " + compressedSize,
                compressedSize > 1024*1024);

        // this line is what seems to trigger the error
        memcachedClient.set("testChunkedPutAndGet", 0, data);
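
The `getCompressedSize` helper is not shown in the snippet above. One way such a helper might look, assuming GZIP compression (an assumption made for illustration; the class name `CompressedSize` is hypothetical and the real helper may differ):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

/** Hypothetical stand-in for the getCompressedSize helper used in the test. */
final class CompressedSize {
    /** Returns the number of bytes the data occupies after GZIP compression. */
    static int getCompressedSize(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // Closing the GZIP stream flushes the trailer, so size() is final afterwards.
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.size();
    }
}
```

Random bytes barely compress, which is why the test seeds a `Random` to build a payload that stays above 1MB even after compression.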

Please let me know if you need any additional info. 

My 2 cents is that the MemcachedClient.run() loop should never allow any
Throwable to make it exit its run loop. E.g., run() would just look like this:

@Override public void run() {
  while(running) {
    try {
      conn.handleIO();
    } catch (Throwable t) {
      logRunException(t);
    }
  }
  getLogger().info("Shut down memcached client");
}

Original comment by kevinoliver on 25 Sep 2008 at 10:21

GoogleCodeExporter commented 9 years ago
I think this is another report of this problem:

Original comment by dsalli...@gmail.com on 1 Oct 2008 at 10:51

Attachments:

GoogleCodeExporter commented 9 years ago
Yep, looks like the same issue to me. We have net.spy.memcached logging set to
warning only, so I don't see a bunch of those log messages.

You said you think this is another report of this problem... Does that mean there
is already another issue open with similar symptoms? Or have you been able to
reproduce it?

Original comment by kevinoliver on 1 Oct 2008 at 11:07

GoogleCodeExporter commented 9 years ago

Original comment by dsalli...@gmail.com on 2 Oct 2008 at 7:03

GoogleCodeExporter commented 9 years ago

Original comment by dsalli...@gmail.com on 2 Oct 2008 at 7:07

GoogleCodeExporter commented 9 years ago
I've fixed this so that instead of being an assertion it'll more directly cause a
reconnect.

Original comment by dsalli...@gmail.com on 3 Oct 2008 at 4:27