uohzoaix / spymemcached

Automatically exported from code.google.com/p/spymemcached
0 stars 0 forks source link

operation timeout exception #68

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
CentOS release 5.2
memcache client version 2.2 
App server -Apache Tomcat Version 5.5.27
========= ISSUE =================
We are making connections to memcache from the code that runs in the app
server. Ever so often we see these such exceptions 

FIRST ***************************
net.spy.memcached.OperationTimeoutException: Timeout waiting for bulkvalues
        at net.spy.memcached.MemcachedClient.getBulk(MemcachedClient.java:909)
        at net.spy.memcached.MemcachedClient.getBulk(MemcachedClient.java:924)
        at
com.equals.persistence.cache.memcached.MemcachedCacheService.getBatch(MemcachedC
acheService.java:448)
        at
com.equals.persistence.cache.memcached.MemcachedCacheService.fillCollections(Mem
cachedCacheService.java:374)
        at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.jav
a:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils
.java:310)
        at
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.j
ava:198)
        at $Proxy0.fillCollections(Unknown Source)

SECOND ***************************

net.spy.memcached.OperationTimeoutException: Timeout waiting for value
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:765)
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:778)
        at
com.equals.persistence.cache.memcached.MemcachedCacheService.get(MemcachedCacheS
ervice.java:252)
        at
com.equals.persistence.cache.memcached.MemcachedCacheService.read(MemcachedCache
Service.java:306)
        at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.jav
a:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils
.java:310)
        at
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.j
ava:198)
        at $Proxy0.read(Unknown Source)
=================================================
I wish I could provide you a test case to reproduce this. Its not
deterministically repeatable but will definately appear once a while. Once
it happens all cache operations start to fail. Just restarting the app
server (which also leads to killing the client instance) solves the problem
so it seems related with the client.

When this issue occurs I have tried to create a jsp and create a new cache
client to connect to the cache. that works.

Please let me know if you know how to rectify this or if I can provide more
information to help diagnose and solve this. 

Original issue reported on code.google.com by amant...@gmail.com on 28 May 2009 at 4:06

GoogleCodeExporter commented 9 years ago
I have similar exception too
OperationTimeoutException: Timeout waiting for value

I could not reproduce it but it shows up in product log periodically.

Original comment by hank.h...@gmail.com on 31 May 2009 at 7:20

GoogleCodeExporter commented 9 years ago
The default operation timeout is 1 second.  If anything ever happens that takes
longer than a second, you'd see this.

You can adjust the timeout up or down in the connection factory.

I'm marking this as invalid as it's not so much a bug, but a way of things.  If
you're getting it a lot with the default settings, you may want to increase it.

Original comment by dsalli...@gmail.com on 31 May 2009 at 7:56

GoogleCodeExporter commented 9 years ago
I'm getting the same issue, Timeouts on gets and on sets with small values that 
should be instantaneous.  It 
almost seems like if my client connection sits idle for 2 or more hours then 
the next set of operations (within a 
few minutes) will timeout.  Then the client starts working again until the next 
2 hours of inactivity.

I can't find in the documentation or in the settings of the ConnectionFactory 
(or DefaultConnectionFactory) 
where I can increase the timeout above 1 second.

Original comment by john.ice...@gmail.com on 5 Aug 2009 at 4:14

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
It's a pity that this is a marked invalid. This is most  certianly a bug the 
library
and NOT just a config setting that causing this. See
http://raibledesigns.com/rd/entry/performance_testing_memcached#comments (the 
last
comment)

I hope this gets some serious attention sometime soon.

Original comment by amant...@gmail.com on 5 Aug 2009 at 4:43

GoogleCodeExporter commented 9 years ago
OK.  It seems that this is happening in a few environments.  I'll see if I can 
dig up
a theory as to the cause.

Original comment by dsalli...@gmail.com on 5 Aug 2009 at 5:01

GoogleCodeExporter commented 9 years ago
I'm seeing the same thing when attempting to do any operation.  I'm having a 
hard time thinking that the operation actually timed out, seeing as both this 
app and memcached are 
running on the same development machine and neither is under any load 
whatsoever.  I've used spymemcached in the web portion of our app without 
problem, but using it now in 
a standalone app I keep hitting this error.

For example, a gets("key") results in:

net.spy.memcached.OperationTimeoutException: Timeout waiting for value
        at net.spy.memcached.MemcachedClient.gets(MemcachedClient.java:802)
        ...
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for 
operation
        at net.spy.memcached.MemcachedClient$OperationFuture.get(MemcachedClient.java:1656)
        at net.spy.memcached.MemcachedClient.gets(MemcachedClient.java:795)
        ... 11 more

get("key") results in:

net.spy.memcached.OperationTimeoutException: Timeout waiting for value
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:840)
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:855)
        ...
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for 
operation
        at net.spy.memcached.MemcachedClient$OperationFuture.get(MemcachedClient.java:1656)
        at net.spy.memcached.MemcachedClient$GetFuture.get(MemcachedClient.java:1708)
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:833)
        ... 12 more

add("key", 60 * 60 * 24, "myobject") results in:

java.lang.RuntimeException: Timed out waiting for operation
        at net.spy.memcached.MemcachedClient$OperationFuture.get(MemcachedClient.java:1648)
        ...
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for 
operation
        at net.spy.memcached.MemcachedClient$OperationFuture.get(MemcachedClient.java:1656)
        at net.spy.memcached.MemcachedClient$OperationFuture.get(MemcachedClient.java:1646)
        ... 11 more

Original comment by RyanWils...@gmail.com on 7 Aug 2009 at 9:10

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Seems to be I was using the binary protocol and had forgotten to upgrade my 
local memcached, oops.

Original comment by RyanWils...@gmail.com on 7 Aug 2009 at 11:37

GoogleCodeExporter commented 9 years ago
I set timeout 1 default it wrorks fine but as I increases it to 5 it give me 
error 
fail to connect upto 5 ... 

Original comment by ashis...@gmail.com on 12 Aug 2009 at 8:06

GoogleCodeExporter commented 9 years ago
rayan
how you upgrade your local memcached.It seems i have same problem binary 
protocol 
not working as default one make a client.

Original comment by ashis...@gmail.com on 17 Aug 2009 at 12:18

GoogleCodeExporter commented 9 years ago
I can get this to reproduce more-or-less consistently.  It seems to happen only 
when
the data being saved is large enough.  I have a Java app connecting to a 
memcache
instance on a different machine.  With the default timeout of 1s, it starts 
happening
when the data size is as small as 20k.  As it approaches 1MB, it becomes 
consistent.
 Setting the timeout high enough, it seems to take around 5s when the data size is
around 600kb.  On a few runs with a 100kb data size, most of the time it would 
take
2.2s, but occasionally it would take half that.  Setting the value takes around 
260ms
consistently.

I tested this with 2.1, 2.3.1, and 2.4.  2.1 does *NOT* have this problem --
retrieving a 1MB payload took about 250ms consistently.  This only happened for 
me in
2.3.1 and 2.4.

Here's the code I tested with:

    MemcachedClient mc = new MemcachedClient(AddrUtil.getAddresses(MC_HOST));
        try {
                StringBuilder sb = new StringBuilder();
                for(int i = 0; i < 1400000; i++) {
                        char c = (char)('A' + (char)(Math.random() * 26.0));
                        sb.append(c);
                }
                long start = System.currentTimeMillis();
                mc.set("TEST_JAVA_VALUE", 0, sb.toString());
                System.out.println("Set took " + (System.currentTimeMillis() - start)
+ "ms.");

                start = System.currentTimeMillis();
                String value = (String)mc.get("TEST_JAVA_VALUE");
                System.out.println("Retrieve took " + (System.currentTimeMillis() -
start) + "ms.");
        } finally {
                mc.shutdown();
        }

Original comment by samps...@gmail.com on 21 Oct 2009 at 2:07

GoogleCodeExporter commented 9 years ago
Thanks a lot for this test.  That gives me a lot to work with.

Original comment by dsalli...@gmail.com on 21 Oct 2009 at 6:22

GoogleCodeExporter commented 9 years ago
can you give me a couple more details of your setup?  I don't see a difference 
in my
runs on my mac (localhost or across wifi) between 2.1 and 2.4.  :/

Original comment by dsalli...@gmail.com on 21 Oct 2009 at 6:35

GoogleCodeExporter commented 9 years ago
One note about the code above -- it creates a random string that's 1.4MB in 
size.  I
was testing a compressed transcoder, so the result was compressed data < 1MB. 
However, this problem does happen with large data sizes whether or not it uses 
the
transcoder.

The memcached server is running v1.2.2 on Linux, using Java 6 in both cases.  I
tested the client on a Windows XP and a Linux box with the same results.  In all
cases the client and server were on separated systems in a fast network.

Let me know if you need any additional info.

Original comment by samps...@gmail.com on 21 Oct 2009 at 6:51

GoogleCodeExporter commented 9 years ago
We have encountered this same issue in production. 
In our case it appears to be caused by doing a large number of operations 
(>8000) in
a tight loop. 

I will update later with specific version numbers (and hopefully a test case)

Original comment by buzman...@gmail.com on 21 Oct 2009 at 7:15

GoogleCodeExporter commented 9 years ago
We have the same problem in test environment . it say:
net.spy.memcached.OperationTimeoutException: Timeout waiting for value
        at net.spy.memcached.MemcachedClient$OperationFuture.get
(MemcachedClient.java:1656)

we use version 2.3.1.

Original comment by yan...@gmail.com on 21 Oct 2009 at 10:06

GoogleCodeExporter commented 9 years ago
CLient run in Centos X64 v5.3

Original comment by yan...@gmail.com on 21 Oct 2009 at 10:07

GoogleCodeExporter commented 9 years ago
Yes. Same for us. even when timeout is 3 seconds and all instances are on the 
same
lan. (using this with hibernate-memcached)
(ver 2.3.1)

Original comment by enricoma...@gmail.com on 21 Oct 2009 at 10:10

GoogleCodeExporter commented 9 years ago
Our issue is ocurring on a pair of memcached nodes running 1.2.8, with version 
2.3.1
of Spy

Original comment by buzman...@gmail.com on 21 Oct 2009 at 10:30

GoogleCodeExporter commented 9 years ago
I'm looking into this issue. Thank you to all for the input.

Original comment by CaptTo...@gmail.com on 21 Oct 2009 at 8:45

GoogleCodeExporter commented 9 years ago
This is the test from above in a compilable file.  Still needs to be edited to 
point
to the memcached server you're using.

I tried it out with a couple different versions of spymemcached in the 
classpath and
all was well.  Hoping CaptTofoo can produce an environment that triggers the 
failure.

Original comment by dsalli...@gmail.com on 21 Oct 2009 at 8:59

Attachments:

GoogleCodeExporter commented 9 years ago

Original comment by CaptTo...@gmail.com on 21 Oct 2009 at 9:08

GoogleCodeExporter commented 9 years ago
it would be better to add multi-thread in your testing program to simulate the 
real case.

Original comment by hank.h...@gmail.com on 21 Oct 2009 at 9:26

GoogleCodeExporter commented 9 years ago
Hi. I am not contributing to the fixing of the issue. But I want to remind 
people that 
memcached is not for storing large chunks of data. Don't forget it is just 
cache, not 
the database. So even though there might be problems with dsallings code, 
memcached 
servers could choke as well. See this FAQ from memcached site: 
http://code.google.com/p/memcached/wiki/WhyNotMemcached. 

Original comment by dmitry.s...@gmail.com on 21 Oct 2009 at 11:03

GoogleCodeExporter commented 9 years ago
it is hard to say if it is caused by large data size or too many current 
thread. For
my case, I create a driver as the second level cache for Hibernate to cache 
objects
into the memcached. each object is not big, but I keep get the timeout errors in
production. It does not affect the site functions though.

Original comment by hank.h...@gmail.com on 22 Oct 2009 at 12:02

GoogleCodeExporter commented 9 years ago
I'm fairly certain this isn't a threading issue.  The threading code is pretty 
clean
(really tiny points of interleaving using well-defined techniques).  I do some
strategically placed multi-threaded tests to as concurrently as possible hit 
some
entry points (just to be safe).

Of course, I'd be happy to have someone point out how I'm wrong.

This bug is in no doubt plaguing lots of people.  I'm hoping to see some 
results with
the useful information people have found here.

Original comment by dsalli...@gmail.com on 22 Oct 2009 at 1:35

GoogleCodeExporter commented 9 years ago
Issue 74 has been merged into this issue.

Original comment by dsalli...@gmail.com on 27 Oct 2009 at 4:23

GoogleCodeExporter commented 9 years ago
Issue 83 has been merged into this issue.

Original comment by dsalli...@gmail.com on 27 Oct 2009 at 4:25

GoogleCodeExporter commented 9 years ago
I've finally been able to reproduce this.  I've pushed a code fix and will be
releasing 2.4.2 with the fix real soon now.

Original comment by dsalli...@gmail.com on 27 Oct 2009 at 5:41

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Hi all,
I'm having the same problem on a dev environment when we perform stress tests.
When will 2.4.2 be released?
Has anybody created a workaround for this issue? like i.e. a wrapper around the
MemcachedClient to throttle the amount of requests sent...

Original comment by matias...@gmail.com on 15 Dec 2009 at 3:02

GoogleCodeExporter commented 9 years ago
That was released on October 27.

Original comment by dsalli...@gmail.com on 15 Dec 2009 at 5:07

GoogleCodeExporter commented 9 years ago
Thanks a lot

Original comment by matias...@gmail.com on 15 Dec 2009 at 5:13

GoogleCodeExporter commented 9 years ago
I'm still getting the same timeouts with 2.4.2. So yeah, now the exception 
tells me
which node timed out. funny thing is that my dev environment is one memcached 
node on
the localhost, obviously that's the one that fails since it's the only one :)
in our production environment it's various nodes, but the timeouts don't have 
any
apparent reason for them.
so.. 2.4.2 didn't fix for us..

Any thoughts?

Original comment by enricoma...@gmail.com on 16 Dec 2009 at 3:44

GoogleCodeExporter commented 9 years ago
I'm also getting the same timeouts with 2.4.2. I've installed memcached 1.4.4 
on my
local windows box and accessing with 2.4.2 client code. Any clue?

Original comment by sharath....@gmail.com on 17 Dec 2009 at 7:06

GoogleCodeExporter commented 9 years ago
i'm not sure whether my Q falls into this. I'm using memcached 1.4.4.on windows/
client 2.4.2.

If i set simple strings, all objects are getting populatd into memcached but if 
i
store object size of 2 kb , it's not storing all the objects (Out of 500, it 
stores
only 200). How can i increase the operation timeout limit during set()? 

Original comment by sharath....@gmail.com on 17 Dec 2009 at 7:17

GoogleCodeExporter commented 9 years ago
If you're getting timeouts, can you attach config and relevant logs?  This is 
very
likely different from what I fixed here.  It's not hard to get timeouts by
misconfiguring something.

The logs may help, but if you're speaking the wrong protocol, for example, it's 
going
to have a hard time figuring that out.

Original comment by dsalli...@gmail.com on 17 Dec 2009 at 7:22

GoogleCodeExporter commented 9 years ago
@sharath.kumar -- no, you're seeing something else.  You should probably take 
your
question to the list with the code you're using for your sets.  I'm guessing 
there's
a bug in your code.

Original comment by dsalli...@gmail.com on 17 Dec 2009 at 7:24

GoogleCodeExporter commented 9 years ago
Thanks for the quick response. Here is my code which try to set 500 simple 
category
objects but i see only around 200 objects stored in memcached.

List serversList = AddrUtil.getAddresses(servers);
MemcachedClient memcachedClient =  new MemcachedClient(new
DefaultConnectionFactory(), serversList);

for(int i = 0; i < 500; i++) {
  String cacheKey = "MEMCACHED_TEST_CLIENT:DEV:spy:" + i;
  // preparting simple object
  Category category = CategoryDataUtil.getInstance().getCategory(1); 
  memcachedClient.set(cacheKey, 0, category);
}

Please let me know if i'm doing something wrong here.Thanks!

Original comment by sharath....@gmail.com on 17 Dec 2009 at 8:05

GoogleCodeExporter commented 9 years ago
@sharath.kumar

You're posting unrelated questions on a closed bug.  Go here: 
http://groups.google.com/group/spymemcached

Original comment by dsalli...@gmail.com on 17 Dec 2009 at 8:25