bigdata4u / spymemcached

Automatically exported from code.google.com/p/spymemcached

ConcurrentModificationException when using the getStats(String args) method #102

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Using spymemcached 2.3.1 on Linux with memcached 1.2.8

When using the method 

public Map<SocketAddress, Map<String, String>> getStats(final String arg)

of the MemcachedClient class I sometimes run into a
java.util.ConcurrentModificationException when trying to iterate over the
sub-maps.

I.e. I do something like this:

      final Map< SocketAddress , Map< String , String >> result =
          new HashMap< SocketAddress , Map< String , String > >();
      for ( final Map.Entry< SocketAddress , Map< String , String >> e :
          memcachedclient.getStats( MessageFormat.format( "cachedump %d %d\r\n" ,
              slabNumber , size ).entrySet() ) {
        result.put( e.getKey() , e.getValue() ); // Here it comes *bang*
      }

After looking at the code and some googling, I tried to fix this by wrapping
the inner maps in a ConcurrentHashMap, but that did not fix the problem.

I tried upgrading to 2.4.2 but the problem still persists on that version.

IMHO the problem is that the map is somehow modified _after_ the result
is returned. I haven't dug into your code, though; I only use it, and it
works very well apart from the issue above. :-)
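For readers unfamiliar with the exception, here is a minimal single-threaded sketch of the fail-fast behaviour of HashMap iterators. It does not use spymemcached at all; the class name FailFastDemo, the method name, and the "late-" keys are made up for illustration, and the put inside the loop only stands in for whatever might modify the map after it has been returned.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not part of spymemcached.
final class FailFastDemo {

    /** Returns true if adding keys during iteration raised a ConcurrentModificationException. */
    static boolean iterationFailsWhenMapGrows() {
        Map<String, String> stats = new HashMap<>();
        stats.put("pid", "6116");
        stats.put("uptime", "10702239");
        stats.put("threads", "5");

        try {
            for (Map.Entry<String, String> e : stats.entrySet()) {
                // Stand-in for a late write: adding a new key is a structural
                // modification, which the iterator detects on its next step.
                stats.put("late-" + e.getKey(), "value");
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + iterationFailsWhenMapGrows()); // prints "CME thrown: true"
    }
}
```

The same check fires when the modification comes from another thread, although in that case it is only a best-effort detection and may not trigger on every run.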

Original issue reported on code.google.com by daniel.h...@gmail.com on 3 Nov 2009 at 4:10

GoogleCodeExporter commented 8 years ago
I think you introduced some bugs when extracting code from your example.  I
modified it slightly (compilation problem, using MessageFormat as printf and
got rid of the incorrect \r\n) and it worked fine.

MemcachedClient client=new MemcachedClient(
    new ConnectionFactoryBuilder()
        .setOpTimeout(15000).build(), AddrUtil
        .getAddresses("127.0.0.1:11211"));

int slabNumber=1;
int size=10000;

final Map<SocketAddress, Map<String, String>> result =
    new HashMap<SocketAddress, Map<String, String>>();
for(final Map.Entry<SocketAddress, Map<String, String>> e : client
        .getStats(
                  String.format("cachedump %d %d", slabNumber,
                                size)).entrySet()) {
    result.put(e.getKey(), e.getValue());
}

Original comment by dsalli...@gmail.com on 3 Nov 2009 at 6:43

GoogleCodeExporter commented 8 years ago
Dustin,

thanks for your reply, I will check that right now.

Regards, Daniel

Original comment by daniel.h...@gmail.com on 4 Nov 2009 at 9:12

GoogleCodeExporter commented 8 years ago
I changed the source code (yes you were right, I changed String.format to
MessageFormat.format) and removed the "\r\n", but I still got a
ConcurrentModificationException.

What's interesting is that the ConcurrentModificationException is not thrown
when I use a different memcached that has been set up for unit testing
purposes.

To provide some more information on the problem, I checked the stats using
telnet. The output on the test memcached is:

stats
STAT pid 2283
STAT uptime 5012317
STAT time 1257326992
STAT version 1.2.8
STAT pointer_size 32
STAT rusage_user 0.208013
STAT rusage_system 0.616038
STAT curr_items 27
STAT total_items 80297
STAT bytes 23247
STAT curr_connections 34
STAT total_connections 1521
STAT connection_structures 77
STAT cmd_flush 689
STAT cmd_get 114120
STAT cmd_set 80313
STAT get_hits 108432
STAT get_misses 5688
STAT evictions 0
STAT bytes_read 18917757
STAT bytes_written 2431540314
STAT limit_maxbytes 104857600
STAT threads 17
STAT accepting_conns 1
STAT listen_disabled_num 0
END

The stats output on the other memcached where the exception occurs is:

stats
STAT pid 6116
STAT uptime 10702239
STAT time 1257327020
STAT version 1.2.8
STAT pointer_size 32
STAT rusage_user 1325.510491
STAT rusage_system 3474.411809
STAT curr_items 1277338
STAT total_items 62881699
STAT bytes 749187030
STAT curr_connections 10
STAT total_connections 84072
STAT connection_structures 125
STAT cmd_flush 106535
STAT cmd_get 92515662
STAT cmd_set 63039321
STAT get_hits 64284880
STAT get_misses 28230782
STAT evictions 0
STAT bytes_read 61552848579
STAT bytes_written 969344760417
STAT limit_maxbytes 1048576000
STAT threads 5
STAT accepting_conns 1
STAT listen_disabled_num 0
END

I noticed that the uptime was much higher, and many other values, including
rusage_* and bytes, were also much higher. I restarted memcached; after the
restart the stats output was:

stats
STAT pid 13607
STAT uptime 14
STAT time 1257327367
STAT version 1.2.8
STAT pointer_size 32
STAT rusage_user 0.000000
STAT rusage_system 0.001999
STAT curr_items 0
STAT total_items 0
STAT bytes 0
STAT curr_connections 10
STAT total_connections 11
STAT connection_structures 11
STAT cmd_flush 0
STAT cmd_get 0
STAT cmd_set 0
STAT get_hits 0
STAT get_misses 0
STAT evictions 0
STAT bytes_read 7
STAT bytes_written 0
STAT limit_maxbytes 1048576000
STAT threads 5
STAT accepting_conns 1
STAT listen_disabled_num 0
END

After the restart of memcached the ConcurrentModificationException was no
longer thrown.

I believe the ConcurrentModificationException is related to a heavily
populated memcached with many items, where retrieving the stats cachedump
takes so long that the method still modifies the Map after the result has
been returned.

I do not know how to reproduce this reliably, but I hope you will find the
cause :-)

Thanks in advance, Daniel

Original comment by daniel.h...@gmail.com on 4 Nov 2009 at 10:05

GoogleCodeExporter commented 8 years ago
Reopening for further investigation.

Original comment by dsalli...@gmail.com on 9 Nov 2009 at 5:00

GoogleCodeExporter commented 8 years ago
I have a slab with a bit over 10,000 keys in it and cannot reproduce this
issue.  In order to solve this problem, I'm going to need an exact failing
case.

From reading the code and trying the things you're describing, I can't make a
failure occur.

Original comment by dsalli...@gmail.com on 9 Nov 2009 at 6:57

GoogleCodeExporter commented 8 years ago
I am seeing this as well. I think this would happen if a stat response comes
back after the operation timeout, and the stat map is returned to and used by
the caller of getStats at the same time as StatsOperation.Callback.gotStat
gets called.
Most likely, the map that gets returned, as well as the per-address maps in
it, need to be made into ConcurrentHashMaps.

Original comment by Paul.Bur...@gmail.com on 2 Nov 2010 at 6:59
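The suggestion above can be illustrated with a small standalone sketch. It is not spymemcached code: the class LateStatsDemo, the writer thread, and the "ITEM key" entries are hypothetical, with the thread standing in for late gotStat callbacks that arrive while the caller is already iterating. It only shows that iterating a ConcurrentHashMap under concurrent writes does not throw, because its iterators are weakly consistent; whether this is the right fix for the library is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; not part of spymemcached.
final class LateStatsDemo {

    /** Iterates a ConcurrentHashMap while another thread adds entries; returns the final size. */
    static int iterateWhileWriting() {
        Map<String, String> stats = new ConcurrentHashMap<>();
        stats.put("pid", "6116");

        CountDownLatch started = new CountDownLatch(1);
        // Stand-in for stat lines that keep arriving after the map was handed out.
        Thread lateWriter = new Thread(() -> {
            started.countDown();
            for (int i = 0; i < 10_000; i++) {
                stats.put("ITEM key" + i, "value" + i);
            }
        });
        lateWriter.start();

        try {
            started.await();
            do {
                // Weakly consistent iteration: may or may not see entries added
                // meanwhile, but never throws ConcurrentModificationException.
                for (Map.Entry<String, String> e : stats.entrySet()) {
                    e.getValue();
                }
            } while (lateWriter.isAlive());
            lateWriter.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return stats.size();
    }

    public static void main(String[] args) {
        System.out.println(iterateWhileWriting()); // prints 10001
    }
}
```

Note that a concurrent map only removes the exception. A caller that reads the map right after a timeout can still see a partial result, so the timeout itself would still need to be surfaced or handled.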