leeoo / spymemcached

Automatically exported from code.google.com/p/spymemcached

Very big performance issue with large numbers of asyncGets #316

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What version of the product are you using? On what operating system?
spymemcached 2.10.3, running on a cluster of 8 machines, each with 16 
two-thread cores and 130 GB of RAM, running Fedora.

Tell me more...

Hi,

Regarding this issue: 
http://code.google.com/p/spymemcached/issues/detail?id=125 , I ran the 
following code with Apache Spark on the cluster described above, using 64 
threads (64 parallel workers). The code is in Scala, but the syntax is mostly 
the same as in Java:

    import java.util.concurrent.{Future, TimeUnit}
    import scala.collection.mutable.HashMap
    import net.spy.memcached.{AddrUtil, BinaryConnectionFactory, MemcachedClient}

    def run() = {
      // Random offset used to build the map keys
      val x = Math.random()
      val memcache = new MemcachedClient(
        new BinaryConnectionFactory(),
        AddrUtil.getAddresses("host1:11211 host2:11211 host3:11211 host4:11211 host5:11211 host6:11211 host7:11211 host8:11211"))

      // Phase 1: queue all asyncGets up front and keep every future
      val time1 = System.nanoTime()
      val getRequests = new HashMap[String, Future[Object]]
      for (i <- 1 until 1000000) {
        getRequests += (((x + i).toString, memcache.asyncGet(Integer.toString(i))))
      }
      val time2 = System.nanoTime()
      System.out.println("Time to add data = " + ((time2 - time1) / 1000000.0) + "ms")

      // Phase 2: wait for each future in turn, cancelling it on any failure
      for (i <- 1 until 1000000) {
        val f = getRequests.get((x + i).toString).get
        var v: Object = null
        try {
          v = f.get(3600, TimeUnit.SECONDS)
        } catch {
          case e: Exception => {
            f.cancel(true)
            System.err.println("MEMCACHED ERROR GET BULK!!! size requests = " + getRequests.size + " " + e.getMessage())
          }
        }
      }
      //memcache.waitForQueues(86400, TimeUnit.SECONDS);
      val time3 = System.nanoTime()
      System.out.println("Time for queues to drain = " + ((time3 - time2) / 1000000.0) + "ms")
    }

    val sc = new SparkContext(conf)

    sc.parallelize((1 to 100000).toList, 64).mapPartitions(f => {run(); Iterator(1)} ).reduce(_ + _)

The result is that most (if not all) of the get requests fail with the 
following error:

    MEMCACHED ERROR GET BULK!!! size requests = 999999 
    net.spy.memcached.internal.CheckedOperationTimeoutException: Operation timed out

The code above works fine if it is run on a single machine. 
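
One way to narrow this down might be to bound the number of requests that are outstanding at any time, instead of queuing all of them before waiting on the first one. The following is only a minimal sketch of that idea and has not been run on the cluster. `getWindowed` and `fakeAsyncGet` are hypothetical names, not part of spymemcached, and `fakeAsyncGet` is a stand-in that returns an already completed future so the sketch runs without a server. In the real test, `memcache.asyncGet` would be passed as `fetch`. Whether a smaller window avoids the timeouts is an open question.

```scala
import java.util.concurrent.{CompletableFuture, Future, TimeUnit}
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper: calls `fetch` for the keys in batches of `window`,
// and waits for a whole batch to complete before issuing the next one,
// so that at most `window` requests are pending at any time.
def getWindowed[K, V](keys: Iterator[K], window: Int, timeoutSec: Long)
                     (fetch: K => Future[V]): ArrayBuffer[(K, V)] = {
  val results = ArrayBuffer.empty[(K, V)]
  for (batch <- keys.grouped(window)) {
    // Issue one batch of asynchronous requests.
    val pending = batch.map(k => (k, fetch(k)))
    // Drain the batch before sending any more requests.
    for ((k, f) <- pending) {
      results += ((k, f.get(timeoutSec, TimeUnit.SECONDS)))
    }
  }
  results
}

// Stand-in for memcache.asyncGet (assumption: used only to make the sketch runnable).
val fakeAsyncGet: String => Future[Object] =
  k => CompletableFuture.completedFuture[Object]("value-" + k)

val out = getWindowed((1 until 10).iterator.map(_.toString), window = 4, timeoutSec = 5)(fakeAsyncGet)
```

With the real client, the call would look like `getWindowed(keys, 10000, 3600)(k => memcache.asyncGet(k))`, where the window size of 10000 is an arbitrary starting point to experiment with.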

Can someone tell me if this is a bug?

Many thanks,

Octavian

Original issue reported on code.google.com by oooctav...@gmail.com on 22 Jan 2015 at 12:06