magro / memcached-session-manager

A Tomcat session manager that backs up sessions in memcached and pulls them from there if asked for unknown sessions
Apache License 2.0

DNS lookup of memcachedNodes #302

Open gilesw opened 8 years ago

gilesw commented 8 years ago

We have an AWS setup with:-

elasticache.our-domain CNAME

and memcachedNodes="elasticache.our-domain"

We've got a low TTL on the CNAME records, and the JVM (Java 8) is running with -Dsun.net.inetaddr.ttl=60. However, if we rebuild the elasticache with a new CNAME, the change isn't picked up, even though the box itself resolves the new address.

I've sniffed DNS traffic and not seen anything come through. However I do see a lookup on startup.

magro commented 8 years ago

I think part of the issue is that we're creating the InetSocketAddress via new InetSocketAddress( hostname, port ) here, which performs the name lookup on creation (via InetAddress.getByName(hostname)), so the address is never re-resolved afterwards. This resolved InetSocketAddress is then passed to the memcached client, which uses it internally (here's a related spymemcached issue). Because of the latter, we'd probably have to check regularly whether the name resolves to a different result and, if it does, replace the memcached client. Perhaps looking deeper into the spymemcached code shows other possibilities.
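To illustrate the eager lookup described above, here is a minimal sketch using only the JDK. The class and method names (EagerResolutionDemo, resolvedNow, deferred) are made up for this example and are not part of memcached-session-manager or spymemcached. It contrasts the constructor, which resolves immediately, with InetSocketAddress.createUnresolved, which stores only the hostname.

```java
import java.net.InetSocketAddress;

// Hypothetical demo class, not taken from any of the libraries discussed here.
class EagerResolutionDemo {

    // The constructor attempts the DNS lookup right away and keeps the result
    // for the lifetime of the object.
    static InetSocketAddress resolvedNow(String host, int port) {
        return new InetSocketAddress(host, port);
    }

    // createUnresolved stores only the hostname; no lookup is attempted.
    static InetSocketAddress deferred(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        InetSocketAddress eager = resolvedNow("localhost", 11211);
        InetSocketAddress lazy = deferred("localhost", 11211);

        // Whether "localhost" resolves depends on the machine's resolver setup.
        System.out.println("eager unresolved? " + eager.isUnresolved());
        // An unresolved address has no InetAddress attached.
        System.out.println("lazy unresolved?  " + lazy.isUnresolved());
        System.out.println("lazy address:     " + lazy.getAddress());
    }
}
```

Once the resolved instance has been handed to a client, a later DNS change can only be seen by creating a new InetSocketAddress, regardless of the JVM's DNS cache TTL.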

magro commented 8 years ago

Btw, which memcached client are you using, spymemcached, couchbase or elasticache?

gilesw commented 8 years ago

Heya Magro,

We are using these jars:-

AmazonElastiCacheClusterClient-1.1.0.jar
memcached-session-manager-1.9.1.jar
memcached-session-manager-tc8-1.9.1.jar

and this config.

<?xml version="1.0" encoding="UTF-8"?>
<Context>
  <WatchedResource>WEB-INF/web.xml</WatchedResource>
  <Manager className="de.javakaffee.web.msm.MemcachedBackupSessionManager"
     storageKeyPrefix="static:${aws.stackName}_XXX"
     memcachedNodes="cache.ourdomain.com"
     memcachedProtocol="binary"
     sticky="false"
     sessionBackupAsync="false"
     sessionBackupTimeout="300"
     requestUriIgnorePattern=".*\.(gif|jpg|jpeg|png|wmv|avi|mpg|mpeg|mp4|htm|html|js|css|mp3|swf|ico|flv)$"
  />
</Context>

When running successfully on a 2-node cluster, we get these log messages showing the endpoint switching between the two.

2016-06-15 04:57:32.516 INFO net.spy.memcached.ConfigurationPoller:  Starting configuration poller.
2016-06-15 04:57:32.516 INFO net.spy.memcached.ConfigurationPoller:  Endpoint to use for configuration access in this poll NodeEndPoint - HostName:xxxxxx.0001.usw1.cache.amazonaws.com IpAddress:10.x.x.88 Port:11211
2016-06-15 04:58:32.516 INFO net.spy.memcached.ConfigurationPoller:  Starting configuration poller.
2016-06-15 04:58:32.516 INFO net.spy.memcached.ConfigurationPoller:  Endpoint to use for configuration access in this poll NodeEndPoint - HostName:xxxx.0002.usw1.cache.amazonaws.com IpAddress:10.x.x.89 Port:11211

So it looks like we are using the auto-discovery mechanism.

gilesw commented 8 years ago

Interestingly, digging around the auto-discovery mechanism, there's code to recreate the final discovered sockets, as they do change when cluster sizes increase:-

https://github.com/awslabs/aws-elasticache-cluster-client-memcached-for-java/blob/master/src/main/java/net/spy/memcached/ConfigurationPoller.java

      if(endpoints.isEmpty()){
        //If no nodes are available status, then get all the endpoints. This provides an 
        //oppurtunity to re-resolve the hostname by recreating InetSocketAddress instance in "NodeEndPoint".getInetSocketAddress().
        endpoints = client.getAllNodeEndPoints();
      }

but looking through the code samples you posted, if that memcachedNodes socket is never re-tested, it will always be the same ip:port combo. I'm surprised this isn't a problem for more people. I'm not a Java dev myself, so there's not much I can help with. Where does this responsibility lie?
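One way the periodic check suggested earlier in the thread could look is sketched below. This is only an illustration under stated assumptions: DnsChangeWatcher, checkOnce and resolveWithJvm are hypothetical names, and the onChange callback stands in for whatever would rebuild the memcached client. None of this is existing memcached-session-manager code.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical helper: remembers the last resolution result for a hostname
// and runs a callback when a later lookup returns a different set of IPs.
class DnsChangeWatcher {
    private final String hostname;
    private final Function<String, Set<String>> resolver;
    private final Runnable onChange;
    private Set<String> lastSeen;

    DnsChangeWatcher(String hostname,
                     Function<String, Set<String>> resolver,
                     Runnable onChange) {
        this.hostname = hostname;
        this.resolver = resolver;
        this.onChange = onChange;
        this.lastSeen = resolver.apply(hostname);
    }

    // Resolver backed by the JVM; results are subject to the JVM's DNS cache TTL.
    static Set<String> resolveWithJvm(String host) {
        Set<String> result = new TreeSet<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                result.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // A failed lookup is reported as an empty set and ignored by checkOnce.
        }
        return result;
    }

    // Returns true (and runs the callback) only when the resolved IPs changed.
    synchronized boolean checkOnce() {
        Set<String> current = resolver.apply(hostname);
        if (current.isEmpty() || current.equals(lastSeen)) {
            return false;
        }
        lastSeen = current;
        onChange.run();
        return true;
    }
}
```

Such a watcher could be driven by a ScheduledExecutorService calling checkOnce at a fixed rate, with DnsChangeWatcher::resolveWithJvm as the resolver. Polling more often than the JVM's DNS cache TTL would only return cached results.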

gilesw commented 8 years ago

Heya Magro,

What do you think about using something like HAProxy as a local proxy between Tomcat and the memcached cluster?

magro commented 8 years ago

@gilesw Not sure if HAProxy would work. I just checked whether one could override InetSocketAddress to perform transparent re-resolution whenever its InetAddress is accessed (it is read by SocketChannel.connect, which is invoked by MemcachedConnection.attemptReconnects). Unfortunately this is not possible because everything is final, and even the equals method uses the internal InetSocketAddressHolder, which would have to be patched as well. Looking further at MemcachedConnection.attemptReconnects, in one case it uses node.getNodeEndPoint().getInetSocketAddress(true); to determine the InetSocketAddress, where true means re-resolve. So this client seems to support DNS re-resolution already - not sure why it's not working. AFAICS this is not the case in stock spymemcached (it would have to be added there).
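The re-resolve-on-reconnect idea can be sketched as follows. This is not the ElastiCache client's actual NodeEndPoint code: ReResolvingEndpoint is a hypothetical class that only mirrors the getInetSocketAddress(boolean) shape mentioned above. Because InetSocketAddress cannot be subclassed, the holder keeps the hostname and builds a fresh instance on request.

```java
import java.net.InetSocketAddress;

// Hypothetical endpoint holder; it keeps hostname and port so that a new
// InetSocketAddress (and thus a new DNS lookup) can be created on demand.
class ReResolvingEndpoint {
    private final String hostName;
    private final int port;
    private InetSocketAddress cached;

    ReResolvingEndpoint(String hostName, int port) {
        this.hostName = hostName;
        this.port = port;
        this.cached = new InetSocketAddress(hostName, port);
    }

    // With reresolve == true (or if the last lookup failed), the address is
    // recreated, which triggers a new name lookup through the JVM resolver.
    synchronized InetSocketAddress getInetSocketAddress(boolean reresolve) {
        if (reresolve || cached.isUnresolved()) {
            cached = new InetSocketAddress(hostName, port);
        }
        return cached;
    }
}
```

A reconnect loop would then call getInetSocketAddress(true) before each new connection attempt, so a changed CNAME is picked up as soon as the JVM's DNS cache entry expires.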

soualid commented 7 years ago

Same issue here, will try to dig into the code and submit a PR if possible.