sleyzerzon / spymemcached

Automatically exported from code.google.com/p/spymemcached

Immediate reconnection failure on localhost prevents further reconnection attempts #31

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
I'm using version 2.1.0 of the client on Solaris 10 x86, JVM 1.6.0_02. I have a test harness for reconnection behavior in my app; it starts memcached, starts a thread to issue dummy gets and sets, waits a little while, kills memcached, waits one second, starts another memcached on the same port, and checks whether MemcachedClient#getUnavailableServers is empty (there is a single server per MemcachedClient instance). It's not working, even when I run continuous traffic against the client for 60 seconds, 58 or so of which are after restarting memcached. It always shows the server as being unavailable, no matter how long I wait.

From turning on the debug logs, I see one reconnection attempt after 100 ms; this fails in MemcachedConnection#attemptReconnects, line 376, at the SocketChannel#connect call, with java.net.ConnectException: Connection refused. Not surprising, since the new memcached isn't started yet. However, it doesn't seem to queue a new reconnection attempt in response to this error: subsequent log messages show "Selecting with delay of 0ms", and I don't see any further reconnection attempts.

Stack trace:

java.net.ConnectException: Connection refused
at sun.nio.ch.Net.connect(Native Method)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:464)
at net.spy.memcached.MemcachedConnection.attemptReconnects(MemcachedConnection.java:376)
at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:188)
at net.spy.memcached.MemcachedClient.run(MemcachedClient.java:1259)

I've attached the log4j log, too.

Looking through the attemptReconnects method, it's clearly an async call to SocketChannel#connect, which, as the Javadocs point out, can return immediately: "If the connection is established immediately, as can happen with a local connection, then this method returns true". Indeed, attemptReconnects handles the success case, but there is no exception handler at this point to call queueReconnect, as there is in handleIO(SelectionKey). It looks like one needs to be added. I am running memcached on the same machine and connecting to 127.0.0.1, so it makes sense that it would be taking the 'local connection' path.
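To illustrate the kind of handler I mean, here is a minimal sketch, not the actual spymemcached code. The names ReconnectSketch, Connector and reconnectQueue are hypothetical stand-ins I made up for the example; in the real client the connector would be the non-blocking SocketChannel#connect call, and the re-queueing would presumably go through queueReconnect. The point is only that a connect which throws right away puts the address back on the queue instead of dropping it.

```java
import java.io.IOException;
import java.net.SocketAddress;
import java.util.ArrayDeque;
import java.util.Queue;

class ReconnectSketch {
    /** Hypothetical abstraction over a non-blocking SocketChannel#connect call. */
    interface Connector {
        /** Returns true if connected immediately, false if the connect is still pending. */
        boolean connect(SocketAddress addr) throws IOException;
    }

    /** Hypothetical stand-in for the client's queue of servers awaiting reconnection. */
    final Queue<SocketAddress> reconnectQueue = new ArrayDeque<>();

    /**
     * Makes one pass over the addresses queued at the time of the call.
     * Any address whose connect throws is re-queued for a later attempt.
     * Returns the number of addresses that were re-queued.
     */
    int attemptReconnects(Connector connector) {
        int requeued = 0;
        int pending = reconnectQueue.size();
        for (int i = 0; i < pending; i++) {
            SocketAddress addr = reconnectQueue.poll();
            try {
                // true: connected at once (possible on loopback);
                // false: a selector would report OP_CONNECT later.
                connector.connect(addr);
            } catch (IOException e) {
                // Immediate failure, e.g. "Connection refused":
                // schedule another attempt instead of losing the server.
                reconnectQueue.add(addr);
                requeued++;
            }
        }
        return requeued;
    }
}
```

With a handler like this, the refused attempt at 100 ms would leave the server in the queue, so a later pass could pick it up once the new memcached is listening. A real fix would also want a delay between attempts rather than retrying on every pass.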

I'll be happy to work up a patch, but it might take me a little while to get this Buildr thing (that is how it's built, isn't it?) going in my environment.

Original issue reported on code.google.com by cswhee...@gmail.com on 5 Jun 2008 at 1:49

GoogleCodeExporter commented 8 years ago

Original comment by dsalli...@gmail.com on 2 Oct 2008 at 7:08

GoogleCodeExporter commented 8 years ago
I believe I've fixed this, but it's a bit racy. Let me know if you see this again in 2.2.

Original comment by dsalli...@gmail.com on 3 Oct 2008 at 4:41