GiraffaFS / giraffa

Giraffa FileSystem (Slack: giraffa-fs.slack.com)
https://giraffa.ci.cloudbees.com
Apache License 2.0

Fix dangling Connections in client and unit tests #158

Closed pjeli closed 9 years ago

pjeli commented 9 years ago

NamespaceAgent, INodeManager, and GiraffaWebObserver leak HBase Connection objects. The leaked connections accumulate over a unit test run and cause failures.
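To illustrate the kind of leak described here, below is a minimal sketch, not the Giraffa code. `TrackedConnection` and `ConnectionLeakDemo` are hypothetical stand-ins invented for this example; the only assumption carried over from HBase is that a Connection is closeable and must be closed explicitly by whoever opened it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an HBase Connection; counts how many are open.
class TrackedConnection implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger();
    private boolean closed = false;

    TrackedConnection() {
        OPEN.incrementAndGet();
    }

    @Override
    public void close() {
        // Idempotent: only the first close() decrements the counter.
        if (!closed) {
            closed = true;
            OPEN.decrementAndGet();
        }
    }
}

class ConnectionLeakDemo {
    // Leaky: the connection is opened and never closed.
    static void leakyOperation() {
        TrackedConnection conn = new TrackedConnection();
        // ... use conn ...
    }

    // Fixed: try-with-resources closes the connection, even on exceptions.
    static void safeOperation() {
        try (TrackedConnection conn = new TrackedConnection()) {
            // ... use conn ...
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            leakyOperation();
        }
        System.out.println("open after leaky runs: " + TrackedConnection.OPEN.get());
        for (int i = 0; i < 3; i++) {
            safeOperation();
        }
        System.out.println("open after safe runs: " + TrackedConnection.OPEN.get());
    }
}
```

A counter like `OPEN` can also be checked at the end of each test to catch a leak in the test that caused it, rather than several tests later.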

shvachko commented 9 years ago

This fixes some of the exceptions, but I still see many ConnectExceptions in the following context:

15/05/27 12:14:58 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:60213. Will not attempt to authenticate using SASL (unknown error)
15/05/27 12:14:58 WARN zookeeper.ClientCnxn: Session 0x14d96c950fa000d for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)

Session 0x14d96c950fa000d was opened several unit tests earlier but never closed.

pjeli commented 9 years ago

Try now. I added the ZkCluster to HBaseTestingUtil on cluster creation. I think there is a bug in HBaseTestingUtil.startMiniCluster: it does not register the MiniZKCluster with the MiniHBaseCluster, and therefore does not shut it down on cluster.shutdown().

We can register it manually though.

shvachko commented 9 years ago

The latest revision worked for me. I do not see ConnectException anymore. Let's think about how to wrap it in a single method, though.

pjeli commented 9 years ago

The reason the last one worked for you is that the ZK servers from old runs were never killed, so the clients still had something to connect to. Basically, I had the wrong fix.

I've since found the problem (as we discussed offline). The last problem was in NamespaceAgent.format(), which had a dangling Connection that was never closed.

Check the latest pull request. All tests passed on my machine from both the CLI and IntelliJ.

shvachko commented 9 years ago

Plamen, good job getting to the bottom of it! Looks like you nailed all the connections now. One thing I didn't like is that you made INodeManager manage the Connection for tests. So I changed it to one of your previous variants, in which tests create and close their own connections. That way INodeManager needs no changes. I also think we don't need the CoprocessorEnvironment in INodeManager, but that is a subject for another issue. Could you please run the tests and check whether that works for you?
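The ownership rule suggested here (the test opens and closes the connection, the manager only borrows it) could look roughly like the sketch below. All names are hypothetical stand-ins: `FakeConnection` plays the role of an HBase Connection and `BorrowingManager` the role of INodeManager. None of this is taken from the actual commit.

```java
// Hypothetical stand-in for an HBase Connection.
class FakeConnection implements AutoCloseable {
    private boolean closed = false;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical stand-in for INodeManager: borrows a connection, never closes it.
class BorrowingManager {
    private final FakeConnection connection;

    BorrowingManager(FakeConnection connection) {
        this.connection = connection;
    }

    String lookup(String path) {
        if (connection.isClosed()) {
            throw new IllegalStateException("connection closed");
        }
        return "inode:" + path;
    }
}

class OwnershipDemo {
    // Mirrors a test's setUp, body, and tearDown in a single method.
    static boolean runTest() {
        FakeConnection conn = new FakeConnection();   // setUp: the test opens it
        try {
            BorrowingManager manager = new BorrowingManager(conn);
            manager.lookup("/a");                     // test body
        } finally {
            conn.close();                             // tearDown: the test closes it
        }
        return conn.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("closed after test: " + runTest());
    }
}
```

Keeping the close in the same scope as the open means the manager needs no test-only lifecycle code, which matches the stated goal of leaving INodeManager unchanged.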

pjeli commented 9 years ago

Thanks Konst. All's working now. Committed: 39f6f166f4ef169890280e95b890080bbe703542.