wurstmeister / storm-docker

Dockerfiles for building a storm cluster.
Apache License 2.0

Intermittent connection refused errors #8

Open toboid opened 9 years ago

toboid commented 9 years ago

I have started getting the following error when running StormDocker. It was working fine, but now this error shows up in the Storm UI. The page loads intermittently if I refresh it a couple of times.

I'm seeing similar errors in the supervisor log file.

I've tried re-pulling all of the images down and restarting the docker service but this issue keeps coming and going whatever I do.

I'm running on Ubuntu 12.04 64-bit. Any ideas?

org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection refused
    at org.apache.thrift7.transport.TSocket.open(TSocket.java:183)
    at org.apache.thrift7.transport.TFramedTransport.open(TFramedTransport.java:81)
    at backtype.storm.thrift$nimbus_client_and_conn.invoke(thrift.clj:75)
    at backtype.storm.ui.core$cluster_summary.invoke(core.clj:455)
    at backtype.storm.ui.core$fn8223.invoke(core.clj:789)
    at compojure.core$make_route$fn__3365.invoke(core.clj:93)
    at compojure.core$if_route$fn3353.invoke(core.clj:39)
    at compojure.core$if_method$fn3346.invoke(core.clj:24)
    at compojure.core$routing$fn3371.invoke(core.clj:106)
    at clojure.core$some.invoke(core.clj:2443)
    at compojure.core$routing.doInvoke(core.clj:106)
    at clojure.lang.RestFn.applyTo(RestFn.java:139)
    at clojure.core$apply.invoke(core.clj:619)
    at compojure.core$routes$fn3375.invoke(core.clj:111)
    at ring.middleware.reload$wrap_reload$fn__7540.invoke(reload.clj:14)
    at backtype.storm.ui.core$catch_errors$fn8268.invoke(core.clj:858)
    at ring.middleware.keyword_params$wrap_keyword_params$fn4029.invoke(keyword_params.clj:27)
    at ring.middleware.nested_params$wrap_nested_params$fn4068.invoke(nested_params.clj:65)
    at ring.middleware.params$wrap_params$fn4001.invoke(params.clj:55)
    at ring.middleware.multipart_params$wrap_multipart_params$fn4096.invoke(multipart_params.clj:103)
    at ring.middleware.flash$wrap_flash$fn4277.invoke(flash.clj:14)
    at ring.middleware.session$wrap_session$fn4266.invoke(session.clj:43)
    at ring.middleware.cookies$wrap_cookies$fn__4197.invoke(cookies.clj:160)
    at ring.adapter.jetty$proxy_handler$fn__7179.invoke(jetty.clj:16)
    at ring.adapter.jetty.proxy$org.mortbay.jetty.handler.AbstractHandler$0.handle(Unknown Source)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385)
    at java.net.Socket.connect(Socket.java:546)
    at org.apache.thrift7.transport.TSocket.open(TSocket.java:178)

toboid commented 9 years ago

Here is what I get in supervisor.log:

2014-11-30 15:55:01 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2014-11-30 15:55:01 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString= sessionTimeout=20000 watcher=org.apache.curator.ConnectionState@
2014-11-30 15:55:01 o.a.z.ClientCnxn [INFO] Opening socket connection to server stormdocker_zookeeper_1/. Will not attempt to authenticate using SASL (unknown error)
2014-11-30 15:55:01 o.a.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.6.0_32]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:601) ~[na:1.6.0_32]
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) ~[zookeeper-3.4.5.jar:3.4.5-1392090]
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068) ~[zookeeper-3.4.5.jar:3.4.5-1392090]
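
For anyone trying to narrow this down: a "Connection refused" like the one above can be reproduced outside Storm with a plain TCP probe run from inside the affected container. The following is only a rough sketch, not part of storm-docker. The function name `can_connect` is made up for illustration, and the ports mentioned afterwards are the usual defaults, which may not match a given setup.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only; not part of Storm or storm-docker.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

Calling something like `can_connect("stormdocker_zookeeper_1", 2181)` from the supervisor container (assuming ZooKeeper listens on its default client port 2181) or probing the nimbus host on its Thrift port (6627 by default) should show whether the refusal comes from the network level rather than from Storm itself.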

igorfrenkel commented 9 years ago

+1. I'm also seeing these. Intermittently.

toboid commented 9 years ago

Restarting the box seems to clear it for me.

sudeepd commented 9 years ago

The IP of nimbus (and of the other containers) changes when you restart the containers. The UI's storm.yaml still points to the old nimbus IP.
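
A quick way to confirm a stale address is to compare the `nimbus.host` value in the UI container's storm.yaml with the IP the nimbus container currently has. Below is a minimal sketch of that comparison. It assumes the Storm 0.9.x style `nimbus.host` key written on a single line, and both function names are hypothetical. It uses a regular expression instead of a YAML parser to stay dependency-free, so it will not handle every valid YAML layout.

```python
import re


def read_nimbus_host(yaml_text):
    """Return the value of a top-level `nimbus.host` key, or None if absent.

    Illustrative helper; assumes the key and value sit on one line.
    """
    match = re.search(
        r"""^nimbus\.host:\s*["']?([^"'\s#]+)""", yaml_text, re.MULTILINE
    )
    return match.group(1) if match else None


def nimbus_host_is_stale(yaml_text, current_ip):
    """True when storm.yaml names a nimbus address different from current_ip."""
    configured = read_nimbus_host(yaml_text)
    return configured is not None and configured != current_ip
```

The current IP would have to come from Docker itself, for example from `docker inspect` on the nimbus container. If the two values differ, the UI is still dialing the address nimbus had before the restart.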

lq08025107 commented 8 years ago

Got the same error, not intermittently but always.

lq08025107 commented 8 years ago

In my case, the ZooKeeper address in nimbus's storm.yaml is 172.17.0.1, which is wrong: that is the address of docker0. I don't know why.
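
To check for this case, the entries under `storm.zookeeper.servers` can be listed and compared against the bridge address. This is only a sketch under a few assumptions: the servers are written as a block list (one `- host` per line), the bridge address defaults to the 172.17.0.1 reported above, and both function names are made up for illustration.

```python
def read_zookeeper_servers(yaml_text):
    """Collect block-list items under a `storm.zookeeper.servers` key.

    Illustrative helper; inline lists such as `[a, b]` are not handled.
    """
    servers = []
    in_block = False
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("storm.zookeeper.servers:"):
            in_block = True
            continue
        if in_block:
            if stripped.startswith("- "):
                # Drop the list marker and any surrounding quotes.
                servers.append(stripped[2:].strip().strip("\"'"))
            elif stripped and not stripped.startswith("#"):
                break  # the next key ends the list
    return servers


def servers_on_bridge_address(yaml_text, bridge_ip="172.17.0.1"):
    """Return configured ZooKeeper servers that equal the bridge address."""
    return [s for s in read_zookeeper_servers(yaml_text) if s == bridge_ip]
```

A non-empty result means storm.yaml points at the docker0 bridge rather than at the ZooKeeper container.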