nirmata / workflow

A ZooKeeper and Curator based distributed workflow management library that enables distributed task workflows.
http://nirmata.github.io/workflow
Apache License 2.0

Workflow client and/or Curator client lose connection to ZooKeeper #3

Closed dtoledo67 closed 9 years ago

dtoledo67 commented 9 years ago

The exceptions seen on the ZooKeeper side are:

error 5880416
2014-12-17 01:49:00,688 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.10.64.50:38118 which had sessionid 0x34a4c0530500002
2014-12-17 01:49:02,424 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.10.64.50:38123
2014-12-17 01:49:02,431 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client attempting to renew session 0x34a4c0530500002 at /10.10.64.50:38123
2014-12-17 01:49:02,431 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@108] - Revalidating client: 0x34a4c0530500002
2014-12-17 01:49:02,431 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@617] - Established session 0x34a4c0530500002 with negotiated timeout 40000 for client /10.10.64.50:38123
2014-12-17 01:49:02,509 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x34a4c0530500002 due to java.io.IOException: Len error 5880416
2014-12-17 01:49:02,509 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.10.64.50:38123 which had sessionid 0x34a4c0530500002
2014-12-17 01:49:04,619 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.10.64.50:38133
2014-12-17 01:49:04,626 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client attempting to renew session 0x34a4c0530500002 at /10.10.64.50:38133
2014-12-17 01:49:04,626 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@108] - Revalidating client: 0x34a4c0530500002
2014-12-17 01:49:04,627 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@617] - Established session 0x34a4c0530500002 with negotiated timeout 40000 for client /10.10.64.50:38133
2014-12-17 01:49:04,697 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x34a4c0530500002 due to java.io.IOException: Len error 5880416
2014-12-17 01:49:04,697 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.10.64.50:38133 which had sessionid 0x34a4c0530500002
2014-12-17 01:49:06,963 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.10.64.50:38139
2014-12-17 01:49:06,971 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client attempting to renew session 0x34a4c0530500002 at /10.10.64.50:38139
2014-12-17 01:49:06,971 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@108] - Revalidating client: 0x34a4c0530500002
2014-12-17 01:49:06,972 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@617] - Established session 0x34a4c0530500002 with negotiated timeout 40000 for client /10.10.64.50:38139
2014-12-17 01:49:07,040 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x34a4c0530500002 due to java.io.IOException: Len error 5880416
2014-12-17 01:49:07,040 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.10.64.50:38139 which had sessionid 0x34a4c0530500002
2014-12-17 01:49:08,762 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.10.64.50:38143
2014-12-17 01:49:08,770 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client attempting to renew session 0x34a4c0530500002 at /10.10.64.50:38143
2014-12-17 01:49:08,770 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@108] - Revalidating client: 0x34a4c0530500002
2014-12-17 01:49:08,771 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@617] - Established session 0x34a4c0530500002 with negotiated timeout 40000 for client /10.10.64.50:38143
2014-12-17 01:49:08,841 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x34a4c0530500002 due to java.io.IOException: Len error 5880416
2014-12-17 01:49:08,842 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.10.64.50:38143 which had sessionid 0x34a4c0530500002
2014-12-17 01:49:11,530 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.10.64.50:38156
2014-12-17 01:49:11,538 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - Client attempting to renew session 0x34a4c0530500002 at /10.10.64.50:38156
2014-12-17 01:49:11,538 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@108] - Revalidating client: 0x34a4c0530500002
2014-12-17 01:49:11,538 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@617] - Established session 0x34a4c0530500002 with negotiated timeout 40000 for client /10.10.64.50:38156
2014-12-17 01:49:11,608 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x34a4c0530500002 due to java.io.IOException: Len error 5880416
2014-12-17 01:49:11,608 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.10.64.50:38156 which had sessionid 0x34a4c0530500002
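
For anyone triaging a similar log, here is a minimal Python sketch that summarizes the repeating pattern above: how often each session tries to renew, and which sizes appear in the "Len error" messages. The function name `summarize` and the constant `ASSUMED_MAX_BUFFER` are hypothetical. The comparison assumes ZooKeeper's documented default of 0xfffff bytes for `jute.maxbuffer`. The thread does not say what this server was configured with, so treat the "oversized" result as a hint rather than a diagnosis.

```python
import re
from collections import Counter

# Assumption: ZooKeeper's documented default for jute.maxbuffer (0xfffff bytes).
# The thread does not state the value this server actually used.
ASSUMED_MAX_BUFFER = 0xFFFFF

LEN_ERROR = re.compile(r"Len error (\d+)")
RENEWAL = re.compile(r"Client attempting to renew session (0x[0-9a-f]+)")


def summarize(log_text):
    """Count renew attempts per session and collect distinct 'Len error' sizes."""
    renewals = Counter(RENEWAL.findall(log_text))
    sizes = sorted({int(n) for n in LEN_ERROR.findall(log_text)})
    oversized = [n for n in sizes if n > ASSUMED_MAX_BUFFER]
    return {
        "renewals": dict(renewals),
        "len_error_sizes": sizes,
        "oversized": oversized,
    }


# Two entries copied from the log above, joined the way the paste flattened them.
sample = (
    "2014-12-17 01:49:02,431 [myid:1] - INFO "
    "[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@861] - "
    "Client attempting to renew session 0x34a4c0530500002 at /10.10.64.50:38123 "
    "2014-12-17 01:49:02,509 [myid:1] - WARN "
    "[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - "
    "Exception causing close of session 0x34a4c0530500002 due to "
    "java.io.IOException: Len error 5880416"
)

print(summarize(sample))
```

The regular expressions do not depend on line breaks, so the same function works on the flattened paste or on a log file with one entry per line.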

Randgalt commented 9 years ago

According to this - http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting - client swapping can cause this issue. Let's see if it continues once #1 is fixed.

Randgalt commented 9 years ago

Can this be closed?

dtoledo67 commented 9 years ago

Yes, it is working as expected.

Thanks,

Damien

Damien Toledo Co-founder, VP Engineering Nirmata http://www.nirmata.com

On Wed, Apr 1, 2015 at 3:08 PM, Jordan Zimmerman notifications@github.com wrote:

Can this be closed?