soabase / exhibitor

ZooKeeper co-process for instance monitoring, backup/recovery, cleanup and visualization.
https://groups.google.com/forum/#!topic/exhibitor-users/PVkcd88mk8c
Apache License 2.0
1.68k stars 444 forks source link

Status indication issues and auto-restart problems #105

Closed fluential closed 11 years ago

fluential commented 11 years ago

Referring to #97

1) the --prefspath which has to be a file NEEDs to exist otherwise preferences won't be stored, in other words I think file should be created automatically if NONEXISTENT. 2) When zookeeper is running and I restart exhibitor the state is not recognized properly (stalled at latent) and I assume it should as exhibitor keeps asking zookeeper server "ruok" . Only committing changes to the config resets the state and only then I am able to see in the logs:

INFO com.netflix.exhibitor.core.activity.ActivityLog State: serving [ActivityQueue-0]

3) When I am able to successfully store preferences via --prefspath and auto-restarts are enabled, when zookeeper is stopped it never gets restarted. Again state is stalled at "serving". Only when pressing commit config changes it tries to restart zookeeper instance, it does not detect dead zookeeper.

OS: Ubuntu 12.04 java version "1.6.0_33"

Randgalt commented 11 years ago

Can you provide the Exhibitor logs? I'll bet there are exceptions.

fluential commented 11 years ago

Right, so I just started exhibitor while zookeeper was running: 2013-01-10 11:34:49,678 - INFO [main:ActivityLog$Type$2@128] - Exhibitor started 2013-01-10 11:34:51,073 - INFO [main:Slf4jLog@67] - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog 2013-01-10 11:34:51,109 - INFO [main:Slf4jLog@67] - jetty-1.4.4 Jan 10, 2013 11:35:11 AM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate INFO: Initiating Jersey application, version 'Jersey: 1.9.1 09/14/2011 02:36 PM' 2013-01-10 11:35:12,075 - INFO [main:Slf4jLog@67] - Started SocketConnector@0.0.0.0:8080 The status is being stalled at latent, I enable config editing and hit commit button and I see in the logs: 2013-01-10 11:36:01,196 - INFO [ActivityQueue-0:ActivityLog$Type$2@128] - State: serving

Now I do:

bin/zkServer.sh status
Using config: /opt/zookeeper/zookeeper-3.3.6/bin/../conf/zoo.cfg
Mode: follower

bin/zkServer.sh stop
Using config: /opt/zookeeper/zookeeper-3.3.6/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED

bin/zkServer.sh status
Error contacting service. It is probably not running.

Nothing is happening in the exhibitor, the prefs file looks like that:

Auto-generated by com.netflix.exhibitor.core.controlpanel.FileBasedPreferences

Wed Jan 09 11:57:29 UTC 2013

com.netflix.exhibitor.control-panel.cleanup=false com.netflix.exhibitor.control-panel.restarts=true

I still see status as "serving", so I do the usual - refresh state via config commit (no actual changes are being made) Now I see:

2013-01-10 11:40:43,235 - INFO [ActivityQueue-0:ActivityLog$Type$2@128] - State: down 2013-01-10 11:40:43,238 - INFO [ActivityQueue-0:ActivityLog$Type$2@128] - Attempting to stop instance 2013-01-10 11:40:43,240 - INFO [ActivityQueue-0:ActivityLog$Type$2@128] - Attempting to start/restart ZooKeeper 2013-01-10 11:40:43,577 - INFO [ActivityQueue-0:ActivityLog$Type$2@128] - jps didn't find instance - assuming ZK is not running 2013-01-10 11:40:43,590 - ERROR [pool-3-thread-1:ActivityLog$Type$1@106] - ZooKeeper Server: JMX enabled by default 2013-01-10 11:40:43,594 - INFO [ActivityQueue-0:ActivityLog$Type$2@128] - Process started via: /opt/zookeeper/zookeeper-3.3.6/bin/zkServer.sh 2013-01-10 11:40:43,602 - ERROR [pool-3-thread-1:ActivityLog$Type$1@106] - ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.3.6/bin/../conf/zoo.cfg 2013-01-10 11:40:44,618 - INFO [pool-3-thread-2:ActivityLog$Type$2@128] - ZooKeeper Server: Starting zookeeper ... STARTED 2013-01-10 11:40:51,754 - INFO [ActivityQueue-0:ActivityLog$Type$2@128] - State: serving

Randgalt commented 11 years ago

Do you have the various config settings set up on the Exhibitor Config tag? Exhibitor will stay at Latent until those are set.

edit: you probably do given the log above.

Randgalt commented 11 years ago

I can't reproduce this. I just tried with a non-existent prefs file and it all works correctly. Any other details you can provide would be appreciated.

fluential commented 11 years ago

That is weird, here are more details:

Zookeeper: /opt/zookeeper/zookeeper-3.3.6

From exhibitor UI: Zookeeper install dir: /opt/zookeeper/* Zookeeper snapshot dir: /data/zookeeper/data (they are all exist and are fine) Zookeeper transaction dir: /data/zookeeper/log -- | | --- Additional config:

tickTime=2000 initLimit=10 syncLimit=5 leaderServes=yes maxClientCnxns=300

log4j.properties:

log4j.rootLogger=DEBUG, console log4j.appender.console=org.apache.log4j.ConsoleAppender log4j.appender.console.layout=org.apache.log4j.PatternLayout log4j.appender.console.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n

Client port: 2181 Connect port: 2888 Election port: 3888

Can you let me know which version of zookeeper you guys are using? I will try tomorrow with the latest stable, but eventually I need to have two different clusters with 3.3.x (required by Storm) and 3.4.x respectively.

Randgalt commented 11 years ago

We're on 3.4.5 currently. But, I don't know of any changes that would make it not work in 3.3.x. Please send the Exhibitor log file from disk as well as the Zookeeper log file. Maybe something will show there.

fluential commented 11 years ago

In the meantime this is a quick snippet how I am setting things up: https://gist.github.com/4540052 I will followup with some debug logs from both applications in a while. Thanks for help!

fluential commented 11 years ago

Right, I have two cases: 1) Zookeeper is down, I just start exhibitor server that reads previously saved configuration (details can be seen in the logs provided) and nothing really happens: state is latent, zookeeper keeps being down - http://kukabara.mx/logs2go/exhibitor_out.txt 2) Zokeeper is down, I am starting exhibitor with a new non-existent s3 config file, I configure in UI only zookeeper path, ports and data/logs directories, I commit the changes zookeeper is getting up, I stop zookeeper and its keep being down until the moment I enforce commit changes (actually nothing was changed) and the zookeeper is being picked up as down and gets restarted. - http://kukabara.mx/logs2go/exhibitor_out2.txt

In the meantime I am accessing exhibitor/v1/cluster/state/localhost via browser to check what is being displayed.

Any ideas how further I can help to debug it? Also please refer to https://gist.github.com/4540052 as a way of setting the environment up, it is very weird that you cannot really replicate the problem.

Randgalt commented 11 years ago

Please provide the ZooKeeper logs if you can.

fluential commented 11 years ago

Well as I mentioned above, zookeeper is not even tried to be started, but when I commit the config changes it does. Exhibitor logs - note around 11:39:06 I am pushing the config and the state is getting updated.

1.4.4
2013-01-16 11:37:51,169 - INFO  [main:ActivityLog$Type$2@128] - Exhibitor started
2013-01-16 11:37:52,841 - INFO  [main:Slf4jLog@67] - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-01-16 11:37:52,875 - INFO  [main:Slf4jLog@67] - jetty-1.4.4
Jan 16, 2013 11:37:53 AM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.9.1 09/14/2011 02:36 PM'
2013-01-16 11:37:54,095 - INFO  [main:Slf4jLog@67] - Started SocketConnector@0.0.0.0:8080
2013-01-16 11:39:06,700 - INFO  [ActivityQueue-0:ActivityLog$Type$2@128] - State: down
2013-01-16 11:39:06,702 - INFO  [ActivityQueue-0:ActivityLog$Type$2@128] - Attempting to stop instance
2013-01-16 11:39:06,702 - INFO  [ActivityQueue-0:ActivityLog$Type$2@128] - Attempting to start/restart ZooKeeper
2013-01-16 11:39:07,141 - ERROR [pool-3-thread-1:ActivityLog$Type$1@106] - Cleanup Task: Exception in thread "main" java.lang.IllegalArgumentException: count should be greater than 3
2013-01-16 11:39:07,142 - ERROR [pool-3-thread-1:ActivityLog$Type$1@106] - Cleanup Task:        at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:64)
2013-01-16 11:39:07,142 - ERROR [pool-3-thread-1:ActivityLog$Type$1@106] - Cleanup Task:        at org.apache.zookeeper.server.PurgeTxnLog.main(PurgeTxnLog.java:130)
2013-01-16 11:39:07,146 - INFO  [pool-3-thread-2:ActivityLog$Type$2@128] - Cleanup task completed
2013-01-16 11:39:07,344 - INFO  [ActivityQueue-0:ActivityLog$Type$2@128] - jps didn't find instance - assuming ZK is not running
2013-01-16 11:39:07,358 - ERROR [pool-3-thread-2:ActivityLog$Type$1@106] - ZooKeeper Server: JMX enabled by default
2013-01-16 11:39:07,358 - INFO  [ActivityQueue-0:ActivityLog$Type$2@128] - Process started via: /opt/zookeeper/zookeeper-3.3.6/bin/zkServer.sh
2013-01-16 11:39:07,398 - ERROR [pool-3-thread-2:ActivityLog$Type$1@106] - ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.3.6/bin/../conf/zoo.cfg
2013-01-16 11:39:08,420 - INFO  [pool-3-thread-1:ActivityLog$Type$2@128] - ZooKeeper Server: Starting zookeeper ... STARTED
2013-01-16 11:39:23,175 - INFO  [ActivityQueue-0:ActivityLog$Type$2@128] - State: serving
2013-01-16 11:39:23,354 - ERROR [pool-3-thread-1:ActivityLog$Type$1@106] - Cleanup Task: Exception in thread "main" java.lang.IllegalArgumentException: count should be greater than 3
2013-01-16 11:39:23,355 - ERROR [pool-3-thread-1:ActivityLog$Type$1@106] - Cleanup Task:        at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:64)
2013-01-16 11:39:23,355 - ERROR [pool-3-thread-1:ActivityLog$Type$1@106] - Cleanup Task:        at org.apache.zookeeper.server.PurgeTxnLog.main(PurgeTxnLog.java:130)
2013-01-16 11:39:23,358 - INFO  [pool-3-thread-3:ActivityLog$Type$2@128] - Cleanup task completed

Zookeeper logs:

2013-01-16 11:39:08,099 - INFO  [main:QuorumPeerConfig@90] - Reading configuration from: /opt/zookeeper/zookeeper-3.3.6/bin/../conf/zoo.cfg
2013-01-16 11:39:08,120 - ERROR [main:QuorumPeerConfig@259] - Invalid configuration, only one server specified (ignoring)
2013-01-16 11:39:08,120 - WARN  [main:QuorumPeerMain@105] - Either no config or no quorum defined in config, running  in standalone mode
2013-01-16 11:39:08,139 - INFO  [main:QuorumPeerConfig@90] - Reading configuration from: /opt/zookeeper/zookeeper-3.3.6/bin/../conf/zoo.cfg
2013-01-16 11:39:08,145 - ERROR [main:QuorumPeerConfig@259] - Invalid configuration, only one server specified (ignoring)
2013-01-16 11:39:08,145 - INFO  [main:ZooKeeperServerMain@94] - Starting server
2013-01-16 11:39:08,165 - INFO  [main:Environment@97] - Server environment:zookeeper.version=3.3.6-1366786, built on 07/29/2012 06:22 GMT
2013-01-16 11:39:08,166 - INFO  [main:Environment@97] - Server environment:host.name=ip-10-59-42-39.eu-west-1.compute.internal
2013-01-16 11:39:08,166 - INFO  [main:Environment@97] - Server environment:java.version=1.6.0_33
2013-01-16 11:39:08,167 - INFO  [main:Environment@97] - Server environment:java.vendor=Sun Microsystems Inc.
2013-01-16 11:39:08,167 - INFO  [main:Environment@97] - Server environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.33/jre
2013-01-16 11:39:08,167 - INFO  [main:Environment@97] - Server environment:java.class.path=/opt/zookeeper/zookeeper-3.3.6/bin/../build/classes:/opt/zookeeper/zookeeper-3.3.6/bin/../build/lib/*.jar:/opt/zookeeper/zookeeper-3.3.6/bin/../zookeeper-3.3.6.jar:/opt/zookeeper/zookeeper-3.3.6/bin/../lib/log4j-1.2.15.jar:/opt/zookeeper/zookeeper-3.3.6/bin/../lib/jline-0.9.94.jar:/opt/zookeeper/zookeeper-3.3.6/bin/../src/java/lib/*.jar:/opt/zookeeper/zookeeper-3.3.6/bin/../conf:
2013-01-16 11:39:08,168 - INFO  [main:Environment@97] - Server environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.33/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.33/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.33/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2013-01-16 11:39:08,168 - INFO  [main:Environment@97] - Server environment:java.io.tmpdir=/tmp
2013-01-16 11:39:08,169 - INFO  [main:Environment@97] - Server environment:java.compiler=<NA>
2013-01-16 11:39:08,169 - INFO  [main:Environment@97] - Server environment:os.name=Linux
2013-01-16 11:39:08,176 - INFO  [main:Environment@97] - Server environment:os.arch=amd64
2013-01-16 11:39:08,177 - INFO  [main:Environment@97] - Server environment:os.version=3.2.0-35-virtual
2013-01-16 11:39:08,177 - INFO  [main:Environment@97] - Server environment:user.name=root
2013-01-16 11:39:08,178 - INFO  [main:Environment@97] - Server environment:user.home=/root
2013-01-16 11:39:08,178 - INFO  [main:Environment@97] - Server environment:user.dir=/opt/zookeeper/zookeeper-3.3.6
2013-01-16 11:39:08,193 - INFO  [main:ZooKeeperServer@673] - tickTime set to 3000
2013-01-16 11:39:08,194 - INFO  [main:ZooKeeperServer@682] - minSessionTimeout set to -1
2013-01-16 11:39:08,194 - INFO  [main:ZooKeeperServer@691] - maxSessionTimeout set to -1
2013-01-16 11:39:08,230 - INFO  [main:NIOServerCnxn$Factory@143] - binding to port 0.0.0.0/0.0.0.0:2188
2013-01-16 11:39:08,252 - INFO  [main:FileSnap@82] - Reading snapshot /mnt/zookeeper/data/version-2/snapshot.0
2013-01-16 11:39:08,267 - INFO  [main:FileTxnSnapLog@256] - Snapshotting: 0
2013-01-16 11:39:10,478 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43514
2013-01-16 11:39:10,481 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43514
2013-01-16 11:39:10,483 - INFO  [Thread-1:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43514 (no session established for client)
2013-01-16 11:39:12,977 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43515
2013-01-16 11:39:12,977 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43515
2013-01-16 11:39:12,979 - INFO  [Thread-2:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43515 (no session established for client)
2013-01-16 11:39:15,477 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43516
2013-01-16 11:39:15,478 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43516
2013-01-16 11:39:15,479 - INFO  [Thread-3:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43516 (no session established for client)
2013-01-16 11:39:18,012 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43517
2013-01-16 11:39:18,012 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43517
2013-01-16 11:39:18,014 - INFO  [Thread-4:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43517 (no session established for client)
2013-01-16 11:39:20,479 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43518
2013-01-16 11:39:20,485 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43518
2013-01-16 11:39:20,488 - INFO  [Thread-5:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43518 (no session established for client)
2013-01-16 11:39:22,980 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43519
2013-01-16 11:39:22,981 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43519
2013-01-16 11:39:22,982 - INFO  [Thread-6:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43519 (no session established for client)
2013-01-16 11:39:23,149 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43521
2013-01-16 11:39:23,150 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43521
(...)
2013-01-16 11:39:42,986 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43530
2013-01-16 11:39:42,987 - INFO  [Thread-16:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43530 (no session established for client)
2013-01-16 11:39:43,055 - INFO  [main:FourLetterWordMain@43] - connecting to localhost 2188
2013-01-16 11:39:43,091 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43532
2013-01-16 11:39:43,094 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing srvr command from /127.0.0.1:43532
2013-01-16 11:39:43,095 - INFO  [Thread-17:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43532 (no session established for client)
2013-01-16 11:39:45,493 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43533
2013-01-16 11:39:45,494 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43533
2013-01-16 11:39:45,495 - INFO  [Thread-18:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43533 (no session established for client)
2013-01-16 11:39:47,987 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43534
2013-01-16 11:39:47,988 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43534
2013-01-16 11:39:47,989 - INFO  [Thread-19:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43534 (no session established for client)
2013-01-16 11:39:50,491 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43535
2013-01-16 11:39:50,492 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43535
2013-01-16 11:39:50,493 - INFO  [Thread-20:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43535 (no session established for client)
2013-01-16 11:39:52,990 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43536
2013-01-16 11:39:52,991 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43536
2013-01-16 11:39:52,992 - INFO  [Thread-21:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43536 (no session established for client)
2013-01-16 11:39:55,495 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43538
2013-01-16 11:39:55,496 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43538
2013-01-16 11:39:55,497 - INFO  [Thread-22:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43538 (no session established for client)
2013-01-16 11:39:57,990 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43539
2013-01-16 11:39:57,991 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43539
2013-01-16 11:39:57,991 - INFO  [Thread-23:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43539 (no session established for client)
2013-01-16 11:40:00,499 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43540
2013-01-16 11:40:00,500 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43540
2013-01-16 11:40:00,501 - INFO  [Thread-24:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43540 (no session established for client)
2013-01-16 11:40:02,996 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43541
2013-01-16 11:40:02,996 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43541
2013-01-16 11:40:02,997 - INFO  [Thread-25:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43541 (no session established for client)
2013-01-16 11:40:05,496 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43542
2013-01-16 11:40:05,497 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43542
2013-01-16 11:40:05,498 - INFO  [Thread-26:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43542 (no session established for client)
2013-01-16 11:40:07,993 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43543
2013-01-16 11:40:07,994 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43543
2013-01-16 11:40:07,995 - INFO  [Thread-27:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43543 (no session established for client)
2013-01-16 11:40:10,500 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43544
2013-01-16 11:40:10,501 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing ruok command from /127.0.0.1:43544
2013-01-16 11:40:10,502 - INFO  [Thread-28:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43544 (no session established for client)
2013-01-16 11:40:10,759 - INFO  [main:FourLetterWordMain@43] - connecting to localhost 2188
2013-01-16 11:40:10,796 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:43546
2013-01-16 11:40:10,798 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2188:NIOServerCnxn@1237] - Processing srvr command from /127.0.0.1:43546
2013-01-16 11:40:10,799 - INFO  [Thread-29:NIOServerCnxn@1435] - Closed socket connection for client /127.0.0.1:43546 (no session established for client)

At this stage I stop the zookeeper and exhibitor does not pick up any new state showing zookeeper as serving. Exhibitor's debug logs here: http://kukabara.mx/logs2go/exhibitor_out3.txt

INFO [ActivityQueue-0:ActivityLog$Type$2@128] - State: serving - shows up only when I commit config changes. (not even when I press restart button on the UI - zookeeper gets restarted but state keeps being stalled/latent)

On zookeeper 3.4.5 all seems to be the same which is fair enough.

Ideas how to debug further?

jeekajoo commented 11 years ago

Same bugs for me

My configuration :

Tell me if I can help by providing some more logs.

Randgalt commented 11 years ago

@jeekajoo - Are you using "--prefspath" as well?

fluential commented 11 years ago

I don't think this is related to prefspath, the only reason I was using it is that without I was having some more weird problems with UI (AFAIR) but I will check tomorrow to confirm and let you guys know.

jeekajoo commented 11 years ago

bugs occur with and without --prefspath

jeekajoo commented 11 years ago

Live Check was set to "0". That's why it was not working for me.

@Randgalt Maybe it would handy to have default values higher than 0.

fluential commented 11 years ago

Right, that seems like the same case for me. Funny. Closing the ticket.

Randgalt commented 11 years ago

Ah!!! Good catch. At minimum, the log should warn that this is 0. I'm going to reopen this as it's a kind of UI bug.

fluential commented 11 years ago

Lets create prefsfile if it does not exist as well please otherwise I am not sure what happens to the settings.

Randgalt commented 11 years ago

This will be fixed in the next release (creating the Prefs file isn't necessary - I've double checked this).