Open xiaoguanyu opened 10 years ago
It seems that you haven't installed the zookeeper package on the tank server.
I installed the zookeeper package on the tank server, but now I get a new error:
Start task 0 of zookeeper on 10.38.11.59(0) fail: <Fault 60: 'ALREADY_STARTED: zookeeper--dptst--zookeeper'>
......
File "/usr/local/lib/python2.7/socket.py", line 571, in create_connection
raise err
socket.error: [Errno 111] Connection refused
Can you post the detailed stack trace?
2014-05-14 13:26:49 You should set a bootstrap password, it will be requried when you do cleanup
Set a password manually? (y/n) y
Please input your password:
2014-05-14 13:26:52 Your password is: 123456, you should store this in a safe place, because this is the verification code used to do cleanup
2014-05-14 13:26:52 Bootstrapping task 0 of zookeeper on 10.38.11.59(0)
2014-05-14 13:26:53 Bootstrap task 0 of zookeeper on 10.38.11.59(0) success
2014-05-14 13:26:53 Starting task 0 of zookeeper on 10.38.11.59(0)
2014-05-14 13:26:53 Start task 0 of zookeeper on 10.38.11.59(0) fail: <Fault 60: 'ALREADY_STARTED: zookeeper--dptst--zookeeper'>
Traceback (most recent call last):
File "/usr/local/test/minos/client/deploy.py", line 284, in
It seems that the minos client can't connect to your supervisord. Can you check whether your supervisord started normally?
My supervisord started normally, and I can view the components' status at http://192.169.11.59:9001.
From your error message, the client is trying to connect to a different IP. Can you check that one?
Bootstrapping task 0 of zookeeper on 192.38.11.59(0)
Bootstrap task 0 of zookeeper on 192.38.11.59(0) fail: No package found on package server of zookeeper
Bootstrap task 0 of zookeeper on 192.38.11.59(0) fail: 2
Starting task 0 of zookeeper on 192.38.11.59(0)
Thank you, yehangjun. The problem is solved; the cause was that I hadn't installed the zookeeper package on the tank server.
cd minos/client
./deploy install zookeeper dptst
Bootstrapping task 0 of zookeeper on 192.38.11.59(0)
Bootstrap task 0 of zookeeper on 192.38.11.59(0) fail: No package found on package server of zookeeper
Bootstrap task 0 of zookeeper on 192.38.11.59(0) fail: 2
Starting task 0 of zookeeper on 192.38.11.59(0)
@xiaoguanyu What is the root cause of the 'connection refused' error?
I ran into a similar situation:
[root@master client]# ./deploy bootstrap zookeeper dptst
2014-10-15 17:29:57 You should set a bootstrap password, it will be requried when you do cleanup
Set a password manually? (y/n) y
Please input your password:
2014-10-15 17:30:03 Your password is: ir2014, you should store this in a safe place, because this is the verification code used to do cleanup
2014-10-15 17:30:03 Bootstrapping task 0 of zookeeper on 10.161.156.199(0)
2014-10-15 17:30:07 Bootstrap task 0 of zookeeper on 10.161.156.199(0) success
2014-10-15 17:30:07 Starting task 0 of zookeeper on 10.161.156.199(0)
2014-10-15 17:30:07 Start task 0 of zookeeper on 10.161.156.199(0) success
Traceback (most recent call last):
File "/root/minos/client/deploy.py", line 288, in <module>
main()
File "/root/minos/client/deploy.py", line 285, in main
return args.handler(args)
File "/root/minos/client/deploy.py", line 233, in process_command_bootstrap
return deploy_tool.bootstrap(args)
File "/root/minos/client/deploy_zookeeper.py", line 154, in bootstrap
bootstrap_job(args, hosts[host_id].ip, "zookeeper", host_id, instance_id, cleanup_token)
File "/root/minos/client/deploy_zookeeper.py", line 136, in bootstrap_job
args.zookeeper_config.parse_generated_config_files(args, job_name, host_id, instance_id)
File "/root/minos/client/service_config.py", line 693, in parse_generated_config_files
args, self.service, self.cluster, self.jobs, current_job, host_id, instance_id))
File "/root/minos/client/service_config.py", line 681, in parse_generated_files
parsing_service, current_job, host_id, instance_id, value)
File "/root/minos/client/service_config.py", line 622, in parse_item
current_job, host_id, instance_id, reg_expr[iter]))
File "/root/minos/client/service_config.py", line 274, in get_section_attribute
section, section_instance_id, attribute)
File "/root/minos/client/service_config.py", line 195, in get_specific_dir
return supervisor_client.get_available_data_dirs()[0]
File "/root/minos/client/supervisor_client.py", line 26, in get_available_data_dirs
self.cluster, self.job)
File "/usr/local/python2.7/lib/python2.7/xmlrpclib.py", line 1224, in __call__
return self.__send(self.__name, args)
File "/usr/local/python2.7/lib/python2.7/xmlrpclib.py", line 1578, in __request
verbose=self.__verbose
File "/usr/local/python2.7/lib/python2.7/xmlrpclib.py", line 1264, in request
return self.single_request(host, handler, request_body, verbose)
File "/usr/local/python2.7/lib/python2.7/xmlrpclib.py", line 1292, in single_request
self.send_content(h, request_body)
File "/usr/local/python2.7/lib/python2.7/xmlrpclib.py", line 1439, in send_content
connection.endheaders(request_body)
File "/usr/local/python2.7/lib/python2.7/httplib.py", line 991, in endheaders
self._send_output(message_body)
File "/usr/local/python2.7/lib/python2.7/httplib.py", line 844, in _send_output
self.send(msg)
File "/usr/local/python2.7/lib/python2.7/httplib.py", line 806, in send
self.connect()
File "/usr/local/python2.7/lib/python2.7/httplib.py", line 787, in connect
self.timeout, self.source_address)
File "/usr/local/python2.7/lib/python2.7/socket.py", line 571, in create_connection
raise err
socket.error: [Errno 111] Connection refused
However, I have already uploaded the zookeeper package to my tank server.
ID | Package Name | Revision No. | Timestamp | Checksum | Download
1 | zookeeper-3.4.6.tar.gz | r12345 | 20141015-172923 | 2a9e53f5990dfe0965834a525fbcad226bf93474 | Download
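It looks like your first machine was deployed successfully, but the client could not connect to supervisord on the second one (connection refused). Most likely supervisord isn't running on that machine. Can you check?
I deployed to three machines. Port 9001 is reachable on all three, but the zookeeper process failed to start on every one of them. The supervisor page shows the same content on each: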
State | Description | Name | Action
running | pid 21263, uptime 0:06:29 | crashmailbatch-monitor | Restart, Stop, Clear Log, Tail -f
running | pid 21262, uptime 0:06:29 | processexit-monitor | Restart, Stop, Clear Log, Tail -f
fatal | Exited too quickly (process log may have details) | zookeeper--dptst--zookeeper | Start, Clear Log, Tail -f
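PS: The page on port 9001 loads. Is it still possible that supervisord hasn't started?
If the 9001 page loads, supervisor should have started successfully. What error does the process report when it fails to start? Is it connection refused as well?
Yes, the ./deploy bootstrap zookeeper dptst command also ends with connection refused.
On the machine where you run the client, try wget http://$host:9001 and see whether the page is reachable.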
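The reachability check suggested above can also be scripted for several hosts at once. This is a minimal sketch assuming Python 3 and that supervisord listens on port 9001 as in this thread; `can_connect` and `unreachable_hosts` are hypothetical helper names, not part of minos.

```python
import socket


def can_connect(host, port=9001, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused" (errno 111), timeouts and DNS failures.
        return False


def unreachable_hosts(hosts, port=9001, timeout=3.0):
    """Return the hosts on which nothing accepts connections on the port."""
    return [h for h in hosts if not can_connect(h, port, timeout)]


# Example call (replace the placeholder with the hosts of your cluster):
# print(unreachable_hosts(["10.161.156.199"]))
```

Any host returned by `unreachable_hosts` is a candidate for the `[Errno 111] Connection refused` error, since the client needs to reach supervisord on every host of the cluster.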
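Yes, my mistake. supervisor has to be deployed on all machines first, and only then should ./deploy bootstrap zookeeper dptst be run. It works now. But a new problem came up: ./deploy show zookeeper dptst reports an error.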
2014-10-16 09:44:01 Showing task 0 of zookeeper on 10.161.156.199(0)
2014-10-16 09:44:01 Task 0 of zookeeper on 10.161.156.199(0) is FATAL
2014-10-16 09:44:01 Showing task 1 of zookeeper on 10.162.20.204(0)
2014-10-16 09:44:01 Task 1 of zookeeper on 10.162.20.204(0) is FATAL
2014-10-16 09:44:01 Showing task 2 of zookeeper on 10.161.131.193(0)
2014-10-16 09:44:01 Task 2 of zookeeper on 10.161.131.193(0) is FATAL
This means zookeeper didn't start properly. Check the zookeeper log to see why it failed to start.
Is it the /home/root/log/zookeeper/dptst/zookeeper directory? It is empty.
[root@slave1 ~]# cd /home/root/log/zookeeper/dptst/zookeeper
[root@slave1 zookeeper]# ll
total 0
Go to /home/root/app/zookeeper/dptst/zookeeper. Under stdout/ there are files holding the redirected standard output; see whether they contain any error messages.
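To look through those redirected output files quickly, something like the following could be used. It is only a sketch, assuming Python 3; `tail_files` is a hypothetical helper, and the directory in the example call is the stdout/ path mentioned above.

```python
import os
from collections import deque


def tail_files(directory, lines=20):
    """Map each regular file in `directory` to its last `lines` lines."""
    tails = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        with open(path, errors="replace") as handle:
            # deque with maxlen keeps only the final `lines` lines in memory.
            last = deque(handle, maxlen=lines)
        tails[name] = [line.rstrip("\n") for line in last]
    return tails


# Example call:
# for name, last in tail_files("/home/root/app/zookeeper/dptst/zookeeper/stdout").items():
#     print(name, *last, sep="\n")
```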
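Found the problem; it seems to be a Java issue. I installed Java manually under /usr/local/jdk1.7.0_67/, set the environment variables in /etc/profile, and running java -version directly works fine. But according to the stdout records, zookeeper still looks for /usr/bin/java. Where do I configure the Java home for minos?
One more small question: after reading the PDF and the wiki, my understanding is that installing with minos doesn't require passwordless SSH login and that everything goes through supervisor. Is that right?
I couldn't figure it out, so I just symlinked it: ln -s $JAVA_HOME/bin/java /usr/bin/java
That more or less settles it. zookeeper is running normally now.
Great, glad you got it working. In minos, java_home is obtained in start.sh; you can take a look at the source code.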
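One possible explanation for the /usr/bin/java lookup, stated as an assumption and not verified against start.sh: variables exported in /etc/profile are only read by login shells, so a process spawned by supervisord may never see JAVA_HOME or the extended PATH. The sketch below (Python 3) shows which java a process with a given environment would find; `resolve_java` and its lookup order are hypothetical and are not taken from minos.

```python
import os
import shutil


def resolve_java(env):
    """Return the java executable a process with environment `env` would find.

    Hypothetical lookup order (not taken from minos's start.sh):
    $JAVA_HOME/bin/java first, then the first `java` on $PATH.
    Returns None when neither yields an executable.
    """
    java_home = env.get("JAVA_HOME")
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return shutil.which("java", path=env.get("PATH", ""))


# Example call: compare the interactive shell with a bare daemon-like environment.
# print(resolve_java(dict(os.environ)))
# print(resolve_java({"PATH": "/usr/bin:/bin"}))
```

If the second call returns None or a different path than the first, the daemon's environment is the place to fix, which is effectively what the symlink to /usr/bin/java works around.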
Your password is: 123456, you should store this in a safe place, because this is the verification code used to do cleanup
Bootstrapping task 0 of zookeeper on 192.38.11.59(0)
Bootstrap task 0 of zookeeper on 192.38.11.59(0) fail: No package found on package server of zookeeper
Bootstrap task 0 of zookeeper on 192.38.11.59(0) fail: 2
Starting task 0 of zookeeper on 192.38.11.59(0)
Start task 0 of zookeeper on 192.38.11.59(0) fail: You should bootstrap the job first