openvstorage / framework

The Framework is a set of components and tools which brings the user an interface (GUI / API) to setup, extend and manage an Open vStorage platform.

Setup fails on second node with "Command ''rabbitmq-server' '-detached'' returned non-zero exit status 1" #1171

Closed JeffreyDevloo closed 7 years ago

JeffreyDevloo commented 7 years ago

Problem description

Installing the latest packages (openvstorage 2.7.6-rev.4309.4de95cf-1) fails on the second node. During the extend, I received:

Command ''rabbitmq-server' '-detached'' returned non-zero exit status 1

Logs

2016-11-17 14:56:59 24300 +0100 - ovs-node-2 - 21523/140546559538944 - lib/setup - 251 - ERROR - An unexpected error occurred:
Traceback (most recent call last):
  File "/opt/OpenvStorage/ovs/lib/setup.py", line 422, in setup_node
    configure_rabbitmq=configure_rabbitmq)
  File "/opt/OpenvStorage/ovs/lib/setup.py", line 1322, in _promote_node
    target_client.run(['rabbitmq-server', '-detached'])
  File "/opt/OpenvStorage/ovs/extensions/generic/sshclient.py", line 59, in inner_function
    return outer_function(self, *args, **kwargs)
  File "/opt/OpenvStorage/ovs/extensions/generic/sshclient.py", line 265, in run
    raise CalledProcessError(exit_code, command, stdout)
CalledProcessError: Command ''rabbitmq-server' '-detached'' returned non-zero exit status 1
2016-11-17 14:56:59 24300 +0100 - ovs-node-2 - 21523/140546559538944 - lib/setup - 252 - ERROR - Command ''rabbitmq-server' '-detached'' returned non-zero exit status 1
Traceback (most recent call last):
  File "/opt/OpenvStorage/ovs/lib/setup.py", line 422, in setup_node
    configure_rabbitmq=configure_rabbitmq)
  File "/opt/OpenvStorage/ovs/lib/setup.py", line 1322, in _promote_node
    target_client.run(['rabbitmq-server', '-detached'])
  File "/opt/OpenvStorage/ovs/extensions/generic/sshclient.py", line 59, in inner_function
    return outer_function(self, *args, **kwargs)
  File "/opt/OpenvStorage/ovs/extensions/generic/sshclient.py", line 265, in run
    raise CalledProcessError(exit_code, command, stdout)
CalledProcessError: Command ''rabbitmq-server' '-detached'' returned non-zero exit status 1

Manual execution

root@ovs-node-2:~# less /var/log/ovs/lib.log
root@ovs-node-2:~# rabbitmq-server -detached
Warning: PID file not written; -detached was passed.
{error_logger,{{2016,11,17},{14,59,33}},"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[]}
{error_logger,{{2016,11,17},{14,59,33}},crash_report,[[{initial_call,{auth,init,['Argument__1']}},{pid,<0.21.0>},{registered_name,[]},{error_info,{exit,{"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,352}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}},{ancestors,[net_sup,kernel_sup,<0.10.0>]},{messages,[]},{links,[<0.19.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,610},{stack_size,27},{reductions,636}],[]]}
{error_logger,{{2016,11,17},{14,59,33}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}},{offender,[{pid,undefined},{id,auth},{mfargs,{auth,start_link,[]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
{error_logger,{{2016,11,17},{14,59,33}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,{shutdown,{failed_to_start_child,auth,{"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}}}},{offender,[{pid,undefined},{id,net_sup},{mfargs,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
{error_logger,{{2016,11,17},{14,59,33}},crash_report,[[{initial_call,{application_master,init,['Argument__1','Argument__2','Argument__3','Argument__4']}},{pid,<0.9.0>},{registered_name,[]},{error_info,{exit,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}}}}},{kernel,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,134}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}},{ancestors,[<0.8.0>]},{messages,[{'EXIT',<0.10.0>,normal}]},{links,[<0.8.0>,<0.7.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,610},{stack_size,27},{reductions,181}],[]]}
{error_logger,{{2016,11,17},{14,59,33}},std_info,[{application,kernel},{exited,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces",[{auth,init_cookie,0,[{file,"auth.erl"},{line,286}]},{auth,init,1,[{file,"auth.erl"},{line,140}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}}}}},{kernel,start,[normal,[]]}}},{type,permanent}]}
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{\"Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces\",[{auth,init_cookie,0,[{file,\"auth.erl\"},{line,286}]},{auth,init,1,[{file,\"auth.erl\"},{line,140}]},{gen_server,init_it,6,[{file,\"gen_server.erl\"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,\"proc_lib.erl\"},{line,240}]}]}}}}},{kernel,start,[normal,[]]}}}"}

Crash dump is being written to: erl_crash.dump...done
Kernel pid terminated (application_controller) ({application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,auth,{"Error when reading /var/lib/rabbi

Could potentially be linked to https://github.com/openvstorage/framework/issues/1170

khenderick commented 7 years ago

The issue is that we previously wrote over the existing files in place, which kept their rights, owner, ... Now we create a new temp file and rename it to the target file. As a result, the target ends up with the rights, owner, ... of the new file, so fairly "default" ones. In this case /var/lib/rabbitmq/.erlang.cookie could no longer be read by rabbitmq-server (the eacces error above). The solution is to copy a possibly existing file while preserving its rights, owner, ..., then write over the copy, and then move it to the target.
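As an illustration of the approach described above, here is a minimal sketch of an atomic write that keeps the mode and owner of an existing file. It is not the code from #1172: the function name `write_preserving_metadata` is hypothetical, it assumes Python 3 on a POSIX system, and it is a variant that applies the existing file's owner and mode to the temp file after writing (instead of copying the file first), so that a read-only mode such as 0400 does not block the write.

```python
import os
import shutil
import tempfile


def write_preserving_metadata(path, contents):
    """Atomically replace `path` with `contents`, keeping the mode and owner
    of an existing file. Hypothetical helper, for illustration only."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename stays on
    # one filesystem and is therefore atomic.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as handle:
            handle.write(contents)
            handle.flush()
            os.fsync(handle.fileno())
        if os.path.isfile(path):
            existing = os.stat(path)
            # Restore the owner first (chown can clear setuid/setgid bits),
            # then the permission bits. Changing the owner to another user
            # requires sufficient privileges, e.g. running as root.
            os.chown(temp_path, existing.st_uid, existing.st_gid)
            shutil.copymode(path, temp_path)
        # Move the fully written temp file over the target.
        os.replace(temp_path, path)
    except Exception:
        # Do not leave a stray temp file behind on failure.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

Without the `os.chown` and `shutil.copymode` calls, the target would inherit the temp file's defaults (mode 0600, owned by the writing user), which matches the symptom in this issue.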

khenderick commented 7 years ago

Fixed by #1172, openvstorage/alba-asdmanager#161, packaged in openvstorage-2.7.6-rev.4311.9e78cea, openvstorage-sdm-1.6.6-rev.445.4216f7e

JeffreyDevloo commented 7 years ago

Steps

Output

Cluster installed perfectly

PLAY RECAP *********************************************************************
ctl01                      : ok=13   changed=9    unreachable=0    failed=0   
ctl02                      : ok=13   changed=9    unreachable=0    failed=0   
ctl03                      : ok=13   changed=9    unreachable=0    failed=0   

Thursday 17 November 2016  17:14:22 +0100 (0:01:52.141)       0:07:10.961 ***** 
=============================================================================== 
configuring the open vstorage controllers ----------------------------- 112.14s
install open vstorage hyperconverged packages -------------------------- 91.68s
install required packages for blktap ----------------------------------- 60.84s
install django dependancy for Ubuntu 16.04 ----------------------------- 13.27s
update packages -------------------------------------------------------- 11.13s
install required packages for ansible ----------------------------------- 3.50s
setup ------------------------------------------------------------------- 1.38s
add openvstorage apt-preference Ubuntu 16.04 ---------------------------- 0.59s
add openvstorage apt-key Ubuntu 16.04 ----------------------------------- 0.52s
add openvstorage apt-repo Ubuntu 16.04 ---------------------------------- 0.34s
check hosts their availability ------------------------------------------ 0.32s
add performance settings to cluster ------------------------------------- 0.27s
wait for ssh to come up ------------------------------------------------- 0.25s
add openvstorage apt-repo Ubuntu 14.04 ---------------------------------- 0.07s

Test result

Test passed.

Packages