olopez32 / ganeti

Automatically exported from code.google.com/p/ganeti

gnt-cluster master-failover does not work on RHEL 7 (systemd/systemctl issue?) #1007

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I have a Ganeti 2.12.0 cluster on CentOS / RHEL 7, and "gnt-cluster master-failover" 
does not work on it. Tracing the process shows that it fails to start 
ganeti-wconfd; systemctl status reports "Not master, exiting.".

daemon-util is called (apparently) correctly:
execve("/usr/lib64/ganeti/daemon-util", ["/usr/lib64/ganeti/daemon-util", "start", "ganeti-wconfd", "--force-node", "--no-voting", "--yes-do-it"], [/* 22 vars */]

but systemctl is called without the extra arguments:
execve("/bin/systemctl", ["systemctl", "start", "ganeti-wconfd.service"], [/* 22 vars */])

It seems to me that daemon-util ignores the extra arguments on the systemd 
path; see start() as of 2.12.0:
http://git.ganeti.org/?p=ganeti.git;a=blob;f=daemons/daemon-util.in;h=6a472531ddc5ecb6105a5b2ac3084ecaabb73c9c;hb=37975867b5958f7b510aea8ecbd5c4610b6cc234#l266
start-stop-daemon is invoked with the extra arguments, but systemctl start is 
not.

I've briefly read the systemctl documentation and, as far as I can tell, there 
is no way to pass arguments on the command line when starting a service.
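
To make the suspected behaviour concrete, here is a minimal sketch of the two code paths. It is not the actual daemon-util code: build_start_cmd and its first argument are hypothetical names, the start-stop-daemon invocation is heavily simplified, and the function only prints the command it would run instead of running it.

```shell
#!/bin/sh
# Hypothetical stand-in for the start() logic in daemon-util (illustration only).
# Prints the command that would be run rather than executing it.
build_start_cmd() {
  use_systemd=$1   # "yes" or "no" (hypothetical switch)
  name=$2          # daemon name, e.g. ganeti-wconfd
  shift 2          # "$@" now holds only the extra arguments
  if [ "$use_systemd" = "yes" ]; then
    # The extra arguments are dropped here: "systemctl start" is given
    # only the unit name.
    echo "systemctl start ${name}.service"
  else
    # Simplified: the extra arguments are forwarded to the daemon after "--".
    echo "start-stop-daemon --start --exec /usr/sbin/${name} -- $*"
  fi
}

build_start_cmd yes ganeti-wconfd --force-node --no-voting --yes-do-it
build_start_cmd no ganeti-wconfd --force-node --no-voting --yes-do-it
```

The first call prints a command without --force-node, --no-voting and --yes-do-it, which matches the execve trace above; the second call keeps them.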

-----

master# gnt-cluster --version
gnt-cluster (ganeti v2.12.0) 2.12.0
master# gnt-cluster version
Software version: 2.12.0
Internode protocol: 2120000
Configuration format: 2120000
OS api version: 20
Export interface: 0
VCS version: (ganeti) version v2.12.0
master# hspace --version
hspace (ganeti) version v2.12.0
compiled with ghc 7.6
running on linux x86_64

-----

newmaster# gnt-cluster master-failover
Timeout while talking to the master daemon. Jobs might have been submitted and 
will continue to run even if the call timed out. Useful commands in this 
situation are "gnt-job list", "gnt-job cancel" and "gnt-job watch". Error:
Connect timed out

Original issue reported on code.google.com by jhujh...@gmail.com on 8 Dec 2014 at 10:21

GoogleCodeExporter commented 9 years ago

Original comment by hel...@google.com on 9 Dec 2014 at 8:14

GoogleCodeExporter commented 9 years ago
systemd doesn't seem to accept additional arguments when starting a service.
As one possible solution, I tested the following workaround, which passes the 
options to the unit through a temporary file.

* modify /usr/lib/systemd/system/ganeti-wconfd.service
...
[Service]
Type = simple
User = root
Group = root
EnvironmentFile = -/var/lib/ganeti/ganeti-wconfd.options
ExecStart = /usr/sbin/ganeti-wconfd -f $OPTIONS
Restart = on-failure
SuccessExitStatus = 0 11
...

* modify /usr/lib64/ganeti/daemon-util
...
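  if use_systemctl; then
    # Hand the extra arguments to the unit through its EnvironmentFile.
    echo "OPTIONS=$*" > "/var/lib/ganeti/${name}.options"
    systemctl start "${name}.service"
    ret=$?
    # Clean up the temporary options file.
    rm -f "/var/lib/ganeti/${name}.options"
    return $ret
  fi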
...
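
The file-writing step of the workaround above can be tried in isolation. The following is only a sketch: write_options_file is a hypothetical helper name, and it writes into a scratch directory rather than /var/lib/ganeti so that it can be run without touching a real cluster.

```shell
#!/bin/sh
# Sketch only: write_options_file and its arguments are hypothetical names.
# Writes "OPTIONS=<args>" to <dir>/<name>.options, in the form the unit's
# EnvironmentFile setting expects.
write_options_file() {
  dir=$1
  name=$2
  shift 2
  # "$*" joins the remaining arguments with spaces into a single string.
  printf 'OPTIONS=%s\n' "$*" > "${dir}/${name}.options"
}

# Demonstration against a scratch directory instead of /var/lib/ganeti.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
write_options_file "$tmpdir" ganeti-wconfd --force-node --no-voting --yes-do-it
cat "${tmpdir}/ganeti-wconfd.options"
```

In the unit file, the leading "-" in the EnvironmentFile value tells systemd to ignore a missing file, and an unbraced $OPTIONS used as a separate word in ExecStart is split at whitespace into individual arguments. One possible caveat, which I have not verified on a cluster: with Restart = on-failure, an automatic restart would presumably read the environment file again after it has been removed, so the daemon would come back without these options.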

Original comment by jun.futa...@gmail.com on 9 Dec 2014 at 6:09

GoogleCodeExporter commented 9 years ago

Original comment by pud...@google.com on 6 May 2015 at 2:00