rabbitmq / chef-cookbook

Development repository for Chef cookbook RabbitMQ
https://supermarket.chef.io/cookbooks/rabbitmq
Apache License 2.0

CentOS 7 is broken (in dokken) #435

Closed rmoriz closed 7 years ago

rmoriz commented 7 years ago

(you can skip to the bottom or just read how I waste my weekends…)

service[rabbitmq-server] action start hangs until chef-client exits.

See: https://travis-ci.org/rabbitmq/chef-cookbook/jobs/221400408

I can reproduce this issue on dokken/docker.

rmoriz commented 7 years ago

[root@01b3d8c9ee66 /]# systemctl status rabbitmq-server.service
● rabbitmq-server.service - RabbitMQ broker
   Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: disabled)
   Active: activating (start) since Sat 2017-04-22 14:49:26 UTC; 14min ago
 Main PID: 3169 (beam.smp)
   CGroup: /docker/01b3d8c9ee668b5396d2d374bb279181567a5f73840432192d5bd9bb62b14eea/system.slice/rabbitmq-server.service
           ├─3169 /usr/lib64/erlang/erts-5.10.4/bin/beam.smp -W w -A 64 -P 1048576 -t 5000000 -stbt db -zdbbl 32000 -K true -- -root /usr/li...
           ├─3393 inet_gethost 4
           └─3394 inet_gethost 4
           ‣ 3169 /usr/lib64/erlang/erts-5.10.4/bin/beam.smp -W w -A 64 -P 1048576 -t 5000000 -stbt db -zdbbl 32000 -K true -- -root /usr/li...

Apr 22 14:49:28 01b3d8c9ee66 rabbitmq-server[3169]: ##  ##      Licensed under the MPL.  See http://www.rabbitmq.com/
Apr 22 14:49:28 01b3d8c9ee66 rabbitmq-server[3169]: ##  ##
Apr 22 14:49:28 01b3d8c9ee66 rabbitmq-server[3169]: ##########  Logs: /var/log/rabbitmq/rabbit@01b3d8c9ee66.log
Apr 22 14:49:28 01b3d8c9ee66 rabbitmq-server[3169]: ######  ##        /var/log/rabbitmq/rabbit@01b3d8c9ee66-sasl.log
Apr 22 14:49:28 01b3d8c9ee66 rabbitmq-server[3169]: ##########
Apr 22 14:49:28 01b3d8c9ee66 rabbitmq-server[3169]: Starting broker...
Apr 22 14:49:30 01b3d8c9ee66 rabbitmq-server[3169]: systemd unit for activation check: "-.slice"
Apr 22 14:49:30 01b3d8c9ee66 rabbitmq-server[3169]: Unexpected status from systemd "systemctl: invalid option -- '.'\n"
Apr 22 14:49:30 01b3d8c9ee66 rabbitmq-server[3169]: systemd READY notification failed, beware of timeouts
Apr 22 14:49:30 01b3d8c9ee66 rabbitmq-server[3169]: completed with 0 plugins.

-> the READY notification to systemd fails -> the unit stays in "activating (start)" and startup never completes

rmoriz commented 7 years ago

https://github.com/rabbitmq/rabbitmq-server/blob/rabbitmq_v3_6_8/src/rabbit.erl#L377 executes:

[root@01b3d8c9ee66 /]# systemctl status 3169
● -.slice - Root Slice
   Loaded: loaded (/usr/lib/systemd/system/-.slice; static; vendor preset: disabled)
   Active: active since Sat 2017-04-22 14:34:03 UTC; 33min ago
     Docs: man:systemd.special(7)
   CGroup: /docker/01b3d8c9ee668b5396d2d374bb279181567a5f73840432192d5bd9bb62b14eea
           ├─1 /usr/lib/systemd/systemd
           └─system.slice
             ├─dbus.service
             │ └─63 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
             ├─rabbitmq-server.service
             │ ├─3169 /usr/lib64/erlang/erts-5.10.4/bin/beam.smp -W w -A 64 -P 1048576 -t 5000000 -stbt db -zdbbl 32000 -K true -- -root /us...
             │ ├─3393 inet_gethost 4
             │ └─3394 inet_gethost 4
             ├─system-epmd.slice
             │ └─epmd@0.0.0.0.service
             │   └─3281 /usr/bin/epmd -systemd
             ├─system-getty.slice
             │ └─getty@tty1.service
             │   └─81 /sbin/agetty --noclear tty1 linux
             ├─systemd-logind.service
             │ └─59 /usr/lib/systemd/systemd-logind
             ├─systemd-udevd.service
             │ └─26 /usr/lib/systemd/systemd-udevd
             └─systemd-journald.service
               └─17 /usr/lib/systemd/systemd-journald

=> "-.slice"

rmoriz commented 7 years ago

https://github.com/rabbitmq/rabbitmq-server/blob/rabbitmq_v3_6_8/src/rabbit.erl#L400

executes

[root@01b3d8c9ee66 /]# systemctl show --property=ActiveState -.slice
systemctl: invalid option -- '.'

rmoriz commented 7 years ago

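Bug in RabbitMQ. They don't do proper shell-escaping, so the leading "-" of "-.slice" is parsed as an option. With the dash escaped, systemctl at least accepts the argument: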

systemctl show --property=ActiveState \\-.slice
ActiveState=inactive
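
A more conventional way to pass a unit name that starts with a dash is the "--" end-of-options marker, which getopt-style tools such as systemctl honor. The following is only a sketch, not what RabbitMQ or #436 does: the function name unit_active_state and the SYSTEMCTL override variable are hypothetical, introduced here so the function can be exercised on a machine without systemd.

```shell
#!/bin/sh
# Sketch: query a unit's ActiveState even when its name starts with "-".
# SYSTEMCTL is a hypothetical override (not part of systemd or RabbitMQ);
# it defaults to the real systemctl binary.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

unit_active_state() {
    # "--" ends option parsing, so "-.slice" is treated as a unit name
    # instead of as the (invalid) option "-.".
    "$SYSTEMCTL" show --property=ActiveState -- "$1"
}
```

On a systemd host, calling unit_active_state "-.slice" should then query the root slice itself rather than fail with "invalid option".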

rmoriz commented 7 years ago

#436 makes RabbitMQ start and report readiness to systemd using socat/sd_notify.

BUT

tests are still broken due to a 2.5-year-old serverspec version that relies on netstat (deprecated in CentOS 7) to retrieve port bindings and also breaks on non-ASCII content (here, the "●" bullet in the systemctl status output)

😭

I'll try to convert them to InSpec and provide a PR tomorrow.

rmoriz commented 7 years ago

Will continue in #434