edusperoni opened 6 years ago
I'm having a hard time replicating this with a small use case as below:
root@localhost:~# cat /srv/salt/test.sls
test:
  file.managed:
    - name: /tmp/test
    - watch_in:
      - service: apache2

apache2:
  service.running:
    - enable: True
When running the state it restarts the service:
----------
ID: apache2
Function: service.running
Result: True
Comment: Service restarted
Started: 14:31:23.317472
Duration: 2242.205 ms
Changes:
----------
apache2:
True
Summary for local
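For reference, watch_in declared on the file state is just the mirrored form of watch declared on the service, so the state above is equivalent to:

```yaml
test:
  file.managed:
    - name: /tmp/test

apache2:
  service.running:
    - enable: True
    - watch:
      - file: test
```

Either form should trigger the service's mod_watch (a restart) when the file state reports changes.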
I tried capturing what I found in the states you pointed out and applying it to this service. Is the use case the same?
Try running the following commands:
systemctl disable apache2
service apache2 start
I'm not near my computer at this moment, but it should do the trick.
The idea is that systemctl status should report the service as disabled AND running.
I just triple-checked that I ran those commands before running the state, and I am still seeing the service restart. I even changed the state to use zabbix-agent and still cannot replicate.
Here's a detailed use case:
root@localhost:~# systemctl disable zabbix-agent
Synchronizing state for zabbix-agent.service with sysvinit using update-rc.d...
Executing /usr/sbin/update-rc.d zabbix-agent defaults
insserv: warning: current start runlevel(s) (empty) of script `zabbix-agent' overrides LSB defaults (2 3 4 5).
insserv: warning: current stop runlevel(s) (0 1 2 3 4 5 6) of script `zabbix-agent' overrides LSB defaults (0 1 6).
Executing /usr/sbin/update-rc.d zabbix-agent disable
insserv: warning: current start runlevel(s) (empty) of script `zabbix-agent' overrides LSB defaults (2 3 4 5).
insserv: warning: current stop runlevel(s) (0 1 2 3 4 5 6) of script `zabbix-agent' overrides LSB defaults (0 1 6).
root@localhost:~# service zabbix-agent start
root@localhost:~# sudo salt-call --local state.sls test
[WARNING ] State for file: /tmp/test - Neither 'source' nor 'contents' nor 'contents_pillar' nor 'contents_grains' was defined, yet 'replace' was set to 'True'. As there is no source to replace the file with, 'replace' has been set to 'False' to avoid reading the file unnecessarily.
local:
----------
ID: test
Function: file.managed
Name: /tmp/test
Result: True
Comment: Empty file
Started: 14:01:50.789805
Duration: 36.405 ms
Changes:
----------
new:
file /tmp/test created
----------
ID: zabbix-agent
Function: service.running
Result: True
Comment: Service restarted
Started: 14:01:51.760442
Duration: 109.101 ms
Changes:
----------
zabbix-agent:
True
Summary for local
------------
Succeeded: 2 (changed=2)
Failed: 0
------------
Total states run: 2
Total run time: 145.506 ms
I'm wondering if my state is just not set up the same way as your use case.
Oh, I see now: you're using sysvinit, not systemd. Try it using either Ubuntu 16.04 or Debian 8 (or newer).
My understanding is that Debian keeps the services synchronized between both sysvinit and systemd. I even confirmed on the minion that it was running the salt/modules/systemd.py module.
Also, note that I tested on both Debian 8 and Ubuntu 16.04.
That's weird, it happened to both me and another commenter on the issue I linked, and we were able to replicate it by running:
systemctl disable zabbix-agent
apt-get purge zabbix-agent -y
and then applying the formula. (or applying the formula on a clean system)
Can you try with the formula I linked (my fork is on the correct commit, before the workaround)? Just use zabbix.agent.repo and zabbix.agent.conf and this pillar (just in case):
zabbix:
  lookup:
    version_repo: 3.4
Although it should work the same on debian 8, you could try it on debian 9, which is the exact version I'm using.
I can confirm the problem exists. I just checked one more time.
Disable and remove agent:
systemctl disable zabbix-agent.service; systemctl stop zabbix-agent.service; apt-get -y purge zabbix-agent
Install it again:
salt 'testvm*' state.sls zabbix.agent.repo,zabbix.agent.conf
Name: deb http://repo.zabbix.com/zabbix/3.0/debian jessie main - Function: pkgrepo.managed - Result: Clean Started: - 22:00:40.098412 Duration: 30.905 ms
----------
ID: zabbix-agent
Function: pkg.installed
Result: True
Comment: The following packages were installed/updated: zabbix-agent
The following packages were already installed: zabbix-sender
Started: 22:00:40.129466
Duration: 7967.602 ms
Changes:
----------
zabbix-agent:
----------
new:
1:3.0.16-1+jessie
old:
Name: zabbix - Function: group.present - Result: Clean Started: - 22:00:48.098539 Duration: 1.68 ms
Name: zabbix - Function: user.present - Result: Clean Started: - 22:00:48.100595 Duration: 14.298 ms
Name: /var/log/zabbix - Function: file.directory - Result: Clean Started: - 22:00:48.469282 Duration: 0.6 ms
Name: /var/run/zabbix - Function: file.directory - Result: Clean Started: - 22:00:48.470001 Duration: 0.519 ms
----------
ID: /etc/zabbix/zabbix_agentd.conf
Function: file.managed
Result: True
Comment: File /etc/zabbix/zabbix_agentd.conf updated
Started: 22:00:48.470621
Duration: 99.676 ms
Changes:
----------
diff:
---
+++
@@ -1,3 +1,4 @@
+# Managed by saltstack. Do not edit this file.
... snip ...
# TLSPSKFile=
+
----------
ID: zabbix-agent
Function: service.running
Result: True
Comment: Service zabbix-agent has been enabled, and is in the desired state
Started: 22:00:48.570671
Duration: 173.452 ms
Changes:
----------
zabbix-agent:
True
zabbix-agent log file
12498:20180502:220047.523 agent #2 started [listener #1]
12501:20180502:220047.525 active check configuration update from [127.0.0.1:10051] started to fail (cannot connect to [[127.0.0.1]:10051]: [111] Connection refused)
It's still using 127.0.0.1 as the server; the restart was not performed.
I will try to replicate this issue with a simple test state later. Maybe there is some race condition.
Something weird is going on with this state. From what I observed, when the service is dead and needs to be started AND enabled, it only gets started. The next run then enables the service.
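One way to sidestep mod_watch while this is investigated is to drive the restart explicitly with onchanges (an untested sketch; the state ID is illustrative, and m_name is how old-style module.run passes the service name, since name is taken by the module function itself):

```yaml
restart-zabbix-agent:
  module.run:
    - name: service.restart
    - m_name: zabbix-agent
    - onchanges:
      - file: /etc/zabbix/zabbix_agentd.conf
```

Because onchanges does not go through mod_watch, the restart fires whenever the watched file state reports changes, regardless of whether the service state also had to enable the unit.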
Are any of you seeing this with any other services, or only zabbix? Also, are you certain this was working on previous Salt versions?
I'm seeing this with docker.service; not sure if it's a salt, docker, or systemd problem :)
And do you know if it was working on previous versions?
Hello, @Ch3LL!
I'd like to bring this up again. The issue is reproducible and can be simplified to the description below.
Assume we have an .sls with:
/etc/unbound/unbound.conf:
  file.managed:
    - require:
      - pkg: unbound
    - user: unbound
    - contents: |
        include: "/etc/unbound/unbound.conf.d/*.conf"
        #test file changes
and a service state with a watch requisite:
unbound.service:
  service.running:
    - enable: True
    - require:
      - cmd: unbound-systemctl-daemon-reload
      - pkg: unbound
    - watch:
      - file: /etc/systemd/system/unbound.service
      - file: /etc/unbound/unbound.conf
      - file: /etc/unbound/unbound.conf.d
In this case, on applying the state we get:
----------
ID: /etc/unbound/unbound.conf
Function: file.managed
Result: True
Comment: File /etc/unbound/unbound.conf updated
Started: 20:18:22.972941
Duration: 14.977 ms
Changes:
----------
diff:
---
+++
@@ -1 +1,2 @@
include: "/etc/unbound/unbound.conf.d/*.conf"
+#test file changes
...
----------
ID: unbound.service
Function: service.running
Name: unbound
Result: True
Comment: Service unbound has been enabled, and is in the desired state
Started: 20:18:23.218767
Duration: 531.541 ms
Changes:
----------
unbound:
True
--------------------------------
$ systemctl status unbound
● unbound.service - Unbound recursive Domain Name Server
Loaded: loaded (/etc/systemd/system/unbound.service; enabled; vendor preset: enabled)
Drop-In: /run/systemd/generator/unbound.service.d
└─50-insserv.conf-$named.conf, 50-unbound-$named.conf
Active: active (running) since Thu 2019-08-08 16:56:43 MSK; 3h 21min ago
Main PID: 1133 (unbound)
CGroup: /system.slice/unbound.service
└─1133 /usr/sbin/unbound -c /etc/unbound/unbound.conf
and the service hasn't been restarted (note the "active since" timestamp) despite the watch on the config.
When the service is already enabled but we still have changes in the watched config file, applying the state correctly restarts the service:
----------
ID: /etc/unbound/unbound.conf
Function: file.managed
Result: True
Comment: File /etc/unbound/unbound.conf updated
Started: 20:29:21.331920
Duration: 6.818 ms
Changes:
----------
diff:
---
+++
@@ -1,2 +1 @@
include: "/etc/unbound/unbound.conf.d/*.conf"
-#test file changes
...
----------
ID: unbound.service
Function: service.running
Name: unbound
Result: True
Comment: Service restarted
Started: 20:29:21.638848
Duration: 377.89 ms
Changes:
----------
unbound:
True
--------------------------------
$ systemctl status unbound
● unbound.service - Unbound recursive Domain Name Server
Loaded: loaded (/etc/systemd/system/unbound.service; enabled; vendor preset: enabled)
Drop-In: /run/systemd/generator/unbound.service.d
└─50-insserv.conf-$named.conf, 50-unbound-$named.conf
Active: active (running) since Thu 2019-08-08 20:31:09 MSK; 37s ago
Process: 13367 ExecStart=/usr/sbin/unbound -c /etc/unbound/unbound.conf (code=exited, status=0/SUCCESS)
Process: 13360 ExecStartPre=/usr/sbin/unbound-checkconf (code=exited, status=0/SUCCESS)
Process: 13356 ExecStartPre=/usr/sbin/unbound-anchor -a /var/lib/unbound/root.key -v (code=exited, status=0/SUCCESS)
Main PID: 13369 (unbound)
Tasks: 1
Memory: 5.6M
CPU: 36ms
CGroup: /system.slice/unbound.service
└─13369 /usr/sbin/unbound -c /etc/unbound/unbound.conf
I believe the service state should trigger the systemd service's enabling and restarting (or starting, if it was inactive) in the first situation above. It also seems to me that the issue is not specific to this service. We're currently using version 2018.3.2.
Could you take another look at it, please?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.
Ping @bandikoot: thanks for the example, and sorry for the late reply. Thanks to stalebot, I saw this issue today.
I was finally able to replicate this. I'm going to paste my exact steps so I don't have trouble replicating this in the future:
Using this vagrant file:
Vagrant.configure("2") do |config|
  config.vm.box = "debian/buster64"
end
apt-get install unbound
systemctl disable unbound
/etc/unbound/unbound.conf:
  file.managed:
    - user: unbound
    - contents: |
        include: "/etc/unbound/unbound.conf.d/*.conf"
        #test file changes

unbound.service:
  service.running:
    - enable: True
    - watch:
      - file: /etc/unbound/unbound.conf
salt-call --local state.sls test -ldebug
And now I see the issue. I will label this as a bug for us to get fixed. Thanks!
Thank you for updating this issue. It is no longer marked as stale.
Confirmed on:
Given the text in the watch requisite documentation, it seems this is the intended (or at least expected) behavior.
If the watching state changes key contains values, then mod_watch will not be called. If you're using watch or watch_in then it's a good idea to have a state that only enforces one attribute - such as splitting out service.running into its own state and have service.enabled in another.
I haven't dug into the code for this too deeply yet, but does anyone know if this is something that's "fixable" in mod_watch for the service state without fundamentally changing global watch behavior?
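Following the documentation's suggestion, the split-state workaround would look roughly like this (a sketch only; the state IDs are illustrative). With enabling handled by a separate service.enabled state, the service.running state's changes dict stays empty on a first run, so its mod_watch can still fire:

```yaml
unbound-enabled:
  service.enabled:
    - name: unbound

unbound-running:
  service.running:
    - name: unbound
    - watch:
      - file: /etc/unbound/unbound.conf
```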
Description of Issue/Question
When a systemd service is created and started but not enabled, salt does not restart the service on mod_watch:
Some packages (zabbix, specifically) don't enable their services when installed, so Salt only enables the service instead of enabling and restarting it.
This issue is discussed more thoroughly at this PR: https://github.com/saltstack-formulas/zabbix-formula/pull/77
Setup
Formula: https://github.com/edusperoni/zabbix-formula states:
Steps to Reproduce Issue
On a machine that has never installed zabbix-agent, install it using the formula above.
On a machine that has zabbix-agent installed, run
systemctl disable zabbix-agent && apt-get purge zabbix-agent -y
and then apply the state.

Versions Report