Closed edgan closed 2 years ago
I have seen this go both ways, false positives and false negatives.Which makes sense, because it is a caching problem. What makes it really dangerous is you could merge bad pillars, even with testing, because you got a false positive.
@saltstack/team-core Any ideas on this one?
I guess it is somehow related to #31907 .. I've the same issue with salt, edit a pillar, push to git, did a refresh (also restart the minion) but it sill doesn't show the (new) pillar value..
salt -V
Salt Version:
Salt: 3003
Dependency Versions:
cffi: Not Installed
cherrypy: Not Installed
dateutil: 2.7.3
docker-py: Not Installed
gitdb: 2.0.5
gitpython: 2.1.11
Jinja2: 2.10
libgit2: Not Installed
M2Crypto: Not Installed
Mako: Not Installed
msgpack: 0.5.6
msgpack-pure: Not Installed
mysql-python: Not Installed
pycparser: Not Installed
pycrypto: 2.6.1
pycryptodome: 3.6.1
pygit2: Not Installed
Python: 3.7.3 (default, Jan 22 2021, 20:04:44)
python-gnupg: Not Installed
PyYAML: 3.13
PyZMQ: 17.1.2
smmap: 2.0.5
timelib: Not Installed
Tornado: 4.5.3
ZMQ: 4.3.1
System Versions:
dist: debian 10 buster
locale: UTF-8
machine: x86_64
release: 4.19.0-16-amd64
system: Linux
version: Debian GNU/Linux 10 buster
when I run the salt minion in debug mode I got this: refresh pillar:
[INFO ] User sudo_BOFH Executing command saltutil.refresh_pillar with jid 20210622082604472251
[DEBUG ] Command details {'fun': 'saltutil.refresh_pillar', 'arg': [], 'tgt': 'salt-minion-1.bofh.home.de', 'jid': '20210622082604472251', 'ret': '', 'tgt_type': 'glob', 'user': 'sudo_BOFH'}
[DEBUG ] Subprocess ProcessPayload-Job-20210622082604472251 added
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] LazyLoaded jinja.render
[DEBUG ] LazyLoaded yaml.render
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[INFO ] Starting a new job 20210622082604472251 with PID 25374
[DEBUG ] Could not LazyLoad direct_call.allow_missing_func: 'direct_call.allow_missing_func' is not available.
[DEBUG ] LazyLoaded saltutil.refresh_pillar
[DEBUG ] LazyLoaded event.fire
[DEBUG ] SaltEvent PUB socket URI: /var/run/salt/minion/minion_event_41ef1aaeeb_pub.ipc
[DEBUG ] SaltEvent PULL socket URI: /var/run/salt/minion/minion_event_41ef1aaeeb_pull.ipc
[DEBUG ] Sending event: tag = pillar_refresh; data = {'_stamp': '2021-06-22T08:26:04.556245'}
[DEBUG ] Minion of 'salt-test' is handling event tag 'pillar_refresh'
[DEBUG ] Refreshing modules. Notify=False
[DEBUG ] Grains refresh requested. Refreshing grains.
[DEBUG ] Reading configuration from /etc/salt/minion
[DEBUG ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Closing IPCMessageClient instance
[DEBUG ] Minion return retry timer set to 9 seconds (randomized)
[INFO ] Returning information for job: 20210622082604472251
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'salt-minion-1.bofh.home.de', 'tcp://172.16.xxx.xx:4506', 'aes')
[DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'salt-minion-1.bofh.home.de', 'tcp://172.16.xxx.xx:4506')
[DEBUG ] Connecting the Minion to the Master URI (for the return server): tcp://172.16.xxx.xx:4506
[DEBUG ] Trying to connect to: tcp://172.16.xxx.xx:4506
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Closing AsyncZeroMQReqChannel instance
[DEBUG ] minion return: {'success': True, 'return': True, 'retcode': 0, 'jid': '20210622082604472251', 'fun': 'saltutil.refresh_pillar', 'fun_args': []}
[DEBUG ] Override __utils__: <module 'salt.loaded.int.grains.zfs' from '/usr/lib/python3/dist-packages/salt/grains/zfs.py'>
[DEBUG ] /etc/resolv.conf: The domain and search keywords are mutually exclusive.
[DEBUG ] Unable to resolve address fe80::250:56ff:fe8c:1936: [Errno 1] Unknown host
[DEBUG ] Elapsed time getting FQDNs: 0.10307979583740234 seconds
[DEBUG ] LazyLoaded zfs.is_supported
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Refreshing pillar.
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'salt-minion-1.bofh.home.de', 'tcp://172.16.xxx.xx:4506', 'aes')
[DEBUG ] Re-using AsyncAuth for ('/etc/salt/pki/minion', 'salt-minion-1.bofh.home.de', 'tcp://172.16.xxx.xx:4506')
[DEBUG ] Connecting the Minion to the Master URI (for the return server): tcp://172.16.xxx.xx:4506
[DEBUG ] Trying to connect to: tcp://172.16.xxx.xx:4506
[DEBUG ] LazyLoaded config.merge
[DEBUG ] LazyLoaded mine.update
[DEBUG ] Subprocess ProcessPayload-Job-20210622082604472251 cleaned up
[DEBUG ] salt.crypt.get_rsa_key: Loading private key
[DEBUG ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG ] Closing AsyncZeroMQReqChannel instance
[DEBUG ] Refreshing matchers.
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Refreshing beacons.
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] SaltEvent PUB socket URI: /var/run/salt/minion/minion_event_41ef1aaeeb_pub.ipc
[DEBUG ] SaltEvent PULL socket URI: /var/run/salt/minion/minion_event_41ef1aaeeb_pull.ipc
[DEBUG ] Sending event: tag = /salt/minion/minion_pillar_refresh_complete; data = {'complete': True, '_stamp': '2021-06-22T08:26:05.363962'}
[DEBUG ] Closing IPCMessageClient instance
[DEBUG ] Minion of 'salt-test' is handling event tag '/salt/minion/minion_pillar_refresh_complete'
ask for pillars: (pillar.items)
[DEBUG ] LazyLoaded glob_match.match
[INFO ] User sudo_BOFH Executing command pillar.item with jid 20210622082832819002
[DEBUG ] Command details {'fun': 'pillar.item', 'arg': [], 'tgt': 'salt-minion-1*', 'jid': '20210622082832819002', 'ret': '', 'tgt_type': 'glob', 'user': 'sudo_BOFH'}
[DEBUG ] Subprocess ProcessPayload-Job-20210622082832819002 added
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] LazyLoaded jinja.render
[DEBUG ] LazyLoaded yaml.render
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[INFO ] Starting a new job 20210622082832819002 with PID 25467
[DEBUG ] Could not LazyLoad direct_call.allow_missing_func: 'direct_call.allow_missing_func' is not available.
[DEBUG ] LazyLoaded pillar.item
[DEBUG ] Minion return retry timer set to 8 seconds (randomized)
[INFO ] Returning information for job: 20210622082832819002
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'salt-minion-1.bofh.home.de', 'tcp://172.16.xxx.xx:4506', 'aes')
[DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'salt-minion-1.bofh.home.de', 'tcp://172.16.xxx.xx:4506')
[DEBUG ] Connecting the Minion to the Master URI (for the return server): tcp://172.16.xxx.xx:4506
[DEBUG ] Trying to connect to: tcp://172.16.xxx.xx:4506
[DEBUG ] Closing AsyncZeroMQReqChannel instance
[DEBUG ] minion return: {'success': True, 'return': {}, 'retcode': 0, 'jid': '20210622082832819002', 'fun': 'pillar.item', 'fun_args': []}
[DEBUG ] Subprocess ProcessPayload-Job-20210622082832819002 cleaned up
ask for specific pillar:
[INFO ] User sudo_BOFH Executing command pillar.items with jid 20210622083204983233
[DEBUG ] Command details {'fun': 'pillar.items', 'arg': ['roles'], 'tgt': 'salt-minion-1*', 'jid': '20210622083204983233', 'ret': '', 'tgt_type': 'glob', 'user': 'sudo_BOFH'}
[DEBUG ] Subprocess ProcessPayload-Job-20210622083204983233 added
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] LazyLoaded jinja.render
[DEBUG ] LazyLoaded yaml.render
[DEBUG ] Using pkg_resources to load entry points
[DEBUG ] Using pkg_resources to load entry points
[INFO ] Starting a new job 20210622083204983233 with PID 25579
[DEBUG ] Could not LazyLoad direct_call.allow_missing_func: 'direct_call.allow_missing_func' is not available.
[DEBUG ] LazyLoaded pillar.items
[DEBUG ] Minion return retry timer set to 6 seconds (randomized)
[INFO ] Returning information for job: 20210622083204983233
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'salt-minion-1.bofh.home.de', 'tcp://172.16.xxx.xx:4506', 'aes')
[DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'salt-minion-1.bofh.home.de', 'tcp://172.16.xxx.xx:4506')
[DEBUG ] Connecting the Minion to the Master URI (for the return server): tcp://172.16.xxx.xx:4506
[DEBUG ] Trying to connect to: tcp://172.16.xxx.xx:4506
[DEBUG ] Closing AsyncZeroMQReqChannel instance
[DEBUG ] minion return: {'success': True, 'return': {'roles': ''}, 'retcode': 0, 'jid': '20210622083204983233', 'fun': 'pillar.items', 'fun_args': ['roles']}
[DEBUG ] Subprocess ProcessPayload-Job-20210622083204983233 cleaned up
actual behavior:
salt "salt-minion-1*" pillar.items roles
salt-minion-1.bofh.home.de:
----------
roles:
expected behavior:
salt "salt-minion-1*" pillar.items roles
salt-minion-1.bofh.home.de:
----------
roles: webserver
Description Salt-minion pillars don't always refresh when they should.
This is trickery to reproduce and seems to involve a race condition. The steps to reproduce have to be executed very fast.
I am pretty sure from testing that this is caused by changes in this PR, https://github.com/saltstack/salt/pull/54942 . I tried reverted the patches and the issue went away.
Steps to Reproduce the behavior Locally:
Master:
Locally:
Master:
Expected behavior The error on the first state.apply is expected. There is a pillar syntax error. The error on the second state.apply is not expected, because the syntax has been fixed. The git_pillar.update has pulled in new pillars as we can see by the True for salt-pillars. When we then refresh_pillar for that minion we get the correct result we should have gotten on the second state.apply.
Versions Report