saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:
https://repo.saltproject.io/
Apache License 2.0

[BUG] [3007.1] Nonce verification error #66629

Open AppCrashExpress opened 2 weeks ago

Description

Hello!

We are currently trying to update Saltstack to 3007.1.

Some of our minions run on a slower network using the TCP transport. On these minions, salt.exceptions.SaltClientError: Nonce verification error has resurfaced, much like in the previously fixed issue: https://github.com/saltstack/salt/issues/65114

This was tested on commit 2b266935e42780ebd86b3a144d6897f0ae174b2b

Setup

This is a shortened minion configuration, since we use a patched custom installation and I'm not sure what I'm allowed to show, but it should be sufficient given the nature of the issue:

master:
  - salt-master.example.org

ipv6: true
transport: tcp 
random_master: True

log_level: debug

auth_timeout: 10
acceptance_wait_time: 10
auth_tries: 2
random_reauth_delay: 60

Should you find it insufficient, please let me know.

Steps to Reproduce the behavior

  1. Start salt-master and salt-minion on separate machines connected by a slow TCP network (slow enough to cause retries)
  2. Run state.apply from the Salt master

This intermittently causes the following error:

Nonce verification error

```
---------
          ID: lxc_container_copy_salt_config
    Function: file.recurse
        Name: /var/lib/lxc/container/rootfs/etc/salt/
      Result: False
     Comment: An exception occurred in this state: Traceback (most recent call last):
  File "/opt/saltstack/salt/state.p line 2430, in call
    ret = self.states[cdata["full"]](
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/loader/lazy.py", line 161, in __call__
    ret = self.loader.run(run_func, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/loader/lazy.py", line 1283, in run
    return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/loader/lazy.py", line 1298, in _run_as
    return _func_or_method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/loader/lazy.py", line 1331, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/states/file.py", line 4594, in recurse
    source, source_hash = __salt__["file.source_list"](source_list, "", __env__)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/loader/lazy.py", line 161, in __call__
    ret = self.loader.run(run_func, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/loader/lazy.py", line 1283, in run
    return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/loader/lazy.py", line 1298, in _run_as
    return _func_or_method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/modules/file.py", line 4521, in source_list
    mfiles = [(f, saltenv) for f in __salt__["cp.list_master"](saltenv)]
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/loader/lazy.py", line 161, in __call__
    ret = self.loader.run(run_func, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/loader/lazy.py", line 1283, in run
    return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/loader/lazy.py", line 1298, in _run_as
    return _func_or_method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/modules/cp.py", line 770, in list_master
    return client.file_list(saltenv, prefix)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/fileclient.py", line 1366, in file_list
    return self._channel_send(
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/fileclient.py", line 1147, in _channel_send
    return self.channel.send(
           ^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/utils/asynchronous.py", line 139, in wrap
    raise exc_info[1].with_traceback(exc_info[2])
  File "/opt/saltstack/salt/utils/asynchronous.py", line 147, in _target
    result = io_loop.run_sync(lambda: getattr(self.obj, key)(*args, **kwargs))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "contrib/python/tornado/tornado-6/tornado/ioloop.py", line 539, in run_sync
    return future_cell[0].result()
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "contrib/python/tornado/tornado-6/tornado/gen.py", line 780, in run
    yielded = self.gen.throw(exc)
              ^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/channel/client.py", line 340, in send
    ret = yield self._crypted_transfer(load, timeout=timeout, raw=raw)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "contrib/python/tornado/tornado-6/tornado/gen.py", line 767, in run
    value = future.result()
            ^^^^^^^^^^^^^^^
  File "contrib/python/tornado/tornado-6/tornado/gen.py", line 780, in run
    yielded = self.gen.throw(exc)
              ^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/channel/client.py", line 294, in _crypted_transfer
    ret = yield _do_transfer()
          ^^^^^^^^^^^^^^^^^^^^
  File "contrib/python/tornado/tornado-6/tornado/gen.py", line 767, in run
    value = future.result()
            ^^^^^^^^^^^^^^^
  File "contrib/python/tornado/tornado-6/tornado/gen.py", line 786, in run
    yielded = self.gen.send(value)
              ^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/channel/client.py", line 284, in _do_transfer
    data = self.auth.crypticle.loads(data, raw, nonce=nonce)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/saltstack/salt/crypt.py", line 1729, in loads
    raise SaltClientError(f"Nonce verification error {ret_nonce} {nonce}")
salt.exceptions.SaltClientError: Nonce verification error 24c5d713ee2c494aa4f6dc1318478484 ce07fd7bd2144679bf5fdc6a8d7c4044
     Started: 16:38:26.536696
    Duration: 72.635 ms
     Changes:
```
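The traceback above ends in a comparison between the nonce carried by the reply and the nonce of the current request. Below is a minimal sketch of how a retry could trip that check. It assumes that each attempt carries its own fresh nonce and that a late reply to the first attempt is handed to the second attempt. All helper names (`make_request`, `server_reply`, `verify`, `simulate_stale_reply`) are hypothetical and are not Salt APIs; this is a toy model, not the transport's real code.

```python
import uuid


def make_request(payload):
    """Hypothetical helper: attach a fresh nonce to each outgoing attempt."""
    return {"nonce": uuid.uuid4().hex, "payload": payload}


def server_reply(request):
    """Hypothetical stand-in for the master: echo the request's nonce back."""
    return {"nonce": request["nonce"], "payload": "ok"}


def verify(reply, expected_nonce):
    """Reject a reply whose nonce differs from the one we are waiting for."""
    if reply["nonce"] != expected_nonce:
        raise RuntimeError(
            f"Nonce verification error {reply['nonce']} {expected_nonce}"
        )
    return reply["payload"]


def simulate_stale_reply():
    # Attempt 1 is sent, but its reply is delayed past the timeout.
    first = make_request("file_list")
    delayed_reply = server_reply(first)

    # Attempt 2 is the retry and carries a new nonce.
    second = make_request("file_list")

    # Assumption: the delayed reply to attempt 1 is delivered to attempt 2.
    try:
        verify(delayed_reply, second["nonce"])
        return "accepted"
    except RuntimeError:
        return "rejected"


print(simulate_stale_reply())  # prints "rejected"
```

In this model the nonce check itself behaves correctly: it rejects a reply that belongs to a different attempt. The problem is that the stale reply reaches the retried request at all.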

Expected behavior

Retried messages should not cause a nonce verification error.

Versions Report

salt --versions-report

(Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)

```yaml
Salt Version:
  Salt: 3007.1

Python Version:
  Python: 3.12.3 (main, May 13 2024, 10:19:24) [Clang 16.0.6 ]

Dependency Versions:
  cffi: 1.16.0
  cherrypy: Not Installed
  dateutil: 2.9.0.post0
  docker-py: 7.0.0
  gitdb: Not Installed
  gitpython: Not Installed
  Jinja2: 3.1.4
  libgit2: Not Installed
  looseversion: 1.3.0
  M2Crypto: 0.38.0
  Mako: 1.3.3
  msgpack: 1.0.8
  msgpack-pure: Not Installed
  mysql-python: 1.4.6
  packaging: 21.3
  pycparser: 2.22
  pycrypto: Not Installed
  pycryptodome: Not Installed
  pygit2: Not Installed
  python-gnupg: 0.5.2
  PyYAML: 5.4.1
  PyZMQ: 25.1.2
  relenv: Not Installed
  smmap: 5.0.1
  timelib: 0.3.0
  Tornado: 6.4
  ZMQ: 4.1.2

Salt Package Information:
  Package Type: Not Installed

System Versions:
  dist: ubuntu 22.04.2 jammy
  locale: utf-8
  machine: x86_64
  release: 5.4.210-39.1
  system: Linux
  version: Ubuntu 22.04.2 jammy
```

Additional context

The problematic code, originally fixed by https://github.com/saltstack/salt/pull/65247, appears to have been reintroduced in the following snippet: https://github.com/saltstack/salt/blob/v3007.1/salt/transport/tcp.py#L1828-L1840

It seems the fix would be to simply reapply that PR to the new code.
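For illustration only, here is one general way a transport can keep a late reply from reaching a retried request: tag each attempt with its own id, forget the id on timeout, and drop any reply whose id is no longer pending. This is not the change made in the PR linked above, and `PendingReplies` and its methods are hypothetical names, not part of Salt.

```python
import itertools


class PendingReplies:
    """Hypothetical bookkeeping that routes replies by per-attempt id."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}  # message id -> nonce expected for that attempt

    def register(self, nonce):
        """Record a new attempt and return its message id."""
        message_id = next(self._ids)
        self._pending[message_id] = nonce
        return message_id

    def cancel(self, message_id):
        """Forget an attempt, for example after it timed out."""
        self._pending.pop(message_id, None)

    def deliver(self, message_id, reply_nonce):
        """Return the nonce for a live attempt, or None for a stale reply."""
        expected = self._pending.pop(message_id, None)
        if expected is None:
            return None  # reply to a cancelled attempt: drop it silently
        if reply_nonce != expected:
            raise ValueError("nonce mismatch on a live attempt")
        return reply_nonce


pending = PendingReplies()
first_id = pending.register("nonce-a")
pending.cancel(first_id)                       # attempt 1 timed out
second_id = pending.register("nonce-b")        # retry with a new nonce
stale = pending.deliver(first_id, "nonce-a")   # late reply to attempt 1
fresh = pending.deliver(second_id, "nonce-b")  # reply to the retry
print(stale, fresh)  # prints "None nonce-b"
```

With this kind of routing, the nonce check only ever sees replies that belong to the attempt currently waiting, so a slow network alone would not produce a verification error.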