saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Install Salt from the Salt package repositories here:
https://docs.saltproject.io/salt/install-guide/en/latest/
Apache License 2.0
14.19k stars 5.48k forks source link

[BUG] Syndics are not able to start 3007 #66237

Open monkz opened 8 months ago

monkz commented 8 months ago

Description A clear and concise description of what the bug is.

Assertion Error on https://github.com/saltstack/salt/blob/3007.x/salt/minion.py#L3698 ff set_event_handler does not return a future therefore crashes salt-syndic on self.io_loop.add_future(future, self.reconnect_event_bus)

Setup (Please provide relevant configs and/or SLS files (be sure to remove sensitive info. There is no general set-up of Salt.)

Please be as specific as possible and give set-up details.

Steps to Reproduce the behavior (Include debug logs if possible and relevant)

just try to start salt-syndic

Expected behavior A clear and concise description of what you expected to happen.

no crash

Screenshots

Traceback (most recent call last):
  File "/usr/bin/salt-syndic", line 11, in <module>
    sys.exit(salt_syndic())
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/scripts.py", line 401, in salt_syndic
    syndic.start()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/cli/daemons.py", line 621, in start
    self.syndic.tune_in()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py", line 3720, in tune_in
    self.io_loop.add_future(future, self.reconnect_event_bus)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/tornado/ioloop.py", line 688, in add_future
    assert is_future(future)
AssertionError
Traceback (most recent call last):
  File "/usr/bin/salt-syndic", line 11, in <module>
    sys.exit(salt_syndic())
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/scripts.py", line 401, in salt_syndic
    syndic.start()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/cli/daemons.py", line 621, in start
    self.syndic.tune_in()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py", line 3720, in tune_in
    self.io_loop.add_future(future, self.reconnect_event_bus)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/tornado/ioloop.py", line 688, in add_future
    assert is_future(future)
AssertionError

Versions Report

salt --versions-report (Provided by running salt --versions-report. Please also mention any differences in master/minion versions.) ```yaml Salt Version: Salt: 3007.0 Python Version: Python: 3.10.13 (main, Feb 19 2024, 03:31:20) [GCC 11.2.0] Dependency Versions: cffi: 1.16.0 cherrypy: unknown dateutil: 2.8.2 docker-py: Not Installed gitdb: Not Installed gitpython: Not Installed Jinja2: 3.1.3 libgit2: 1.7.2 looseversion: 1.3.0 M2Crypto: Not Installed Mako: Not Installed msgpack: 1.0.7 msgpack-pure: Not Installed mysql-python: Not Installed packaging: 23.1 pycparser: 2.21 pycrypto: Not Installed pycryptodome: 3.19.1 pygit2: 1.14.1 python-gnupg: 0.5.2 PyYAML: 6.0.1 PyZMQ: 25.1.2 relenv: 0.15.1 smmap: Not Installed timelib: 0.3.0 Tornado: 6.3.3 ZMQ: 4.3.4 Salt Package Information: Package Type: onedir System Versions: dist: debian 12.5 bookworm locale: utf-8 machine: x86_64 release: 6.1.0-17-amd64 system: Linux version: Debian GNU/Linux 12.5 bookworm ```

Additional context Add any other context about the problem here.

welcome[bot] commented 8 months ago

Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey. Please be sure to review our Code of Conduct. Also, check out some of our community resources including:

There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar. If you have additional questions, email us at saltproject@vmware.com. We’re glad you’ve joined our community and look forward to doing awesome things with you!

nee2c commented 7 months ago

Just to add, I had the same issue with running 3007.0. I downgrade to 3006.7 with no other changes and works without issue.

Here is some debug logs

[DEBUG ] Reading configuration from /etc/salt/master [DEBUG ] Including configuration from '/etc/salt/master.d/f_defaults.conf' [DEBUG ] Reading configuration from /etc/salt/master.d/f_defaults.conf [DEBUG ] Including configuration from '/etc/salt/master.d/syndic.conf' [DEBUG ] Reading configuration from /etc/salt/master.d/syndic.conf [DEBUG ] Missing configuration file: /etc/salt/minion [DEBUG ] Guessing ID. The id can be explicitly set in /etc/salt/minion [DEBUG ] Found minion id from generate_minion_id(): xxxxxxxxxxxxxx [WARNING ] Insecure logging configuration detected! Sensitive data may be logged. [DEBUG ] Configuration file path: /etc/salt/minion [INFO ] Setting up "xxxxxxxxxxxxxxxxxxxxxxxxxxxx" the Salt Syndic [DEBUG ] Marking 'base64_encode' as a jinja filter [DEBUG ] Marking 'base64_decode' as a jinja filter [DEBUG ] Marking 'md5' as a jinja filter [DEBUG ] Marking 'sha1' as a jinja filter [DEBUG ] Marking 'sha256' as a jinja filter [DEBUG ] Marking 'sha512' as a jinja filter [DEBUG ] Marking 'hmac' as a jinja filter [DEBUG ] Marking 'hmac_compute' as a jinja filter [DEBUG ] Marking 'random_hash' as a jinja filter [DEBUG ] Marking 'rand_str' as a jinja filter [DEBUG ] Marking 'file_hashsum' as a jinja filter [DEBUG ] Marking 'http_query' as a jinja filter [DEBUG ] Marking 'ifelse' as a jinja global [DEBUG ] Marking 'strftime' as a jinja filter [DEBUG ] Marking 'date_format' as a jinja filter [DEBUG ] Marking 'raise' as a jinja global [DEBUG ] Marking 'match' as a jinja test [DEBUG ] Marking 'equalto' as a jinja test [DEBUG ] Marking 'skip' as a jinja filter [DEBUG ] Marking 'sequence' as a jinja filter [DEBUG ] Marking 'to_bool' as a jinja filter [DEBUG ] Marking 'indent' as a jinja filter [DEBUG ] Marking 'tojson' as a jinja filter [DEBUG ] Marking 'quote' as a jinja filter [DEBUG ] Marking 'regex_escape' as a jinja filter [DEBUG ] Marking 'regex_search' as a jinja filter [DEBUG ] Marking 'regex_match' as a jinja filter [DEBUG ] Marking 'regex_replace' as a jinja filter [DEBUG ] Marking 'uuid' as a jinja filter [DEBUG ] Marking 'unique' as a jinja filter [DEBUG ] Marking 'min' as a jinja filter [DEBUG ] Marking 'max' as a jinja filter [DEBUG ] Marking 'avg' as a jinja filter [DEBUG ] Marking 'union' as a jinja filter [DEBUG ] Marking 'intersect' as a jinja filter [DEBUG ] Marking 'difference' as a jinja filter [DEBUG ] Marking 'symmetric_difference' as a jinja filter [DEBUG ] Marking 'method_call' as a jinja filter [547/1943][DEBUG ] Marking 'yaml_dquote' as a jinja filter [DEBUG ] Marking 'yaml_squote' as a jinja filter [DEBUG ] Marking 'yaml_encode' as a jinja filter [DEBUG ] Missing configuration file: /etc/salt/minion [DEBUG ] Including configuration from '/etc/salt/master.d/f_defaults.conf' [DEBUG ] Reading configuration from /etc/salt/master.d/f_defaults.conf [DEBUG ] Including configuration from '/etc/salt/master.d/syndic.conf' [DEBUG ] Reading configuration from /etc/salt/master.d/syndic.conf [DEBUG ] Grains refresh requested. Refreshing grains. [DEBUG ] Missing configuration file: /etc/salt/minion [DEBUG ] The functions from module 'core' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'disks' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'extra' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'lvm' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'mdadm' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'minion_process' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'opts' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'package' are being loaded by dir() on the loaded module [DEBUG ] Override utils: <module 'salt.loaded.int.grains.zfs' from '/opt/saltstack/salt/lib/python3.10/site-packages/salt/grains/zfs.py'> [DEBUG ] The functions from module 'zfs' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'zfs' are being loaded by dir() on the loaded module [DEBUG ] LazyLoaded zfs.is_supported [DEBUG ] The functions from module 'jinja' are being loaded by dir() on the loaded module [DEBUG ] LazyLoaded jinja.render [DEBUG ] The functions from module 'yaml' are being loaded by dir() on the loaded module [DEBUG ] LazyLoaded yaml.render [DEBUG ] Using selector: EpollSelector [DEBUG ] Created pidfile: /var/run/salt-syndic.pid [INFO ] Starting up the Salt Syndic [DEBUG ] Syndic attempting to connect to xxxxxxxxxxxxxxxxx [DEBUG ] Grains refresh requested. Refreshing grains. [DEBUG ] Missing configuration file: /etc/salt/minion [DEBUG ] The functions from module 'core' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'disks' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'extra' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'lvm' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'mdadm' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'minion_process' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'opts' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'package' are being loaded by dir() on the loaded module [DEBUG ] Override utils: <module 'salt.loaded.int.grains.zfs' from '/opt/saltstack/salt/lib/python3.10/site-packages/salt/grains/zfs.py'> [DEBUG ] The functions from module 'zfs' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'zfs' are being loaded by dir() on the loaded module [DEBUG ] LazyLoaded zfs.is_supported [INFO ] Creating minion process manager [DEBUG ] Missing configuration file: /etc/salt/minion [DEBUG ] Including configuration from '/etc/salt/master.d/f_defaults.conf' [DEBUG ] Reading configuration from /etc/salt/master.d/f_defaults.conf [DEBUG ] Including configuration from '/etc/salt/master.d/syndic.conf' [498/1943][DEBUG ] Reading configuration from /etc/salt/master.d/syndic.conf [DEBUG ] Grains refresh requested. Refreshing grains. [DEBUG ] Missing configuration file: /etc/salt/minion [DEBUG ] The functions from module 'core' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'disks' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'extra' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'lvm' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'mdadm' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'minion_process' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'opts' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'package' are being loaded by dir() on the loaded module [DEBUG ] Override utils: <module 'salt.loaded.int.grains.zfs' from '/opt/saltstack/salt/lib/python3.10/site-packages/salt/grains/zfs.py'> [DEBUG ] The functions from module 'zfs' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'zfs' are being loaded by dir() on the loaded module [DEBUG ] LazyLoaded zfs.is_supported [DEBUG ] The functions from module 'jinja' are being loaded by dir() on the loaded module [DEBUG ] LazyLoaded jinja.render [DEBUG ] The functions from module 'yaml' are being loaded by dir() on the loaded module [DEBUG ] LazyLoaded yaml.render [DEBUG ] Connecting to master. Attempt 1 of 1 [DEBUG ] "xxxxxxxxxxxxxxxxxxxx" Not an IP address? Assuming it is a hostname. [DEBUG ] Master URI: tcp://x.x.x.x:4506 [DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'xxxxxxxxxxxxxxxx', 'tcp://x.x.x.x:4506') [DEBUG ] Generated random reconnect delay between '2026ms' and '12026ms' (2026) [DEBUG ] Setting zmq_reconnect_ivl to '2026ms' [DEBUG ] Setting zmq_reconnect_ivl_max to '12026ms' [DEBUG ] salt.crypt.get_rsa_key: Loading private key [DEBUG ] salt.crypt._get_key_with_evict: Loading private key [DEBUG ] Loaded minion key: /etc/salt/pki/minion/minion.pem [DEBUG ] Syndic attempting to connect to xxxxxxxxxxxxxxxxxxx [DEBUG ] Grains refresh requested. Refreshing grains. [DEBUG ] Missing configuration file: /etc/salt/minion [DEBUG ] The functions from module 'core' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'disks' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'extra' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'lvm' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'mdadm' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'minion_process' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'opts' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'package' are being loaded by dir() on the loaded module [DEBUG ] Override utils: <module 'salt.loaded.int.grains.zfs' from '/opt/saltstack/salt/lib/python3.10/site-packages/salt/grains/zfs.py'> [DEBUG ] The functions from module 'zfs' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'zfs' are being loaded by dir() on the loaded module [DEBUG ] LazyLoaded zfs.is_supported [INFO ] Creating minion process manager [DEBUG ] Missing configuration file: /etc/salt/minion [DEBUG ] Including configuration from '/etc/salt/master.d/f_defaults.conf' [DEBUG ] Reading configuration from /etc/salt/master.d/f_defaults.conf [DEBUG ] Including configuration from '/etc/salt/master.d/syndic.conf' [449/1943][DEBUG ] Reading configuration from /etc/salt/master.d/syndic.conf [DEBUG ] Grains refresh requested. Refreshing grains. [DEBUG ] Missing configuration file: /etc/salt/minion [DEBUG ] The functions from module 'core' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'disks' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'extra' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'lvm' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'mdadm' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'minion_process' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'opts' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'package' are being loaded by dir() on the loaded module [DEBUG ] Override utils: <module 'salt.loaded.int.grains.zfs' from '/opt/saltstack/salt/lib/python3.10/site-packages/salt/grains/zfs.py'> [DEBUG ] The functions from module 'zfs' are being loaded by dir() on the loaded module [DEBUG ] The functions from module 'zfs' are being loaded by dir() on the loaded module [DEBUG ] LazyLoaded zfs.is_supported [DEBUG ] The functions from module 'jinja' are being loaded by dir() on the loaded module [DEBUG ] LazyLoaded jinja.render [DEBUG ] The functions from module 'yaml' are being loaded by dir() on the loaded module [DEBUG ] LazyLoaded yaml.render [DEBUG ] Connecting to master. Attempt 1 of 1 [DEBUG ] "xxxxxxxxxxxxxxxx" Not an IP address? Assuming it is a hostname. [DEBUG ] Master URI: tcp://x.x.x.x:4506 [DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'xxxxxxxxxxxxxxxxxxxxxxxxx', 'tcp://x.x.x.x:4506') [DEBUG ] Generated random reconnect delay between '1136ms' and '11136ms' (1136) [DEBUG ] Setting zmq_reconnect_ivl to '1136ms' [DEBUG ] Setting zmq_reconnect_ivl_max to '11136ms' [DEBUG ] salt.crypt.get_rsa_key: Loading private key [DEBUG ] Loaded minion key: /etc/salt/pki/minion/minion.pem [DEBUG ] Reading configuration from /etc/salt/master [DEBUG ] Including configuration from '/etc/salt/master.d/f_defaults.conf' [DEBUG ] Reading configuration from /etc/salt/master.d/f_defaults.conf [DEBUG ] Including configuration from '/etc/salt/master.d/syndic.conf' [DEBUG ] Reading configuration from /etc/salt/master.d/syndic.conf [DEBUG ] Missing configuration file: /root/.saltrc [DEBUG ] SyndicManager 'xxxxxxxxxxxxxxxxxxxxxx' trying to tune in [ERROR ] An un-handled exception was caught by Salt's global exception handler: AssertionError: Traceback (most recent call last): File "/usr/bin/salt-syndic", line 11, in sys.exit(salt_syndic()) File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/scripts.py", line 401, in salt_syndic syndic.start() File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/cli/daemons.py", line 621, in start self.syndic.tune_in() File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py", line 3719, in tune_in self.io_loop.add_future(future, self.reconnect_event_bus) File "/opt/saltstack/salt/lib/python3.10/site-packages/tornado/ioloop.py", line 688, in add_future assert is_future(future) AssertionError Traceback (most recent call last): File "/usr/bin/salt-syndic", line 11, in sys.exit(salt_syndic()) File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/scripts.py", line 401, in salt_syndic syndic.start() File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/cli/daemons.py", line 621, in start self.syndic.tune_in() File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/minion.py", line 3719, in tune_in self.io_loop.add_future(future, self.reconnect_event_bus) File "/opt/saltstack/salt/lib/python3.10/site-packages/tornado/ioloop.py", line 688, in add_future assert is_future(future) AssertionError

iMikeG6 commented 7 months ago

Hi there,

I have the exact same error message, same version. 3007.0 seems to have a bug with tornado python module.

In file /opt/saltstack/salt/lib/python3.10/site-packages/tornado/ioloop.py withing function def add_future if you're commenting the following lines:

        else:
            assert is_future(future)
            # For concurrent futures, we use self.add_callback, so
            # it's fine if future_add_done_callback inlines that call.
            future_add_done_callback(
                future, lambda f: self.add_callback(callback, future)
            )

Then run

salt-syndic -l debug

It runs but with errors, which is expected due to the fact that the mentioned lines are commented

[ERROR   ] Future exception was never retrieved

I found a similar issue (assert is_future) with tornado: https://github.com/tornadoweb/tornado/issues/2753 However, could not figure out how to solve this problem.

markschuh commented 7 months ago

Agreed - your proposed change allows the salt-syndic process itself to start and to successfully connect to its syndic_master. So far so good. But the asynchronous loop commented out seems to handle the syndic_cmd callback. And without this there is no syndic functionality working in v3007.0. At least that is the result of my own tests.

So there seems to be more broken in the "syndic" functionality inside v3007.0 than just the salt-syndic startup. It looks to me, the issue is related to the following changes:

The second PR has changed the return value of set_event_handler() inside salt/utils/event.py

old: self.subscriber.on_recv(event_handler)
new:  self.io_loop.spawn_callback(self.subscriber.on_recv, event_handler)

The former return value was a Future. The new is a None. The set_event_handler function is not only used by the "SyndicManager" object, but also by the "MinionManager" object. Due to modernizing of the asynchronous code within the "MinionManager" the above change was introduced. But the "SyndicManager" and "SynDic" class do not seem to have gotten a similar redesign yet. This seems to be the currently missing part to me.

I have tried some fixes - but had no success yet,

Some more logs about the part inside salt-syndic, that is broken now:

# Lab Scenario:  3 docker containers started via compose - based on ubuntu/22.04 with salt v3007.0

salt-minion    --- (is minion-connected to) -->   salt-master  (is syndic-connected to) -->  salt-syndicmaster

The command run on salt-syndicmaster was:

salt '*' test.ping

The errors from "salt-syndic -l trace" - started via docker exec on salt-master, which is running the salt-master

[TRACE   ] Forwarding events
[TRACE   ] Decoding payload: {'enc': 'aes', 'load': ... }
[DEBUG   ] PubChannel received: {'enc': 'aes', 'load': {'fun': 'test.ping', 'arg': [], 'tgt': '*', 'jid': '20240406163620418445', 'ret': '', 'tgt_type': 'glob', 'to': 5, 'user': 'root', 'auth_list': []}, 'sig': ...} <bound method Syndic._process_cmd_socket of <salt.minion.Syndic object at 0x7f77bdd1a800>>
[TRACE   ] Handling payload
[TRACE   ] ReqChannel send clear load={'cmd': 'publish', 'tgt': '*', 'fun': 'test.ping', 'arg': [], 'key': 'iITndEIycmlR/wYHZD8aZ+DQNHBLqvOsoRGnK//emEuQ0xwkmCAUf58GJjXnkdExrritvb2OmqA=', 'tgt_type': 'glob', 'ret': '', 'jid': '20240406163620418445', 'kwargs': {'callback': <function Syndic.syndic_cmd.<locals>.<lambda> at 0x7f77bdd031c0>, 'auth_list': [], 'user': 'root'}, 'user': 'root'}
[ERROR   ] Exception while running callback
Traceback (most recent call last):
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/transport/zeromq.py", line 394, in consume
    await callback(msg)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/channel/client.py", line 484, in wrap_callback
    await callback(decoded)
TypeError: object NoneType can't be used in 'await' expression
[TRACE   ] Failed to send msg TypeError("can not serialize 'function' object")
[TRACE   ] ReqChannel send clear load={'cmd': 'publish', 'tgt': '*', 'fun': 'test.ping', 'arg': [], 'key': 'iITndEIycmlR/wYHZD8aZ+DQNHBLqvOsoRGnK//emEuQ0xwkmCAUf58GJjXnkdExrritvb2OmqA=', 'tgt_type': 'glob', 'ret': '', 'jid': '20240406163620418445', 'kwargs': {'callback': <function Syndic.syndic_cmd.<locals>.<lambda> at 0x7f77bdd031c0>, 'auth_list': [], 'user': 'root'}, 'user': 'root'}
[TRACE   ] Failed to send msg TypeError("can not serialize 'function' object")
[DEBUG   ] Closing AsyncReqChannel instance
[ERROR   ] Future exception was never retrieved
future: <Future finished exception=TypeError("can not serialize 'function' object")>
Traceback (most recent call last):
  File "/opt/saltstack/salt/lib/python3.10/site-packages/tornado/gen.py", line 780, in run
    yielded = self.gen.throw(exc)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/client/__init__.py", line 2011, in pub_async
    payload = yield channel.send(payload_kwargs, timeout=timeout)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/tornado/gen.py", line 767, in run
    value = future.result()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/tornado/gen.py", line 780, in run
    yielded = self.gen.throw(exc)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/channel/client.py", line 337, in send
    ret = yield self._uncrypted_transfer(load, timeout=timeout)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/tornado/gen.py", line 767, in run
    value = future.result()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/tornado/gen.py", line 780, in run
    yielded = self.gen.throw(exc)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/channel/client.py", line 309, in _uncrypted_transfer
    ret = yield self.transport.send(
  File "/opt/saltstack/salt/lib/python3.10/site-packages/tornado/gen.py", line 767, in run
    value = future.result()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/transport/zeromq.py", line 1147, in send
    return await asyncio.wait_for(self._send_recv(load), timeout=timeout)
  File "/opt/saltstack/salt/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
    return fut.result()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/transport/zeromq.py", line 1128, in _send_recv
    message = salt.payload.dumps(message)
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/payload.py", line 172, in dumps
    return salt.utils.msgpack.packb(
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/msgpack.py", line 129, in packb
    return msgpack.packb(o, **_sanitize_msgpack_kwargs(kwargs))
  File "/opt/saltstack/salt/lib/python3.10/site-packages/msgpack/__init__.py", line 36, in packb
    return Packer(**kwargs).pack(o)
  File "msgpack/_packer.pyx", line 294, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 300, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 297, in msgpack._cmsgpack.Packer.pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
  File "msgpack/_packer.pyx", line 291, in msgpack._cmsgpack.Packer._pack
TypeError: can not serialize 'function' object
[TRACE   ] Forwarding events
...
markschuh commented 7 months ago

I was able to at least work around the above issues. See the patch at https://gist.github.com/markschuh/fab749f7d7940d1fc5eba0cb57b10885

But the syndic functionality is still missing somehow. The workaround is not complete - or my code may even have broken the step to forward job results - visible on the master - back to the syndic_master

In v3006.7 the trace log of salt-syndic had this pattern after salt command launch on the syndic_master:

...
[TRACE   ] Handling payload
[TRACE   ] ReqChannel send clear load={'cmd': 'publish', 'tgt': '*', 'fun': 'test.ping', 'arg': [], 'key': 'q25fdgM9/FJ00pCAb15iWJtfzLwLjlLOOpuOA8hx87fr+w2kVgZJQ2F2OGJaZDrnbGecyYtmn5Y=', 'tgt_type': 'glob', 'ret': '', 'jid': '20240408100443365920', 'kwargs': {'auth_list': [], 'user': 'root'}, 'user': 'root'}
[TRACE   ] Got event 20240408100443365920
[TRACE   ] Got event salt/job/20240408100443365920/new
[DEBUG   ] Closing AsyncReqChannel instance
[TRACE   ] Forwarding events
[DEBUG   ] Minion return retry timer set to 7 seconds (randomized)
[TRACE   ] ReqChannel send crypt load={'id': '0b771f21e5ee', 'cmd': '_minion_event', 'pretag': 'syndic/0b771f21e5ee', 'tok': ..., 'events': [{'data': {'minions': ['salt-minion'], '_stamp': '2024-04-08T10:04:43.421845'}, 'tag': '20240408100443365920'}, {'data': {'jid': '20240408100443365920', 'tgt_type': 'glob', 'tgt': '*', 'user': 'root', 'fun': 'test.ping', 'arg': [], 'minions': ['salt-minion'], 'missing': [], '_stamp': '2024-04-08T10:04:43.423589'}, 'tag': 'salt/job/20240408100443365920/new'}]}
[TRACE   ] Got event salt/job/20240408100443365920/ret/salt-minion
[DEBUG   ] The functions from module 'local_cache' are being loaded by dir() on the loaded module
[DEBUG   ] LazyLoaded local_cache.get_load
[DEBUG   ] Reading minion list from /var/cache/salt/master/jobs/3f/59e045a7eb165e4d8d2ad9acb5100cf9fe42985ba44e28b7486d13522a0135/.minions.p
[TRACE   ] Forwarding events
[DEBUG   ] Minion return retry timer set to 10 seconds (randomized)
[INFO    ] Returning information for job: 20240408100443365920
[TRACE   ] Depends decorator instantiated with dep list of ('non_existantmodulename',) and kwargs {}
[DEBUG   ] The functions from module 'test' are being loaded by dir() on the loaded module
[TRACE   ] Unloading test.missing_func because dependency (non_existantmodulename) is not met
[DEBUG   ] LazyLoaded test.ping
[TRACE   ] ReqChannel send crypt load={'cmd': '_syndic_return', 'load': [{'id': '0b771f21e5ee', 'jid': '20240408100443365920', 'fun': 'test.ping', 'arg': None, 'tgt': None, 'tgt_type': None, 'load': {'cmd': 'publish', 'tgt': '*', 'fun': 'test.ping', 'arg': [], 'tgt_type': 'glob', 'ret': '', 'jid': '20240408100443365920', 'kwargs': {'user': 'root'}, 'user': 'root', 'Minions': ['salt-minion']}, 'return': {'salt-minion': {'return': True, 'retcode': 0, 'success': True, 'fun_args': []}}}]}
[TRACE   ] ret_val = <salt.ext.tornado.concurrent.Future object at 0x7fbc7d7c5780>
[TRACE   ] Forwarding events
...

In v3007.0 the related log output of salt-syndic -l trace - with the patch from the gist applied - looks different:

...
[TRACE   ] Handling payload
[TRACE   ] ReqChannel send clear load={'cmd': 'publish', 'tgt': '*', 'fun': 'test.ping', 'arg': [], 'key': '+B9G54z+4lYhyRBBCs+ngGz45saV154y9qCzhffPZgKE1UxsC0ZeOp3/h+2azwPQhTKb6G/xLt4=', 'tgt_type': 'glob', 'ret': '', 'jid': '20240408112152050926', 'kwargs': {'auth_list': [], 'user': 'root'}, 'user': 'root'}
[DEBUG   ] PubClient connecting to <salt.transport.tcp.PublishClient object at 0x7f1c4d5bbd00> '/var/run/salt/master/master_event_pub.ipc'
[DEBUG   ] PubClient conencted to <salt.transport.tcp.PublishClient object at 0x7f1c4d5bbd00> '/var/run/salt/master/master_event_pub.ipc'
[DEBUG   ] Closing AsyncReqChannel instance
[TRACE   ] Forwarding events
...

Part of the difference may be caused by differences in the logging statements. Because the events on master - returned by salt-run state.event show that the related events are generated as needed. But the salt-syndic does not seem to see all the events as needed!?

20240408112152050926    {"minions": ["salt-minion"], "_stamp": "2024-04-08T11:21:52.132061"}
salt/job/20240408112152050926/new       {"jid": "20240408112152050926", "tgt_type": "glob", "tgt": "*", "user": "root", "fun": "test.ping", "arg": [], "minions": ["salt-minion"], "missing": [], "_stamp": "2024-04-08T11:21:52.134245"}
salt/job/20240408112152050926/publish   {"fun": "test.ping", "arg": [], "tgt": "*", "jid": "20240408112152050926", "ret": "", "tgt_type": "glob", "user": "root", "_stamp": "2024-04-08T11:21:52.137787"}
salt/job/20240408112152050926/ret/salt-minion   {"cmd": "_return", "id": "salt-minion", "success": true, "return": true, "retcode": 0, "jid": "20240408112152050926", "fun": "test.ping", "fun_args": [], "user": "root", "_stamp": "2024-04-08T11:21:52.313702"}
markschuh commented 7 months ago

Finally I could identify a quiet small overall patch which seems to restore the salt-syndic functionality within v3007.0 as needed. I have updated https://gist.github.com/markschuh/fab749f7d7940d1fc5eba0cb57b10885 to revision 5, which contains my latest patch.

Before I open this as PR I'll need to first do more tests as part of the patch touches code, that is used by salt-minion as well.

Edited: I have removed changes in salt/transport/tcp.py and salt/channel/client.py because the related ERRRORs do not block syndic functionality. This is handled by fixes for #66177 instead.