saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:
https://repo.saltproject.io/
Apache License 2.0

[BUG] [3007.0] Using IPv6 causes TCP PublishServer to crash #66567

Open AppCrashExpress opened 1 month ago

AppCrashExpress commented 1 month ago

Description

Hello!

We are currently trying to update Saltstack to 3007.0.

Both our masters and minions are configured to use IPv6, but this causes a crash in the PublishServer class of the TCP transport when the master initializes.

This issue was reproduced on commit 31c9d0df191009207c72ea73abfd3a1e3a0e6425.

Setup

This is a shortened configuration, since we use a patched custom installation and I'm not sure what I'm allowed to show, but it should be sufficient given the nature of the issue:

log_level: INFO
log_level_logfile: INFO

interface: '::'
ipv6: true
auto_accept: true
transport: tcp

Should you find it insufficient, please let me know.

Steps to Reproduce the behavior

Simply start the salt-master daemon with IPv6 enabled. After a while, a message like the following appears in the logs:

2024-05-18 10:38:12,208 [tornado.application:279 ][ERROR   ][59555] Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOLoop object at 0x7f5118807ce0>>, <Task finished name='Task-1' coro=<PublishServer.publisher() done, defined at /opt/salt/transport/tcp.py:1384> exception=gaierror(-2, 'Name or service not known')>)
Traceback (most recent call last):
  File "tornado/tornado-6/tornado/ioloop.py", line 750, in _run_callback
    ret = callback()
          ^^^^^^^^^^
  File "tornado/tornado-6/tornado/ioloop.py", line 774, in _discard_future_result
    future.result()
  File "/opt/salt/transport/tcp.py", line 1420, in publisher
    sock.bind((self.pub_host, self.pub_port))
socket.gaierror: [Errno -2] Name or service not known

The started publisher process (PID 59555 in this case) remains as a zombie process until the master is restarted.

Expected behavior

Publisher should start correctly.

Screenshots

Probably inapplicable.

Versions Report

salt --versions-report (Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)

```yaml
Salt Version:
  Salt: 3007.0

Python Version:
  Python: 3.12.3 (main, May 13 2024, 10:19:24) [Clang 16.0.6 ]

Dependency Versions:
  cffi: 1.16.0
  cherrypy: Not Installed
  dateutil: 2.9.0.post0
  docker-py: 7.0.0
  gitdb: Not Installed
  gitpython: Not Installed
  Jinja2: 3.1.4
  libgit2: Not Installed
  looseversion: 1.3.0
  M2Crypto: 0.38.0
  Mako: 1.3.3
  msgpack: 1.0.8
  msgpack-pure: Not Installed
  mysql-python: 1.4.6
  packaging: 21.3
  pycparser: 2.22
  pycrypto: Not Installed
  pycryptodome: Not Installed
  pygit2: Not Installed
  python-gnupg: 0.5.2
  PyYAML: 5.4.1
  PyZMQ: 25.1.2
  relenv: Not Installed
  smmap: 5.0.1
  timelib: 0.3.0
  Tornado: 6.4
  ZMQ: 4.1.2

Salt Package Information:
  Package Type: Not Installed

System Versions:
  dist: ubuntu 22.04 jammy
  locale: utf-8
  machine: x86_64
  release: 5.15.0-30-generic
  system: Linux
  version: Ubuntu 22.04 jammy
```

Additional context

The CLI part of the master, as seen here: https://github.com/saltstack/salt/blob/v3007.0/salt/cli/daemons.py#L198, wraps the configured IPv6 interface in brackets. With the configuration above, it takes the interface, ::, and turns it into [::].

This, if I am not mistaken, bubbles up the stack to self.pub_host in PublishServer. The publisher then tries to bind to the bracketed address, which getaddrinfo does not accept, so the publisher process crashes with gaierror.

The easiest fix is to pass self.pub_host through salt.utils.network.ip_bracket with strip=True, which is what we did:
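The failure can be reproduced outside Salt with the standard library alone. The sketch below is not Salt code: `resolves` is a hypothetical helper, port 4505 is only an illustrative value, and `AI_NUMERICHOST` is passed so that no DNS lookup is attempted (the bind in the traceback does not set that flag, so the exact error code may differ).

```python
import socket


def resolves(host, port=4505):
    """Return True if getaddrinfo accepts `host` as a numeric IPv6 address.

    `resolves` is a hypothetical helper for illustration only.
    AI_NUMERICHOST keeps the check local: no DNS query is made.
    """
    try:
        socket.getaddrinfo(
            host,
            port,
            socket.AF_INET6,
            socket.SOCK_STREAM,
            0,
            socket.AI_NUMERICHOST,
        )
        return True
    except socket.gaierror:
        return False


# The bare literal is a valid IPv6 address; the bracketed form is URI
# syntax and is rejected by the resolver.
print(resolves("::"))
print(resolves("[::]"))
```

Brackets are a convention for separating an IPv6 address from a port inside a URI or host:port string; the socket layer expects the bare address.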

-            sock.bind((self.pub_host, self.pub_port))
+            sock.bind((ip_bracket(self.pub_host, strip=True), self.pub_port))
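For readers without a Salt checkout at hand, the effect of the strip step can be sketched in a few lines. This is a simplified stand-in, not the implementation of salt.utils.network.ip_bracket: `strip_brackets` and `bind_address` are hypothetical names, and port 4505 is only an example value.

```python
def strip_brackets(host: str) -> str:
    """Remove one pair of enclosing square brackets, if present.

    Hypothetical stand-in for the strip=True behaviour used in the patch.
    """
    if len(host) >= 2 and host[0] == "[" and host[-1] == "]":
        return host[1:-1]
    return host


def bind_address(host: str, port: int) -> tuple:
    """Build the (host, port) tuple in the form socket.bind() expects."""
    return (strip_brackets(host), port)


print(bind_address("[::]", 4505))       # ('::', 4505)
print(bind_address("::", 4505))         # ('::', 4505)
print(bind_address("127.0.0.1", 4505))  # ('127.0.0.1', 4505)
```

Stripping is a no-op for addresses that were never bracketed, so the same call is safe for IPv4 and for unbracketed IPv6 values.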

welcome[bot] commented 1 month ago

Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey. Please be sure to review our Code of Conduct. Also, check out some of our community resources including:

There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar. If you have additional questions, email us at saltproject@vmware.com. We’re glad you’ve joined our community and look forward to doing awesome things with you!