Open smarsching opened 1 month ago
duplicate of #66567?
@dwoz That bug might be related, but I do not think that it is the same bug. #66567 involves setting `interface` to `::` and setting `transport` to `tcp`, while this bug happens when `interface` and `transport` are not set explicitly.

The error message is also slightly different: for #66567, it is `gaierror(-2, 'Name or service not known')`, while for this bug it is `gaierror(11001, 'getaddrinfo failed')`.
If I understand the code in `ipc_publish_server()` correctly, #66567 should not be triggered when creating the IPC service (the `interface` option is not used for that and the address `127.0.0.1` is hard-coded instead). I believe that #66567 rather happens when `salt.transport.base.publish_server()` is called and `transport` is `tcp`.
Description
There is a regression in Salt 3007 (both 3007.0 and 3007.1) that causes the minion to not start correctly when IPv6 is enabled (`ipv6: true` is set in the options).

Setup
Steps to Reproduce the behavior
Add a configuration file `C:\ProgramData\Salt Project\Salt\conf\minion.d\minion-custom-config.conf` with the following line:

Then restart the Salt minion. Observe the following error message in the minion log (and that the minion is not reachable from the master):
Expected behavior
The minion should start without error, as it does when using version 3006.8 with the same configuration.
Versions Report
salt --versions-report
(Provided by running `salt --versions-report`. Please also mention any differences in master/minion versions.)

```yaml
Salt Version:
  Salt: 3007.1

Python Version:
  Python: 3.10.14 (heads/main:c1ec015, Apr 3 2024, 21:36:37) [MSC v.1938 64 bit (AMD64)]

Dependency Versions:
  cffi: 1.16.0
  cherrypy: 18.8.0
  dateutil: 2.8.2
  docker-py: Not Installed
  gitdb: 4.0.10
  gitpython: Not Installed
  Jinja2: 3.1.4
  libgit2: Not Installed
  looseversion: 1.3.0
  M2Crypto: Not Installed
  Mako: Not Installed
  msgpack: 1.0.7
  msgpack-pure: Not Installed
  mysql-python: Not Installed
  packaging: 23.1
  pycparser: 2.21
  pycrypto: Not Installed
  pycryptodome: 3.19.1
  pygit2: Not Installed
  python-gnupg: 0.5.2
  PyYAML: 6.0.1
  PyZMQ: 25.1.2
  relenv: 0.16.0
  smmap: 5.0.1
  timelib: 0.3.0
  Tornado: 6.3.3
  ZMQ: 4.3.4

Salt Package Information:
  Package Type: onedir

System Versions:
  dist:
  locale: utf-8
  machine: AMD64
  release: 2022Server
  system: Windows
  version: 2022Server 10.0.20348 SP0 Multiprocessor Free
```

Additional context
By adding some debugging code, I found out that the problem is caused by trying to bind a socket using the `AF_INET6` address family to the IP address `127.0.0.1`. Windows does not allow binding an IPv6 socket to IPv4 addresses.

@dwoz, @garethgreenaway I suspect that this bug was introduced in 6320f769ea8, which you authored:
Before, the IPC publisher was created in `salt.minion.MinionManager._bind()` by calling `salt.utils.event.AsyncEventPublisher()`, which delegated to `salt.transport.ipc.IPCMessagePublisher`.

Now, it is created by calling `salt.transport.ipc_publish_server()`, which indirectly delegates to `salt.transport.tcp.PublishServer`.
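The address-family mismatch described in this report can be probed outside of Salt with the standard library alone. The following is a minimal sketch, not Salt code: `try_resolve` is a hypothetical helper, and the assumption (based on the `gaierror` quoted in this thread) is that the failure surfaces while resolving `127.0.0.1` under `AF_INET6`. The outcome for `AF_INET6` depends on the platform, so no expected output is given.

```python
import socket


def try_resolve(host, family):
    """Hypothetical helper: report whether `host` resolves under `family`.

    Returns (True, None) on success and (False, error) if getaddrinfo
    raises socket.gaierror.
    """
    try:
        socket.getaddrinfo(host, 0, family, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, exc
    return True, None


if __name__ == "__main__":
    # Compare the two address families for the hard-coded IPC address.
    for family in (socket.AF_INET, socket.AF_INET6):
        ok, err = try_resolve("127.0.0.1", family)
        print(family.name, "ok" if ok else f"failed: {err!r}")
```

Running this on the affected Windows host and on a Linux host should make it easy to compare the error codes with the ones quoted for this bug and for #66567.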
The code in `salt.transport.ipc.IPCMessagePublisher` always creates a socket using `AF_INET`, while the code in `salt.transport.tcp.PublishServer` uses `AF_INET6` when `opts['ipv6']` is `True`. On Windows, however, a socket created with `AF_INET6` cannot be bound to an IPv4 address.

I can see three possible fixes for this:
1. Change `salt.minion.MinionManager._bind()` back to using `salt.utils.event.AsyncEventPublisher()`.
2. Change `salt.transport.tcp.PublishServer` to accept an additional flag that enforces `AF_INET`, even if `opts['ipv6']` is set, and pass this flag from `salt.transport.base.ipc_publish_server()`.
3. Change `ipc_publish_server()` to use `::1` instead of `127.0.0.1` when `opts['ipv6']` is set. However, it might be necessary to also change this in other places (where the client socket that connects to this server is created).

The first fix looks like the simplest one, but I assume that you did this refactoring for a good reason, so it might not be desirable.
The second one makes the code somewhat more complex, while the third one is pretty straightforward but might necessitate changes in other places.
As I do not know the reasoning behind the refactoring, I cannot assess which of the three options is the most reasonable one, so I would appreciate your input.
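To make the third option more concrete, here is a rough sketch of the idea: pick the loopback address that matches the address family implied by `opts['ipv6']`. Both `ipc_loopback` and `address_matches_family` are hypothetical names used for illustration only; they are not part of Salt, and the real change would have to be made wherever the IPC server and its clients choose their address.

```python
import socket


def ipc_loopback(opts):
    """Hypothetical helper: choose a loopback address matching the family.

    Uses ::1 with AF_INET6 when opts['ipv6'] is set, and 127.0.0.1 with
    AF_INET otherwise.
    """
    if opts.get("ipv6"):
        return socket.AF_INET6, "::1"
    return socket.AF_INET, "127.0.0.1"


def address_matches_family(family, address):
    """Return True if `address` is a valid literal for `family`."""
    try:
        socket.inet_pton(family, address)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    for opts in ({"ipv6": True}, {"ipv6": False}, {}):
        family, address = ipc_loopback(opts)
        print(opts, family.name, address, address_matches_family(family, address))
```

The same helper would need to be used on the client side as well, which is the "changes in other places" caveat mentioned above.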