saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Install Salt from the Salt package repositories here:
https://docs.saltproject.io/salt/install-guide/en/latest/
Apache License 2.0

[BUG] Is auth_safemode working? I don't see any restart #66367

Open amalaguti opened 7 months ago

amalaguti commented 7 months ago

Description: I am trying to see whether the minion is restarted when auth_safemode: True is set, but I don't see any indication of a minion restart. The minion (Windows, 3006.1) is configured with:

ping_interval: 1
auth_safemode: True
master: 
  - 172.21.0.10
  - 172.21.0.11

I disconnect the first master and wait for the ping_interval to elapse, but I don't see any service.restart of the salt-minion Windows service.

https://github.com/saltstack/salt/blob/master/salt/minion.py#L3181

        # schedule the stuff that runs every interval
        ping_interval = self.opts.get("ping_interval", 0) * 60
        if ping_interval > 0 and self.connected:

            def ping_master():
                try:

                    def ping_timeout_handler(*_):
                        if self.opts.get("auth_safemode", False):
                            log.error(
                                "** Master Ping failed. Attempting to restart minion**"
                            )
                            delay = self.opts.get("random_reauth_delay", 5)
                            log.info("delaying random_reauth_delay %ss", delay)
                            try:
                                self.functions["service.restart"](service_name())
                            except KeyError:
                                # Probably no init system (running in docker?)
                                log.warning(
                                    "ping_interval reached without response "
                                    "from the master, but service.restart "
                                    "could not be run to restart the minion "
                                    "daemon. ping_interval requires that the "
                                    "minion is running under an init system."
                                )

                    self._fire_master(
                        "ping",
                        "minion_ping",
                        sync=False,
                        timeout_handler=ping_timeout_handler,
                    )
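
To make the branches in the excerpt above easier to reason about, here is a minimal sketch that models the same decisions in isolation. It is not Salt's code: the function names `ping_interval_seconds`, `should_schedule_ping`, and `on_ping_timeout`, the returned status strings, and the `service_name` default are all hypothetical, and the real handler is only invoked by `_fire_master` when the request times out.

```python
# Simplified, hypothetical model of the excerpt above; this is not Salt's code.


def ping_interval_seconds(opts):
    """ping_interval is given in minutes; the excerpt multiplies it by 60."""
    return opts.get("ping_interval", 0) * 60


def should_schedule_ping(opts, connected):
    """Mirror the guard: a positive interval and a connected minion."""
    return ping_interval_seconds(opts) > 0 and connected


def on_ping_timeout(opts, functions, service_name="salt-minion"):
    """Mirror the handler's branches and report which one was taken."""
    if not opts.get("auth_safemode", False):
        return "ignored"  # handler does nothing without auth_safemode
    try:
        functions["service.restart"](service_name)
    except KeyError:
        # Same case the excerpt logs a warning for: service.restart is missing.
        return "restart-unavailable"
    return "restart-requested"


opts = {"ping_interval": 1, "auth_safemode": True}
restarted = []  # records which service names a restart was requested for
functions = {"service.restart": restarted.append}

print(ping_interval_seconds(opts))
print(should_schedule_ping(opts, connected=True))
print(on_ping_timeout(opts, functions))
print(on_ping_timeout(opts, {}))
```

Under these assumptions, `ping_interval: 1` corresponds to a 60-second interval, and a restart is only requested if the timeout handler is actually called while `auth_safemode` is enabled. Whether the handler is reached in the scenario reported here is exactly what this issue asks.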

The following exception is logged. I'm not sure whether it is directly related to ping_interval/auth_safemode or whether the minion is just complaining that one of the masters is unreachable:

2024-04-15 21:43:43,957 [salt.channel.client                                                      :32  ][TRACE   ][96] Failed to send msg SaltReqTimeoutError('Message timed out')
2024-04-15 21:43:43,957 [tornado.application                                                      :353 ][ERROR   ][96] Future <salt.ext.tornado.concurrent.Future object at 0x00000210D8CC5C30> exception was never retrieved: Traceback (most recent call last):
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1064, in run
    yielded = self.gen.throw(*exc_info)
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\minion.py", line 2700, in handle_event
    yield _minion.req_channel.send(
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1056, in run
    value = future.result()
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\concurrent.py", line 249, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1064, in run
    yielded = self.gen.throw(*exc_info)
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\channel\client.py", line 295, in send
    ret = yield self._crypted_transfer(load, timeout=timeout, raw=raw)
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1056, in run
    value = future.result()
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\concurrent.py", line 249, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1064, in run
    yielded = self.gen.throw(*exc_info)
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\channel\client.py", line 252, in _crypted_transfer
    ret = yield _do_transfer()
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1056, in run
    value = future.result()
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\concurrent.py", line 249, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1064, in run
    yielded = self.gen.throw(*exc_info)
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\channel\client.py", line 233, in _do_transfer
    data = yield self.transport.send(
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1056, in run
    value = future.result()
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\concurrent.py", line 249, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1064, in run
    yielded = self.gen.throw(*exc_info)
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\transport\zeromq.py", line 916, in send
    ret = yield self.message_client.send(load, timeout=timeout)
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1056, in run
    value = future.result()
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\concurrent.py", line 249, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1064, in run
    yielded = self.gen.throw(*exc_info)
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\transport\zeromq.py", line 626, in send
    recv = yield future
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\gen.py", line 1056, in run
    value = future.result()
  File "C:\Program Files\Salt Project\Salt\lib\site-packages\salt\ext\tornado\concurrent.py", line 249, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
salt.exceptions.SaltReqTimeoutError: Message timed out

amalaguti commented 7 months ago

@dwoz in case you take a look at this, I think this commit is related: https://github.com/saltstack/salt/commit/5edd2259d8e1b0a708d63529958b12c255672920