saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Install Salt from the Salt package repositories here:
https://docs.saltproject.io/salt/install-guide/en/latest/
Apache License 2.0

minion.restart module doesn't work when minion run via systemd #46255

Open dmarkwat opened 6 years ago

dmarkwat commented 6 years ago

Description of Issue/Question

The linked line is where the trip-up seems to happen. When the minion is started by systemd, the '-d' flag is not passed to the minion in sys.argv, so the else branch skips the restart on systemd-based machines. It does work on sysvinit-based setups (RHEL 6 in my case), where -d is passed to salt in sys.argv.

Setup

Bare CentOS/RHEL 7 setup with salt-minion installed via yum and started via systemctl should be sufficient.

Steps to Reproduce Issue

On master:

salt 'rhel7-minion' minion.restart

Output looks something like:

rhel7-minion:
----------
    comment:
        - Not running in daemon mode - will not restart process after killing
    killed:
        7347
    restart:
        ----------
    retcode:
        0

The comment line is traced back to this line here
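
The check described in this report can be illustrated with a simplified sketch. This is not the Salt source: `plan_restart` and the shape of its return value are hypothetical, and the only details taken from the report are the `-d` test against `sys.argv` and the comment string from the output above.

```python
import sys

NOT_DAEMON = "Not running in daemon mode - will not restart process after killing"


def plan_restart(argv):
    """Decide whether a restart can follow the kill, based only on argv.

    Hypothetical helper: mirrors the check described in this issue,
    not the actual Salt implementation.
    """
    result = {"comment": [], "restart": {}, "should_restart": True}
    if "-d" in argv:
        # sysvinit-style start: the process was launched with -d,
        # so the same command line can be reused to start it again.
        result["restart_cmd"] = list(argv)
    else:
        # systemd-style start: no -d on the command line, so the
        # restart step is skipped even though the kill still happens.
        result["should_restart"] = False
        result["comment"].append(NOT_DAEMON)
    return result


if __name__ == "__main__":
    print(plan_restart(sys.argv))
```

With an argv that lacks `-d`, as under systemd, the sketch reports the same comment seen in the output while leaving the restart step empty.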

Versions Report

Master

Salt Version:
           Salt: 2017.7.4

Dependency Versions:
           cffi: 1.6.0
       cherrypy: Not Installed
       dateutil: 1.5
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: 0.21.0
        libnacl: Not Installed
       M2Crypto: 0.21.1
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: 2.14
       pycrypto: 2.6.1
   pycryptodome: 3.4.3
         pygit2: 0.21.4
         Python: 2.7.5 (default, May  3 2017, 07:55:04)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: redhat 7.4 Maipo
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-693.11.6.el7.x86_64
         system: Linux
        version: Red Hat Enterprise Linux Server 7.4 Maipo

Minion

Salt Version:
           Salt: 2017.7.4

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 1.5
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: 0.21.1
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.7
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: 3.4.3
         pygit2: Not Installed
         Python: 2.7.5 (default, May  3 2017, 07:55:04)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: redhat 7.4 Maipo
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-693.11.6.el7.x86_64
         system: Linux
        version: Red Hat Enterprise Linux Server 7.4 Maipo

gtmanfred commented 6 years ago

Starting with 2017.7.3, with KillMode=process in the systemd service unit, you can use the service.restart function instead.

Systemd is much stricter about this: it tracks the cgroup, and it does not recommend running daemons with Type=forking, since systemd handles all the forking.

srchulo commented 5 years ago

I'm seeing a similar issue with restart, but also with stop. I see this:

systemd[1]: Stopping My Mojolicious application workers

But the worker doesn't seem to stop unless I kill it. I do have KillMode=process in my file.

wolfpackmars2 commented 5 years ago

It should warn that the process will not be restarted, without killing the process.

Will try the service.restart method. salt.modules.service is under-documented.

https://docs.saltstack.com/en/latest/ref/modules/all/salt.modules.service.html

# salt 's*' minion.restart
spartanapp:
    ----------
    comment:
        - Not running in daemon mode - will not restart process after killing
    killed:
        22641
    restart:
        ----------
    retcode:
        0

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

cruscio commented 4 years ago

@Ch3LL, could this be reopened?

I just ran minion.restart on Ubuntu 18.04 with Salt 3000, and all my minions shut down with "Not running in daemon mode - will not restart process after killing".

There may be an available workaround/alternative, but I'd submit that this is still a bug, and harmful behavior that should be fixed (or at least prevented).

ITJamie commented 2 years ago

Salt knows it cannot restart the minion process. In that case it should not stop the minion; it should return a failure instead, rather than killing the process AND returning a failure.
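
The behavior requested here could look roughly like the sketch below. It is only an illustration of the proposal, not Salt code: `restart_or_fail`, its parameters, and the failure comment text are all hypothetical.

```python
def restart_or_fail(can_restart, kill):
    """Kill the minion only when a restart is known to be possible.

    `can_restart` is a bool and `kill` is a callable returning the killed
    PID; both are hypothetical stand-ins used for illustration.
    """
    if not can_restart:
        # Refuse up front: leave the process running and report failure.
        return {
            "killed": None,
            "retcode": 1,
            "comment": ["Cannot restart minion - process was not killed"],
        }
    # A restart is possible, so killing the process is safe.
    return {"killed": kill(), "retcode": 0, "comment": []}
```

The point of the design is ordering: the restartability check runs before the kill, so a non-zero retcode never coincides with a dead minion.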

gvfnix commented 2 years ago

Works for me with the following conditions:

CrackerJackMack commented 9 months ago

The issue is present in 3006.6 with a basic bootstrap install (onedir with systemd).

master

Salt Version:
          Salt: 3006.6

Python Version:
        Python: 3.10.13 (main, Nov 15 2023, 04:34:27) [GCC 11.2.0]

Dependency Versions:
          cffi: 1.14.6
      cherrypy: unknown
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.3
       libgit2: Not Installed
  looseversion: 1.0.2
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 22.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.19.1
        pygit2: Not Installed
  python-gnupg: 0.4.8
        PyYAML: 6.0.1
         PyZMQ: 23.2.0
        relenv: 0.14.2
         smmap: Not Installed
       timelib: 0.2.4
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: ubuntu 20.04.5 focal
        locale: utf-8
       machine: x86_64
       release: 5.4.0-132-generic
        system: Linux
       version: Ubuntu 20.04.5 focal

minion

Salt Version:
          Salt: 3006.6

Python Version:
        Python: 3.10.13 (main, Nov 15 2023, 04:34:27) [GCC 11.2.0]

Dependency Versions:
          cffi: 1.14.6
      cherrypy: 18.6.1
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.3
       libgit2: Not Installed
  looseversion: 1.0.2
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 22.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.19.1
        pygit2: Not Installed
  python-gnupg: 0.4.8
        PyYAML: 6.0.1
         PyZMQ: 23.2.0
        relenv: 0.14.2
         smmap: Not Installed
       timelib: 0.2.4
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: ubuntu 22.04.1 jammy
        locale: utf-8
       machine: x86_64
       release: 5.15.0-53-generic
        system: Linux
       version: Ubuntu 22.04.1 jammy

CrackerJackMack commented 9 months ago

Adding minion_restart_command kind of worked, but not entirely smoothly.

minion setup prior to test

root@node-0:~# cat /etc/salt/minion.d/restart.conf
minion_restart_command: systemctl restart salt-minion
root@node-0:~# date -u
Sun Feb  4 10:13:48 PM UTC 2024
root@node-0:~# systemctl restart salt-minion
root@node-0:~# systemctl status salt-minion
● salt-minion.service - The Salt Minion
     Loaded: loaded (/lib/systemd/system/salt-minion.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2024-02-04 22:13:53 UTC; 5s ago
       Docs: man:salt-minion(1)
             file:///usr/share/doc/salt/html/contents.html
             https://docs.saltproject.io/en/latest/contents.html
   Main PID: 15065 (python3.10)
      Tasks: 7 (limit: 2160)
     Memory: 57.2M
        CPU: 417ms
     CGroup: /system.slice/salt-minion.service
             ├─15065 /opt/saltstack/salt/bin/python3.10 /usr/bin/salt-minion
             └─15073 "/opt/saltstack/salt/bin/python3.10 /usr/bin/salt-minion MultiMinionProcessManager MinionProcessManager"

Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Setting up the Salt Minion "node-0"
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Starting up the Salt Minion
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Starting pull socket on /var/run/salt/minion/minion_event_7c6cc41e6b_pull.ipc
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Creating minion process manager
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Executing command date in directory '/root'
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Updating job settings for scheduled job: __mine_interval
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Added mine.update to scheduler
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Minion is starting as user 'root'
Feb 04 22:13:53 node-0 salt-minion[15073]: [INFO    ] Minion is ready to receive requests!
Feb 04 22:13:54 node-0 salt-minion[15073]: [INFO    ] Running scheduled job: __mine_interval with jid 20240204221354956795

master

root@salt-master:~# date -u
Sun 04 Feb 2024 10:14:31 PM UTC
root@salt-master:~# salt node-0 test.ping
node-0:
    True
root@salt-master:~# salt node-0 minion.restart
node-0:
    Minion did not return. [No response]
    The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:

    salt-run jobs.lookup_jid 20240204221438860367
ERROR: Minions returned with non-zero exit code
root@salt-master:~# date -u
Sun 04 Feb 2024 10:14:55 PM UTC
root@salt-master:~# salt node-0 test.ping
node-0:
    True
root@salt-master:~# date -u
Sun 04 Feb 2024 10:15:00 PM UTC

minion systemd status

root@node-0:~# date -u
Sun Feb  4 10:15:05 PM UTC 2024
root@node-0:~# systemctl status salt-minion
● salt-minion.service - The Salt Minion
     Loaded: loaded (/lib/systemd/system/salt-minion.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2024-02-04 22:14:41 UTC; 25s ago
       Docs: man:salt-minion(1)
             file:///usr/share/doc/salt/html/contents.html
             https://docs.saltproject.io/en/latest/contents.html
   Main PID: 15150 (python3.10)
      Tasks: 7 (limit: 2160)
     Memory: 57.3M
        CPU: 502ms
     CGroup: /system.slice/salt-minion.service
             ├─15150 /opt/saltstack/salt/bin/python3.10 /usr/bin/salt-minion
             └─15159 "/opt/saltstack/salt/bin/python3.10 /usr/bin/salt-minion MultiMinionProcessManager MinionProcessManager"

Feb 04 22:14:42 node-0 salt-minion[15159]: [INFO    ] Added mine.update to scheduler
Feb 04 22:14:42 node-0 salt-minion[15159]: [INFO    ] Minion is starting as user 'root'
Feb 04 22:14:42 node-0 salt-minion[15159]: [INFO    ] Minion is ready to receive requests!
Feb 04 22:14:43 node-0 salt-minion[15159]: [INFO    ] Running scheduled job: __mine_interval with jid 20240204221443518611
Feb 04 22:14:43 node-0 salt-minion[15159]: [INFO    ] User sudo_vagrant Executing command saltutil.find_job with jid 20240204221443970560
Feb 04 22:14:44 node-0 salt-minion[15229]: [INFO    ] Starting a new job 20240204221443970560 with PID 15229
Feb 04 22:14:44 node-0 salt-minion[15229]: [INFO    ] Returning information for job: 20240204221443970560
Feb 04 22:14:58 node-0 salt-minion[15159]: [INFO    ] User sudo_vagrant Executing command test.ping with jid 20240204221458868531
Feb 04 22:14:58 node-0 salt-minion[15232]: [INFO    ] Starting a new job 20240204221458868531 with PID 15232
Feb 04 22:14:58 node-0 salt-minion[15232]: [INFO    ] Returning information for job: 20240204221458868531

CrackerJackMack commented 9 months ago

An update on my findings. This is the safest and most reliable way I've found to restart a Linux minion. Using systemd-run --scope creates a new parent cgroup outside of init. This is similar to what is already implemented in systemd_service, but for some reason service.restart salt-minion never results in the minion being restarted for me. minion.restart works with this:

minion_restart_command: systemd-run --scope systemctl restart salt-minion
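
As a small illustration of how that command is composed, the sketch below prefixes an arbitrary restart command with `systemd-run --scope`. The helper name `wrap_in_scope` is hypothetical and not part of Salt; it only builds the argument list and does not run anything.

```python
import shlex


def wrap_in_scope(command):
    """Return argv for `command` prefixed with `systemd-run --scope`.

    Hypothetical helper: it only composes the command line quoted in
    the comment above and does not execute it.
    """
    return ["systemd-run", "--scope"] + shlex.split(command)


if __name__ == "__main__":
    argv = wrap_in_scope("systemctl restart salt-minion")
    # Join back into a single string, e.g. for a config value.
    print(shlex.join(argv))
```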

tjyang commented 1 week ago

Until this issue gets assigned and fixed, I am using the atd approach shared by Max Arnold. See details at https://salt.tips/upgrading-salt-to-python-3/

tjyang commented 12 hours ago

An update from testing a 3006.9 minion and master using onedir packages: minion.restart did restart salt-minion, and the minion can run "salt-call test.ping" against the master. But the master can't test.ping the minion until the minion is restarted again with "systemctl restart salt-minion".