saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Install Salt from the Salt package repositories here:
https://docs.saltproject.io/salt/install-guide/en/latest/
Apache License 2.0
14.19k stars 5.48k forks source link

[BUG] refreshing beacons leaks inotify handles #58907

Open golmaal opened 4 years ago

golmaal commented 4 years ago

Description Setup an inotify beacon and refresh beacons. Every time beacons are refreshed inotify file handles are leaked.

Setup Start a CentOS:8.1.1911 container

curl -o bootstrap-salt.sh -L https://bootstrap.saltstack.com
./bootstrap-salt.sh -P -M git v3002.1
pip3 install pyinotify pytz
salt-call beacons.add inotify '{"files":["/tmp"],"mask":"create"}'

Steps to Reproduce the behavior

salt-call state.highstate

OR

salt-call saltutil.refresh_beacons 

Check open file handles using lsof -

lsof | grep -E 'salt-mi.*inotify'

Expected behavior A constant number of inotify handles should remain open.

[root@test2 salt]# lsof | grep -E 'salt-mi.*inotify'
salt-mini 2172                root   27r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2173 salt-mini root   27r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2206 salt-mini root   27r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2207 salt-mini root   27r  a_inode               0,14        0      13151 notify

However, after refreshing beacons, it keeps on increasing -

[root@test2 salt]# salt-call saltutil.refresh_beacons
local:
    True
[root@test2 salt]# lsof | grep -E 'salt-mi.*inotify'
salt-mini 2172                root   27r  a_inode               0,14        0      13151 inotify
salt-mini 2172                root   28r  a_inode               0,14        0      13151 inotify
salt-mini 2172                root   29r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2173 salt-mini root   27r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2173 salt-mini root   28r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2173 salt-mini root   29r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2206 salt-mini root   27r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2206 salt-mini root   28r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2206 salt-mini root   29r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2207 salt-mini root   27r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2207 salt-mini root   28r  a_inode               0,14        0      13151 inotify
salt-mini 2172 2207 salt-mini root   29r  a_inode               0,14        0      13151 inotify
[root@test2 salt]# salt --versions-report
Salt Version:
           Salt: 3002.1

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.6.1
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
         Jinja2: 2.11.2
        libgit2: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 1.0.0
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: Not Installed
   pycryptodome: 3.9.9
         pygit2: Not Installed
         Python: 3.6.8 (default, Apr 16 2020, 01:36:27)
   python-gnupg: Not Installed
         PyYAML: 5.3.1
          PyZMQ: 19.0.2
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.3.2

System Versions:
           dist: centos 8 Core
         locale: UTF-8
        machine: x86_64
        release: 4.18.0-193.19.1.el8_2.x86_64
         system: Linux
        version: CentOS Linux 8 Core
garethgreenaway commented 3 years ago

@golmaal Thanks for the report.

cheburazavrik commented 2 years ago

Hello, I got this after upgrading from 3001.8 to 3004. The problem happens when I have configured beacon and run the highstate.

salt-call -V
Salt Version:
          Salt: 3004

Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.5.3
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.9.4
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: Not Installed
  pycryptodome: 3.6.1
        pygit2: Not Installed
        Python: 3.5.3 (default, Sep 27 2018, 17:25:39)
  python-gnupg: Not Installed
        PyYAML: 3.12
         PyZMQ: 17.1.2
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.2.1

System Versions:
          dist: debian 9 stretch
        locale: UTF-8
       machine: x86_64
       release: 4.9.0-12-amd64
        system: Linux
       version: Debian GNU/Linux 9 stretch

My beacon configuration:

beacons:
    inotify:
        - files:
             /etc/shadow:
                 mask:
                   - modify
        - interval: 300
        - coalesce: True 
eliasp commented 2 years ago

This seems to be a bug of pyinotify: https://github.com/seb-m/pyinotify/issues/192

The original project seems to be unmaintained since ~2015 and while there's a little better maintained fork of PyInotify available, pypi distributes the unmaintained original and at least Ubuntu's package is also based on this one.

I don't really know, whether the fork dsoprea/PyInotify fixes the mentioned issue, so this would have to be investigated/tested.

rolszewo commented 2 years ago

Having the same problem with Salt 3002 on Ubuntu:

salt-call -V
Salt Version:
          Salt: 3002.8

Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.7.3
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.10.1
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: 2.6.1
  pycryptodome: 3.6.1
        pygit2: Not Installed
        Python: 3.8.10 (default, Mar 15 2022, 12:22:08)
  python-gnupg: 0.4.5
        PyYAML: 5.3.1
         PyZMQ: 18.1.1
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.2

System Versions:
          dist: ubuntu 20.04 focal
        locale: utf-8
       machine: x86_64
       release: 5.4.0-104-generic
        system: Linux
       version: Ubuntu 20.04 focal

Configured a beacon like this:

beacons:
  file_renew:
    - beacon_module: inotify
    - disable_during_state_run: True
    - interval: 5
    - files:
        /apps/test/file1:
          mask:
            - close_write
          recurse: False
          auto_add: False
          coalesce: True
        /apps/test/file2:
          mask:
            - close_write
          recurse: False
          auto_add: False
          coalesce: True

After restart of minion i have 1 inotify-handle. Then, over time, this inotify-handles increase.

After 1 day the minion has about 128 inotify-handles - which is the default-maxium of max_user_instances:

cat /proc/sys/fs/inotify/max_user_instances
128

for i in /proc/*/fd/*; do readlink -f $i; done | grep inotify | sort | uniq -c | sort -nr
    128 /proc/3521660/fd/anon_inode:inotify
      3 /proc/3612872/fd/anon_inode:inotify
      3 /proc/1/fd/anon_inode:inotify
      2 /proc/884/fd/anon_inode:inotify
      2 /proc/866/fd/anon_inode:inotify
      1 /proc/916/fd/anon_inode:inotify
      1 /proc/880/fd/anon_inode:inotify
      1 /proc/853/fd/anon_inode:inotify
      1 /proc/848/fd/anon_inode:inotify
      1 /proc/2824542/fd/anon_inode:inotify
      1 /proc/2824437/fd/anon_inode:inotify
      1 /proc/2824294/fd/anon_inode:inotify

Then this messages appear in the logs until i restart the salt-minion:

May 16 03:45:58 minionid salt-minion[2069191]: [CRITICAL] The beacon errored:
May 16 03:45:58 minionid salt-minion[2069191]: Traceback (most recent call last):
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/salt/minion.py", line 2878, in handle_beacons
May 16 03:45:58 minionid salt-minion[2069191]:     beacons = self.process_beacons(self.functions)
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/salt/minion.py", line 515, in process_beacons
May 16 03:45:58 minionid salt-minion[2069191]:     return self.beacons.process(
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/salt/beacons/__init__.py", line 130, in process
May 16 03:45:58 minionid salt-minion[2069191]:     raw = self.beacons[fun_str](b_config[mod])
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/salt/beacons/inotify.py", line 270, in beacon
May 16 03:45:58 minionid salt-minion[2069191]:     notifier = _get_notifier(_config)
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/salt/beacons/inotify.py", line 78, in _get_notifier
May 16 03:45:58 minionid salt-minion[2069191]:     wm = pyinotify.WatchManager()
May 16 03:45:58 minionid salt-minion[2069191]:   File "/usr/lib/python3/dist-packages/pyinotify.py", line 1764, in __init__
May 16 03:45:58 minionid salt-minion[2069191]:     raise OSError(err % self._inotify_wrapper.str_errno())
May 16 03:45:58 minionid salt-minion[2069191]: OSError: Cannot initialize new instance of inotify, Errno=Too many open files (EMFILE)

installed pyinotify is:

dpkg -l|grep pyinotify
ii  python-pyinotify                      0.9.6-1.2ubuntu1                   all          simple Linux inotify Python bindings
ii  python3-pyinotify                     0.9.6-1.2ubuntu1                   all          simple Linux inotify Python bindings
mrj1m0thy commented 2 years ago

Any updates on this issue? This issue makes beacons effectively unusable. Once the open files is maxed out the server will constantly throw errors and services start to fail. I've already had multiple salt-minion services crash because of this issue causing our operations team some sleepless nights.

cheburazavrik commented 2 years ago

Any updates on this issue? At this point, I had to find ugly and quick workarounds to replace beacons. Having this fixed, or at least getting a confirmation that this will not get fixed, I can invest resources to properly replace beacons.

BT-dschleich commented 1 year ago

Any updates yet on this issue? This is still hitting us and we need to workaround the ever growing inotify handle needs of the salt minion

QuinoRay commented 6 months ago

Are there any updates on this issue? Using Salt version 3006.7 and python version 3.10.13 the problem persists. The solution so far is to add a task to the cron job to restart the salt-minion service, but I don't think this is a good solution.