Closed Ch3LL closed 5 years ago
I manually tested this on CentOS 7 to see whether we were seeing the same thing even though our automated tests were passing. What I found after upgrading pycryptodomex from 3.4 to 3.6 was that there were now 3 salt-api processes showing up, but none of them ever becomes a defunct process. salt-api still continues to work. Also, when I try to restart the service it correctly kills the processes. I don't need to manually kill the service and can use the normal service manager to restart.
The problem also highlights itself if pycryptodomex is removed. A dangling child process of the original salt-api process is left behind. The same is happening with Salt 2018.3.1 on both Centos 7 & 6. Salt-api is going to use the pycryptodomex version still in memory, even after yum erases it, and hence will not start using the basic pycrypto until it is restarted.
Requires further examination of salt-api and how it handles upgrade and removal of underlying packages that it is using.
@Ch3LL The problem is due to the use of cryptographic packages not being hot-plug capable in Salt. That is, adding a preferred cryptographic package does not immediately imply its usage. Similarly for its removal: parts of the removal are detected and a new user of the package can be re-spun, leading to additional processes.
It is best after the addition or removal of a cryptographic package, to restart all Salt packages which are currently installed and active, for example: salt-minion, salt-master, salt-api. This is similar to changing a configuration parameter in a config file and having to restart the Salt component which utilizes the config file.
Preferred order of cryptographic packages utilized by Salt:
1. M2Crypto
2. pycryptodomex
3. pycrypto
If the preferred cryptographic package is unavailable, the next in the list is tried. Note: pycrypto is a required dependency, that is, at a minimum pycrypto must be available for Salt to install.
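The fallback described above can be illustrated with a minimal sketch. This is not Salt's actual code. The function name `pick_crypto_backend` and the `importer` parameter are hypothetical, and the import names (`M2Crypto`, `Cryptodome`, `Crypto`) are assumed to correspond to the three packages listed.

```python
import importlib

# Assumed import names for the packages in the preference list:
# M2Crypto -> "M2Crypto", pycryptodomex -> "Cryptodome", pycrypto -> "Crypto".
PREFERRED_BACKENDS = ("M2Crypto", "Cryptodome", "Crypto")


def pick_crypto_backend(candidates=PREFERRED_BACKENDS,
                        importer=importlib.import_module):
    """Return the name of the first candidate module that imports cleanly."""
    for name in candidates:
        try:
            importer(name)
        except ImportError:
            # Not installed (or broken): fall through to the next candidate.
            continue
        return name
    raise ImportError("none of %s could be imported" % ", ".join(candidates))
```

A selection like this runs once, when the process first imports its crypto layer. A long-running process therefore keeps whatever it picked at startup, which is consistent with the advice to restart the Salt components after adding or removing one of these packages.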
With removal of python2-pycryptodomex, CherryPy monitors system modules, see https://github.com/cherrypy/cherrypy/blob/master/cherrypy/process/plugins.py#L689-L690
which results in the following occurring (from systemd journal output):
Jun 20 11:43:00 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:00] ENGINE Restarting because /usr/lib64/python2.7/site-packages/Cryptodome/IO/PEM.py changed.
Jun 20 11:43:00 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:00] ENGINE Stopped thread 'Autoreloader'.
Jun 20 11:43:00 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:00] ENGINE Bus STOPPING
Jun 20 11:43:00 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:00] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('0.0.0.0', 8000)) shut down
Jun 20 11:43:00 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:00] ENGINE Stopped thread '_TimeoutMonitor'.
Jun 20 11:43:00 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:00] ENGINE Bus STOPPED
Jun 20 11:43:00 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:00] ENGINE Bus EXITING
Jun 20 11:43:00 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:00] ENGINE Bus EXITED
Jun 20 11:43:00 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:00] ENGINE Waiting for child threads to terminate...
Jun 20 11:43:00 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:00] ENGINE Re-spawning /usr/bin/salt-api
Jun 20 11:43:01 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:01] ENGINE Listening for SIGHUP.
Jun 20 11:43:01 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:01] ENGINE Listening for SIGTERM.
Jun 20 11:43:01 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:01] ENGINE Listening for SIGUSR1.
Jun 20 11:43:01 localhost.localdomain salt-api[10721]: [20/Jun/2018:11:43:01] ENGINE Bus STARTING
Hence the previous recommendation of restarting salt-xxxx components after adding or removing a cryptographic package which Salt utilizes.
@Ch3LL With this information, can this be closed and the docs updated to reflect the need to restart the Salt components?
Note that the code in CherryPy is not limited to Cryptodome, since it iterates over sys.modules.items(). This was demonstrated on an upgrade from 2017.7.6 to 2018.3.1, where ldap.py is noted as changed in the systemd journal output:
Jun 20 14:32:07 localhost.localdomain salt-api[21440]: [20/Jun/2018:14:32:07] ENGINE Restarting because /usr/lib/python2.7/site-packages/salt/auth/ldap.py changed.
Jun 20 14:32:07 localhost.localdomain salt-api[21440]: [20/Jun/2018:14:32:07] ENGINE Stopped thread 'Autoreloader'.
Jun 20 14:32:07 localhost.localdomain salt-api[21440]: [20/Jun/2018:14:32:07] ENGINE Bus STOPPING
Jun 20 14:32:07 localhost.localdomain salt-api[21440]: [20/Jun/2018:14:32:07] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('0.0.0.0', 8000)) shut down
Jun 20 14:32:07 localhost.localdomain salt-api[21440]: [20/Jun/2018:14:32:07] ENGINE Stopped thread '_TimeoutMonitor'.
Jun 20 14:32:07 localhost.localdomain salt-api[21440]: [20/Jun/2018:14:32:07] ENGINE Bus STOPPED
Jun 20 14:32:07 localhost.localdomain salt-api[21440]: [20/Jun/2018:14:32:07] ENGINE Bus EXITING
Jun 20 14:32:07 localhost.localdomain salt-api[21440]: [20/Jun/2018:14:32:07] ENGINE Bus EXITED
Jun 20 14:32:07 localhost.localdomain salt-api[21440]: [20/Jun/2018:14:32:07] ENGINE Waiting for child threads to terminate...
Jun 20 14:32:07 localhost.localdomain salt-api[21440]: [20/Jun/2018:14:32:07] ENGINE Re-spawning /usr/bin/salt-api
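To make the mechanism concrete, here is a minimal sketch of how a reloader that watches every loaded module might detect a change. It is an illustration only, not CherryPy's implementation. The helper names `module_files`, `snapshot`, and `changed_files` are hypothetical.

```python
import os
import sys


def module_files(modules=None):
    """Collect source file paths of the given modules (default: all loaded)."""
    modules = sys.modules if modules is None else modules
    paths = set()
    for mod in list(modules.values()):
        path = getattr(mod, "__file__", None)
        if not path:
            continue  # built-in modules and placeholders have no file
        if path.endswith((".pyc", ".pyo")):
            path = path[:-1]  # watch the .py source, not the bytecode
        paths.add(os.path.abspath(path))
    return paths


def snapshot(paths):
    """Map each path to its modification time, or None if the file is gone."""
    mtimes = {}
    for path in paths:
        try:
            mtimes[path] = os.stat(path).st_mtime
        except OSError:
            mtimes[path] = None
    return mtimes


def changed_files(before, after):
    """Return the sorted paths whose mtime differs between two snapshots."""
    return sorted(p for p in before if before[p] != after.get(p))
```

Because the watch list is built from every loaded module, a package upgrade or removal that touches any imported file (a Cryptodome module, or salt/auth/ldap.py as in the log above) shows up as a change. In this sketch a deleted file also counts as changed, since its recorded mtime becomes None.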
This is going to make any hot-plug changes interesting: how many other packages have such code, which monitors what is in use and detects changes? Note that installing Salt packages whose services are already running causes a restart to be performed to pick up any configuration changes, hence we are back to two salt-api processes in this instance.
root@localhost:~# ps -ef | grep salt-api
root 24436 1 0 14:32 ? 00:00:00 /usr/bin/python /usr/bin/salt-api
root 24612 24436 0 14:32 ? 00:00:00 /usr/bin/python /usr/bin/salt-api
root 25848 23810 0 14:33 pts/3 00:00:00 grep --color=auto salt-api
root@localhost:~#
My concern isn't that you need to restart the service. I understand that requirement. My concern was the process becoming defunct on CentOS 6, which does not occur on CentOS 7. This requires manually stopping the process with a kill signal. Restarting the service with service salt-api restart does not work, as the defunct process stays around.
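One way to check for the defunct process is to look for a `Z` state in `ps` output. The sketch below is an assumption-laden helper, not part of Salt. The name `find_defunct` is hypothetical, and it expects text in the column layout produced by `ps -eo pid,ppid,stat,args`.

```python
def find_defunct(ps_output, name="salt-api"):
    """Return (pid, ppid) pairs for zombie processes whose command mentions name.

    Expects the text produced by: ps -eo pid,ppid,stat,args
    """
    zombies = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # blank or malformed line
        pid, ppid, stat, args = fields
        # A process state beginning with "Z" marks a zombie (defunct) process.
        if stat.startswith("Z") and name in args:
            zombies.append((int(pid), int(ppid)))
    return zombies
```

The parent PID is returned as well because a zombie is only cleared once its parent reaps it or the parent itself exits. That is consistent with the report that a plain service restart left the defunct entry in place.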
So it is a zombie process. I guess we need to generate a list of packages used by Salt for which, if you are going to install or upgrade them, you should stop Salt's running processes, then upgrade, then restart the stopped processes, or else install packages xyz before installing Salt.
The best analogy is that you cannot change the engine while driving down the freeway.
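The stop, upgrade, start sequence suggested above could be sketched as follows. The function name `upgrade_plan` is hypothetical, and using `service` and `yum -y upgrade` is an assumption based on the commands mentioned in this thread. The sketch only builds the command list and does not run anything.

```python
def upgrade_plan(package, services=("salt-api",)):
    """Return the commands, in order, for a stop / upgrade / start cycle."""
    # Stop every affected Salt service before touching the package.
    plan = [["service", svc, "stop"] for svc in services]
    # Upgrade the package while nothing is holding the old modules in memory.
    plan.append(["yum", "-y", "upgrade", package])
    # Start the services again so they import the new version.
    plan.extend(["service", svc, "start"] for svc in services)
    return plan


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in upgrade_plan("python2-pycryptodomex"):
        print(" ".join(cmd))
```

Each entry is an argument list, so a caller that wants to execute the plan could pass the entries one at a time to `subprocess.check_call`, which stops at the first failing step.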
@Ch3LL Is this still a concern, or can we close it? Since we don't do hot-plug, the correct usage is to shut down, then start.
When using a CentOS 6 setup with salt-api and the optional python2-pycryptodomex package, upgrading or removing the pycryptodomex package leaves a defunct salt-api process behind.
Replication Steps:
yum install python2-pycryptodomex
yum upgrade python2-pycryptodomex
WORKAROUND: manually kill the salt-api process, then restart salt-api and everything works.