Closed jouvin closed 10 years ago
The delayed processing has been broken by the fact that the is_executable method in CAF::Process runs the command... Thus the delayed processing is configured after the end of the command...
if ( $p->is_executable() ) {
# Delay processing of some signals
delay_signals();
We clearly need at least a workaround in the 14.8.0 final release... If the problem is difficult to solve in CAF, we can make the change in signal handling independent of whether we can execute the command...
Forget my previous comment... is_executable() is not the problem and does just what is expected!
I reviewed the code doing the signal handling and could not find anything wrong. The previous commit logs actions related to delayed signal processing at info level. This should help troubleshoot this problem and is better information for site admins anyway.
I suggest merging this in 14.8.0 and going on with the release. The problem itself is harmless with the ncm-spma fix, and the improved logging will help to troubleshoot the problem... at the next Quattor update after the release (next release RC).
More debugging output will be analysed during the RC cycle.
After detailed analysis of the logs during 14.10.0-rc2, I confirm that everything works as expected. The uncompleted YUM transactions can no longer be seen, but this is probably because ncm-spma runs yum-complete-transaction. Anyway, with delayed signal processing, there should be no uncompleted transaction left...
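For readers unfamiliar with the idea, delayed signal processing can be sketched roughly as below. This is only an illustration under my own assumptions, not the actual ncm-cdispd code: with_delayed_signals is a hypothetical helper, and the real implementation may work differently (for example by blocking signals rather than recording them).

```perl
use strict;
use warnings;

# Hypothetical helper (NOT the ncm-cdispd implementation): run $critical
# while the listed signals are only recorded, then return the names of the
# signals received meanwhile so that the caller can act on them afterwards.
sub with_delayed_signals {
    my ($signals, $critical) = @_;
    my @pending;

    # Save the current handlers so that they can be restored afterwards
    my %saved = map { $_ => $SIG{$_} } @$signals;
    for my $sig (@$signals) {
        $SIG{$sig} = sub { push @pending, $sig };
    }

    # Run the critical section (e.g. a YUM transaction), trapping errors
    # so that the handlers are restored even if it dies
    my $ok  = eval { $critical->(); 1 };
    my $err = $@;

    for my $sig (@$signals) {
        $SIG{$sig} = defined $saved{$sig} ? $saved{$sig} : 'DEFAULT';
    }
    die $err unless $ok;

    return @pending;
}
```

A caller would then do something like my @late = with_delayed_signals(['TERM', 'INT'], sub { ... }); and exit cleanly if @late is not empty, which is what should prevent a transaction from being interrupted halfway.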
There are as many restarts of ncm-ncd as there are RPM scripts doing a ncm-cdispd restart... Currently we have two: ncm-cdispd and ncm-cdp-listend. There is not much that can be done, as the new ncm-cdispd process is started immediately, before completion of the existing one, and waits for the first one to complete before really running the components (ncm-ncd lock). Apart from the fact that this is a bit surprising when you look at the logs, this is harmless (as long as configuration modules are idempotent... but they need to be by design!) and avoids a more complex strategy to properly handle the TERM signal.
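To illustrate why the second run simply waits and then harmlessly redoes the work, here is a minimal sketch of running an action under an exclusive lock. It assumes an flock-style lock file, which may not be how the ncm-ncd lock is actually implemented; run_under_lock and the lock path passed to it are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper (the real ncm-ncd lock may be implemented differently):
# wait for an exclusive lock on $path, run $work, then release the lock.
sub run_under_lock {
    my ($path, $work) = @_;

    open(my $fh, '>>', $path) or die "Cannot open $path: $!";
    # Blocks until any other holder releases the lock
    flock($fh, LOCK_EX) or die "Cannot lock $path: $!";

    my @result;
    my $ok  = eval { @result = $work->(); 1 };
    my $err = $@;

    flock($fh, LOCK_UN);
    close($fh);

    die $err unless $ok;
    return @result;
}
```

With such a lock, two overlapping runs are serialized: the second one starts only after the first has finished, and if the action is idempotent it just finds the system already in the desired state.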
I'm in favor of closing this issue, if nobody objects.
Closing after 14.10.0-RC3... Reopen if seen again or if any sign of a hidden problem...
After deployment of a new Quattor version, there are 2 uncompleted YUM transactions left in 14.8.0-rc5 (https://github.com/quattor/release/issues/54). This means that delayed signal processing does not work as expected, at least during the upgrade process. It was well tested when added in July but a race condition may be present...
This is harmless as we run yum-complete-transaction as part of SPMA, as long as https://github.com/quattor/configuration-modules-core/issues/294 is merged. This issue probably explains #14.