Closed gsanchietti closed 5 years ago
Test case
Before update:
[root@ns76-ent html]# curl localhost & time systemctl restart httpd
[1] 8475
curl: (52) Empty reply from server
[1]+ Exit 52 curl localhost
real 1m30.196s
user 0m0.020s
sys 0m0.008s
After update:
[root@ns76-ent html]# curl localhost & time systemctl restart httpd
[1] 10527
curl: (52) Empty reply from server
[1]+ Exit 52 curl localhost
real 0m5.201s
user 0m0.012s
sys 0m0.003s
FTR:
[root@ns76-ent html]# cat /var/www/html/index.php
<?php
for ($x = 0; $x <= 100; $x++) {
echo "The number is: $x <br>";
sleep (1);
}
?>
Tested in production, no visible problems. Note that the 5-second timeout was reached.
[root@nethservice ~]# time systemctl restart httpd
real 0m5.437s
user 0m0.002s
sys 0m0.008s
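For repeated checks, the timing above can be reduced to a single number. This is only a sketch: `elapsed_seconds` is a hypothetical helper name, not part of any package, and it reports whole seconds of wall-clock time (the "real" value from `time`, rounded).

```shell
#!/bin/sh
# Hypothetical helper: run a command and print its wall-clock duration
# in whole seconds. The command's own output is discarded.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example on a test host (needs root and a running httpd):
#   curl -s localhost >/dev/null &
#   elapsed_seconds systemctl restart httpd
```

With the 5-second stop timeout in place, the printed value should stay close to 5 even while a slow request is still being served.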
QA
# rpm -qa nethserver-httpd
nethserver-httpd-3.4.0-1.4.g7622a30.ns7.noarch
[root@ns7loc15 ~]# systemctl cat httpd.service
......
# /etc/systemd/system/httpd.service.d/quick_kill.conf
[Service]
TimeoutStopSec=5
The patch shortens the time systemd waits for httpd to stop before restarting it. Previously a restart could take a long time when several applications such as WebTop and SOGo were installed on the same server; after the patch, httpd restarts quickly.
Before the testing RPM:
[root@ns7loc15 ~]# time systemctl restart httpd
real 1m30.587s
user 0m0.046s
sys 0m0.040s
After the testing RPM:
[root@ns7loc15 ~]# time systemctl restart httpd
real 0m2.465s
user 0m0.020s
sys 0m0.067s
[root@ns7loc15 ~]# time systemctl restart httpd
real 0m4.403s
user 0m0.047s
sys 0m0.060s
I could not trigger an error during the httpd restart with WebTop 5, but I did with SOGo: when httpd needs 90 seconds to restart, the httpd service is unavailable, and so is SOGo.
After the upgrade, all users can log in to WebTop or SOGo.
proposed verified
Nowadays many web applications use WebSockets and long-lived connections. When a web application (e.g. Nextcloud, WebTop) is updated, the httpd daemon is usually restarted.
On a production server with many connected users, most web applications remain unresponsive for around 1.5 minutes. This is due to how httpd is restarted.
The systemd unit shipped with CentOS (/usr/lib/systemd/system/httpd.service) sends a WINCH signal to the httpd daemon on restart. As soon as Apache receives the signal, it stops accepting new connections and waits indefinitely until all remaining requests have been fully served (https://httpd.apache.org/docs/current/mod/mpm_common.html#gracefulshutdowntimeout). Because the unit sets KillSignal=SIGCONT, systemd then sends a harmless SIGCONT instead of SIGTERM. Finally, if httpd has not stopped after DefaultTimeoutStopSec seconds (default is 90s inside /etc/systemd/system.conf), systemd kills it with SIGKILL.
Proposed solution
Change the default TimeoutStopSec for the httpd systemd unit to get a faster restart at the cost of losing some existing connections. Most modern web applications can handle reconnection, and most older web applications can tolerate a slow server response of around 5 to 10 seconds.
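A minimal sketch of the override, for anyone who wants to try it by hand. The file name and contents match the quick_kill.conf drop-in shown in the QA output; the function name `write_quick_kill_dropin` is hypothetical, and how the nethserver-httpd package itself installs the file is not shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in that shortens the stop timeout.
#   $1  drop-in directory (on a real system:
#       /etc/systemd/system/httpd.service.d)
#   $2  stop timeout in seconds (this issue uses 5)
write_quick_kill_dropin() {
    dropin_dir=$1
    timeout_sec=$2
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nTimeoutStopSec=%s\n' "$timeout_sec" \
        > "$dropin_dir/quick_kill.conf"
}

# Example on a real host (needs root):
#   write_quick_kill_dropin /etc/systemd/system/httpd.service.d 5
#   systemctl daemon-reload
#   systemctl cat httpd.service
```

A drop-in leaves the CentOS unit file untouched, so the override survives httpd package updates. `systemctl daemon-reload` is needed before systemd picks up the new value.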