enterprisemediawiki / meza

Setup an enterprise MediaWiki server with simple commands
MIT License

Apache processes consuming full CPU load #865

Open darenwelsh opened 7 years ago

darenwelsh commented 7 years ago

Environment

Issue details

At 3:00 PM local time (Central Time Zone), one of our server administrators reported that our production VM was running at 100% CPU load (across all four CPUs). Per top, there were several apache processes running that were consuming the CPU load. So I ran sudo apachectl restart. This seemed to solve the issue, but after 5-10 minutes, there were several apache processes running again and consuming the CPU load.
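
One way to record which Apache workers are busiest while the load is high is to snapshot ps output and rank the matching processes. The following is a minimal sketch, not what was run on this server. It assumes the workers appear under the command name httpd (typical on CentOS, but not confirmed for this VM), and the function name and sample text are made up for illustration. Note that ps reports CPU averaged over each process's lifetime, so the figures will not match top exactly.

```python
def busiest_processes(ps_output, name="httpd", limit=5):
    """Return up to `limit` (pid, cpu_percent) pairs for processes named `name`,
    highest CPU first. Expects text in the layout of `ps -eo pid,pcpu,comm`."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # ignore lines that do not have all three columns
        pid, pcpu, comm = parts
        if comm.strip() == name:
            rows.append((int(pid), float(pcpu)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


# Made-up sample in the layout of `ps -eo pid,pcpu,comm`; not real server data.
SAMPLE_PS = """\
  PID %CPU COMMAND
 1201 98.7 httpd
 1202 97.9 httpd
  842  0.3 sshd
 1203  1.2 httpd
"""

print(busiest_processes(SAMPLE_PS))
```

On a live system the input could come from `subprocess.run(["ps", "-eo", "pid,pcpu,comm"], capture_output=True, text=True).stdout`.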

The server admin noted that yum pushed package updates this morning. The following is from /var/log/yum.log:

Sep 05 05:15:24 Updated: systemd-libs.x86_64 219-42.el7_4.1
Sep 05 05:15:25 Updated: libvirt-libs.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:25 Updated: qemu-img.x86_64 10:1.5.3-141.el7_4.2
Sep 05 05:15:25 Updated: seabios-bin.noarch 1.10.2-3.el7_4.1
Sep 05 05:15:25 Updated: seavgabios-bin.noarch 1.10.2-3.el7_4.1
Sep 05 05:15:25 Updated: ncurses-base.noarch 5.9-14.20130511.el7_4
Sep 05 05:15:25 Updated: ncurses-libs.x86_64 5.9-14.20130511.el7_4
Sep 05 05:15:25 Updated: bash.x86_64 4.2.46-29.el7_4
Sep 05 05:15:26 Updated: openssh.x86_64 7.4p1-12.el7_4
Sep 05 05:15:34 Updated: selinux-policy.noarch 3.13.1-166.el7_4.4
Sep 05 05:15:35 Updated: binutils.x86_64 2.25.1-32.base.el7_4.1
Sep 05 05:15:35 Updated: cpio.x86_64 2.11-25.el7_4
Sep 05 05:15:36 Updated: kmod.x86_64 20-15.el7_4.2
Sep 05 05:15:39 Installed: kernel.x86_64 3.10.0-693.2.1.el7
Sep 05 05:15:39 Updated: nss-softokn-freebl.x86_64 3.28.3-8.el7_4
Sep 05 05:15:39 Updated: ncurses.x86_64 5.9-14.20130511.el7_4
Sep 05 05:15:39 Updated: libvirt-client.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:39 Updated: libsss_nss_idmap.x86_64 1.15.2-50.el7_4.2
Sep 05 05:15:39 Updated: libsss_idmap.x86_64 1.15.2-50.el7_4.2
Sep 05 05:15:40 Updated: ipxe-roms-qemu.noarch 20170123-1.git4e85b27.el7_4.1
Sep 05 05:15:40 Updated: kernel-tools-libs.x86_64 3.10.0-693.2.1.el7
Sep 05 05:15:40 Updated: kmod-libs.x86_64 20-15.el7_4.2
Sep 05 05:15:42 Updated: systemd.x86_64 219-42.el7_4.1
Sep 05 05:15:42 Updated: samba-common.noarch 4.6.2-10.el7_4
Sep 05 05:15:42 Updated: libwbclient.x86_64 4.6.2-10.el7_4
Sep 05 05:15:43 Updated: samba-client-libs.x86_64 4.6.2-10.el7_4
Sep 05 05:15:43 Updated: samba-common-libs.x86_64 4.6.2-10.el7_4
Sep 05 05:15:43 Updated: libsmbclient.x86_64 4.6.2-10.el7_4
Sep 05 05:15:43 Updated: systemd-sysv.x86_64 219-42.el7_4.1
Sep 05 05:15:43 Updated: libvirt-daemon.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-storage-core.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-network.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-nwfilter.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-qemu.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-interface.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-secret.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-nodedev.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-config-nwfilter.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-lxc.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-config-network.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-storage-gluster.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-storage-scsi.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-storage-iscsi.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-storage-disk.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-storage-logical.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-storage-rbd.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-storage-mpath.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:43 Updated: libvirt-daemon-driver-storage.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:44 Updated: qemu-kvm-common.x86_64 10:1.5.3-141.el7_4.2
Sep 05 05:15:44 Updated: qemu-kvm.x86_64 10:1.5.3-141.el7_4.2
Sep 05 05:15:44 Updated: libvirt-daemon-kvm.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:44 Updated: libguestfs.x86_64 1:1.36.3-6.el7_4.3
Sep 05 05:15:44 Updated: libvirt.x86_64 3.2.0-14.el7_4.3
Sep 05 05:15:44 Updated: samba-client.x86_64 4.6.2-10.el7_4
Sep 05 05:15:44 Updated: systemd-python.x86_64 219-42.el7_4.1
Sep 05 05:15:45 Updated: openssh-server.x86_64 7.4p1-12.el7_4
Sep 05 05:15:45 Updated: kernel-tools.x86_64 3.10.0-693.2.1.el7
Sep 05 05:15:45 Updated: sssd-client.x86_64 1.15.2-50.el7_4.2
Sep 05 05:15:45 Updated: nss-softokn.x86_64 3.28.3-8.el7_4
Sep 05 05:16:03 Updated: selinux-policy-devel.noarch 3.13.1-166.el7_4.4
Sep 05 05:16:15 Updated: selinux-policy-targeted.noarch 3.13.1-166.el7_4.4
Sep 05 05:16:15 Updated: openssh-clients.x86_64 7.4p1-12.el7_4
Sep 05 05:16:15 Updated: scl-utils.x86_64 20130529-18.el7_4
Sep 05 05:16:15 Updated: sudo.x86_64 1.8.19p2-11.el7_4
Sep 05 05:16:16 Updated: ncurses-devel.x86_64 5.9-14.20130511.el7_4
Sep 05 05:16:31 Installed: kernel-devel.x86_64 3.10.0-693.2.1.el7
Sep 05 05:16:32 Updated: ghostscript.x86_64 9.07-28.el7_4.2
Sep 05 05:16:32 Updated: libgudev1.x86_64 219-42.el7_4.1
Sep 05 05:16:32 Updated: http-parser.x86_64 2.7.1-5.el7_4
Sep 05 05:16:32 Updated: python-perf.x86_64 3.10.0-693.2.1.el7
Sep 05 05:16:33 Updated: kernel-headers.x86_64 3.10.0-693.2.1.el7
Sep 05 05:16:33 Updated: nss-softokn-freebl.i686 3.28.3-8.el7_4
Sep 05 05:16:33 Updated: systemd-libs.i686 219-42.el7_4.1

darenwelsh commented 7 years ago

/var/log/httpd/error_log has a bunch of Warning Interned string buffer overflow messages, but these begin long before the CPU usage issue that started today.

darenwelsh commented 7 years ago

Per @freephile

try

    <?php
    $wgMemoryLimit = 500000000; // Default is 50M. This is 500M

in /opt/conf-meza/public/postLocalSettings.d/overrides.php to cure those interned string buffer overflow messages

I guess we can move that to a separate issue.

freephile commented 7 years ago

And increasing memory may not actually be the correct way to solve those Interned string messages. See https://superuser.com/questions/969386/php-5-6-what-does-opcache-interned-string-buffers-overflow-mean for details. It looks like there are some php.ini settings that control the amount of memory in the APC opcache. Also, a link on that SU page points to a simple repo from Rasmus Lerdorf for creating a status page for the APC cache (it used to be included in the PEAR package, but now it's a separate file).

darenwelsh commented 7 years ago

Somewhere around 4:25 PM local time, the issue "resolved itself" - apache processes went back to normal usage.

freephile commented 7 years ago

Wow, I was going to suggest using # lsof to see what might be the issue. Check the size of the Apache logs. If a log file grew extremely large due to some type of attack, maybe that was causing the CPU problem? Then, if the log was rotated at 4:25, the problem would go away.
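
The log-size check suggested above could be scripted roughly as follows. This is only a sketch: the directory /var/log/httpd is taken from the error_log path mentioned earlier in the thread, while the function name and the threshold are hypothetical choices.

```python
from pathlib import Path


def large_files(directory, threshold_bytes):
    """Return (name, size_in_bytes) for regular files in `directory` whose size
    is at least `threshold_bytes`, largest first."""
    found = []
    for path in Path(directory).iterdir():
        if path.is_file():
            size = path.stat().st_size
            if size >= threshold_bytes:
                found.append((path.name, size))
    return sorted(found, key=lambda item: item[1], reverse=True)


# Example call (hypothetical 1 GiB threshold; reading the directory may need root):
# large_files("/var/log/httpd", 1024 ** 3)
```

An empty result would make the oversized-log theory less likely, though it would not rule out a log that had already been rotated away.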

darenwelsh commented 7 years ago

Per @jamesmontalvo3, this may be related to #850, and we may need to manually uninstall the http-parser package that may have been installed.
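
To confirm whether and when http-parser was touched, the yum log excerpt posted earlier can be filtered by package name. The sketch below assumes every relevant line has the same "timestamp, action, package, version" layout as that excerpt. The regular expression and function name are illustrative, and the sample lines are copied from the log above.

```python
import re

# Matches lines such as:
#   Sep 05 05:16:32 Updated: http-parser.x86_64 2.7.1-5.el7_4
YUM_LINE = re.compile(
    r"^(?P<timestamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<action>Installed|Updated): "
    r"(?P<package>\S+) (?P<version>\S+)$"
)


def find_package_events(log_text, package_prefix):
    """Return (timestamp, action, package, version) tuples for log lines whose
    package name starts with `package_prefix`, in log order."""
    events = []
    for line in log_text.splitlines():
        match = YUM_LINE.match(line.strip())
        if match and match.group("package").startswith(package_prefix):
            events.append(
                (
                    match.group("timestamp"),
                    match.group("action"),
                    match.group("package"),
                    match.group("version"),
                )
            )
    return events


# Three lines copied from the /var/log/yum.log excerpt earlier in this thread.
SAMPLE_LOG = """\
Sep 05 05:16:32 Updated: ghostscript.x86_64 9.07-28.el7_4.2
Sep 05 05:16:32 Updated: libgudev1.x86_64 219-42.el7_4.1
Sep 05 05:16:32 Updated: http-parser.x86_64 2.7.1-5.el7_4
"""

print(find_package_events(SAMPLE_LOG, "http-parser"))
```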

freephile commented 7 years ago

After further investigation into the Apache configuration of Meza (there is none related to performance characteristics, so defaults are used), I filed Issue #867, which is very likely to be the source of the issue observed here. My suggestion would be to review https://discourse.equality-tech.com/t/how-do-i-optimize-apache/115 and use the given calculations, plus the perl "Shortcut", to determine a suitable configuration for your environment, then test with Apache Bench or a similar tool. The resolution to #867 would be to add Apache tuning to Meza.
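
The calculations in the linked post are not reproduced here. As a rough illustration of the kind of estimate involved, a common rule of thumb caps the number of Apache workers at the memory left over for Apache divided by the average size of one worker process. The sketch below assumes that rule; the function name and all input figures are hypothetical and would need to be measured on the actual server.

```python
def estimate_max_request_workers(total_ram_mb, reserved_mb, avg_worker_mb):
    """Estimate a ceiling for Apache's MaxRequestWorkers directive.

    total_ram_mb:  total memory on the server
    reserved_mb:   memory set aside for everything else (database, OS, caches)
    avg_worker_mb: average resident size of one Apache worker process
    """
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    if available_mb <= 0:
        return 1  # nothing to spare; allow a single worker as a floor
    return max(1, int(available_mb // avg_worker_mb))


# Hypothetical figures: 8 GB of RAM, 2 GB reserved, 60 MB per worker.
print(estimate_max_request_workers(8192, 2048, 60))
```

Any value derived this way is a starting point to validate under load, for example with Apache Bench as suggested above.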