Performance Tuning Meza (MySQL crashes)

Meza can be unstable, even with 2GB of RAM, unless configured properly.

In our staging environment on AWS, which uses t2.small instances, the DB server tends to crash. The simple reason is that Meza does not tune Apache for performance, and the defaults are far too high for a small server: Apache ends up using all the RAM, and the OS ends up killing MySQL.

Environment

Machine or Virtual Machine details: AWS t2.small instance as control node, with a second t2.small node as both app-server and db-server (each with 2GB of RAM)
Operating System: CentOS Linux release 7.4.1708 (Core)
meza version hash: 2b28877 (es128-rebased branch @ https://github.com/freephile/meza)

Note: I do NOT recommend using t2.small instances at all, and find much better pound-for-pound performance at DigitalOcean compared to AWS.

clear ; w ; echo '' ; df -h ; echo '' ; free -m ; echo '' ; service httpd status ; service mysqld status

 13:40:13 up 66 days, 8:20, 2 users, load average: 0.01, 0.04, 0.05
USER     TTY      FROM          LOGIN@   IDLE   JCPU   PCPU  WHAT
centos   pts/0    10.0.0.210    Tue21    5.00s  0.34s  0.33s screen -dr
centos   pts/2    :pts/0:S.0    Tue21    5.00s  0.06s  0.86s SCREEN

Filesystem            Size  Used Avail Use% Mounted on
/dev/xvdf1            100G   30G   71G  30% /
devtmpfs              900M     0  900M   0% /dev
tmpfs                 920M     0  920M   0% /dev/shm
tmpfs                 920M  108M  812M  12% /run
tmpfs                 920M     0  920M   0% /sys/fs/cgroup
tmpfs                 184M     0  184M   0% /run/user/1004
tmpfs                 184M     0  184M   0% /run/user/1000
tmpfs                 184M     0  184M   0% /run/user/1001
10.0.50.161:/gluster   50G   25G   26G  50% /opt/data-meza/uploads-gluster

              total        used        free      shared  buff/cache   available
Mem:           1838         999         373         152         464         499
Swap:             0           0           0

/bin/systemctl status httpd.service
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-10-17 21:45:42 UTC; 1 day 15h ago
     Docs: man:httpd(8)
           man:apachectl(8)
 Main PID: 30623 (httpd)
   Status: "Total requests: 8178; Current requests/sec: 0.1; Current traffic: 0 B/sec"
   CGroup: /system.slice/httpd.service
           ├─28257 /usr/sbin/httpd -DFOREGROUND
           ├─28260 /usr/sbin/httpd -DFOREGROUND
           ├─28263 /usr/sbin/httpd -DFOREGROUND
           ├─28483 /usr/sbin/httpd -DFOREGROUND
           ├─28484 /usr/sbin/httpd -DFOREGROUND
           ├─28485 /usr/sbin/httpd -DFOREGROUND
           ├─30623 /usr/sbin/httpd -DFOREGROUND
           ├─30625 /usr/sbin/httpd -DFOREGROUND
           ├─30626 /usr/sbin/httpd -DFOREGROUND
           ├─30627 /usr/sbin/httpd -DFOREGROUND
           └─30629 /usr/sbin/httpd -DFOREGROUND

Oct 17 21:45:42 ip-10-0-50-161.ec2.internal systemd[1]: Starting The Apache HTTP Server...
Oct 17 21:45:42 ip-10-0-50-161.ec2.internal httpd[30623]: [Tue Oct 17 21:45:42.656812 2017] [so:warn] [pid 30623] AH01574: module php5_module is already loaded, skipping
Oct 17 21:45:42 ip-10-0-50-161.ec2.internal systemd[1]: Started The Apache HTTP Server.

/bin/systemctl status mysqld.service
● mysqld.service - MySQL Community Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-10-17 12:54:31 UTC; 2 days ago
 Main PID: 21965 (mysqld_safe)
   CGroup: /system.slice/mysqld.service
           ├─21965 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
           └─22132 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/my...

Oct 17 12:54:29 ip-10-0-50-161.ec2.internal systemd[1]: Starting MySQL Community Server...
Oct 17 12:54:30 ip-10-0-50-161.ec2.internal mysqld_safe[21965]: 171017 12:54:30 mysqld_safe Logging to '/var/log/mysqld.log'.
Oct 17 12:54:30 ip-10-0-50-161.ec2.internal mysqld_safe[21965]: 171017 12:54:30 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Oct 17 12:54:31 ip-10-0-50-161.ec2.internal systemd[1]: Started MySQL Community Server.

Issue details

As described here, MySQL can crash when the system is starved of memory. It's not Meza's fault. It's not MySQL's fault. It's just a symptom of an overloaded machine. Apache hogs too much memory, and the Linux kernel's OOM killer sacrifices MySQL to keep the system up. This is most likely to happen if you're forced to deploy as a monolith (everything on one node) or in a "mixed" environment (where your DB server is also an app-server).
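To check whether the OOM killer really was responsible for a MySQL crash, the kernel log can be searched for its messages. The sketch below is one way to do that; the function name is made up for illustration, and the exact wording of the kernel messages can vary between kernel versions, so the pattern is deliberately loose.

```shell
# Filter kernel log lines on stdin, keeping only those that look like
# OOM-killer activity. (oom_events is a hypothetical helper name.)
oom_events() {
  grep -iE 'out of memory|killed process'
}

# Usage on the server (reading the kernel log may require root):
#   dmesg -T | oom_events
#   journalctl -k | oom_events
```

If mysqld shows up in the matching lines around the time of a crash, memory pressure is the likely cause rather than a MySQL fault.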

In Meza, the default innodb_buffer_pool_size is 256M, which is hardly "too big" in general. Depending on the size of the environment you're trying to run on, though, it may in fact be too big for you. From src/roles/database/defaults/main.yml:

mysql_innodb_buffer_pool_size: "256M"

You can change this value, or you can override it with a '.cnf' file using the mysql_config_include_files Meza option.
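As a rough illustration of the override approach, the snippet below writes a small '.cnf' file that lowers the buffer pool. The file name and the 128M value are assumptions chosen for the example, not recommendations from Meza; the file would then be listed under mysql_config_include_files, whose exact list format should be checked against the database role's defaults.

```shell
# Write a MySQL override file (hypothetical name and value).
cnf_file="innodb-small.cnf"

cat > "$cnf_file" <<'EOF'
[mysqld]
# Assumed value for a 2GB node that also runs Apache; tune to your workload.
innodb_buffer_pool_size = 128M
EOF
```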

You can alleviate some issues if you make sure that the database server is dedicated to being ONLY a database server in Meza.

On the other hand, if you're forced to deploy in a monolith or "mixed" environment, you can limit how much memory Apache takes up. I've created a discussion thread at https://discourse.equality-tech.com/t/how-do-i-optimize-apache/115 to focus on performance tuning of Apache for Meza.
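One common way to pick an Apache limit is to divide the memory left over after MySQL and the OS by the average size of an httpd process. The sketch below does that arithmetic. It assumes the prefork MPM (one process per worker), and the function names, the reserved amount, and the per-process size in the usage line are all illustrative assumptions; only the 1838 MB total comes from the free output above.

```shell
# Estimate a ceiling for Apache worker processes.
# Args: total RAM (MB), RAM reserved for MySQL and the OS (MB),
#       average resident size of one httpd process (MB).
estimate_max_workers() {
  local total_mb=$1 reserved_mb=$2 per_worker_mb=$3
  echo $(( (total_mb - reserved_mb) / per_worker_mb ))
}

# Average resident size of the running httpd processes, in MB.
# Prints 0 when no httpd process is found.
avg_httpd_rss_mb() {
  ps -C httpd -o rss= |
    awk '{ sum += $1; n++ } END { if (n) printf "%d\n", sum / n / 1024; else print 0 }'
}

# Usage (768 and 60 are assumed figures, not measurements):
#   estimate_max_workers 1838 768 60
```

The result is a starting point for a setting such as MaxRequestWorkers, to be adjusted after watching real memory use under load.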

This is not related to MySQL, Apache, or even system stability, but if you run into memory errors during play execution, you can set the "forks" option in Ansible to 1 by editing either the system-wide config at /etc/ansible/ansible.cfg or the Meza-specific config at /opt/meza/config/core/ansible.cfg.
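For reference, the forks setting lives in the [defaults] section of ansible.cfg. The helper below (a hypothetical name, shown only as a sketch) prints whatever forks line a given config file currently contains, which is a quick way to confirm an edit took effect.

```shell
# Print any "forks" setting found in an Ansible config file.
# (show_forks is a made-up helper name for this example.)
show_forks() {
  grep -E '^[[:space:]]*forks[[:space:]]*=' "$1"
}

# The setting belongs under [defaults], for example:
#   [defaults]
#   forks = 1
#
# Usage on a Meza server:
#   show_forks /opt/meza/config/core/ansible.cfg
#   show_forks /etc/ansible/ansible.cfg
```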

Finally, if Meza were to support or convert to running on Nginx, it would likely have lower memory requirements.
