Nov 14 01:59:00 dev ssm-mysql-queries[2424367]: 2024/11/14 01:59:00.003148 WARNING qan-interval stat /var/lib/mysql/slow.log: no such file or directory
And that in turn breaks query harvesting because the base slow.log no longer exists (but rotated files do.
Oddly, flush logs in mysql shell doesn't seem to get the slow.log re-created.
This seems to happen under extremely high load, slow logs get significantly oversized, they should be rotated after each GB:
-rw-rw----. 1 mysql mysql 4.7G Nov 14 01:50 /var/lib/mysql/slow.log-1731541839
-rw-rw----. 1 mysql mysql 5.0G Nov 14 02:10 /var/lib/mysql/slow.log-1731543030
For testing, I generated the load using:
sysbench oltp_read_write --table-size=1000000 --threads=32 --time=86400 --db-driver=mysql --mysql-db=test run
This seems to happen with MariaDB 10.5.27.
With 10.6.20, it seems to produce a lot of this:
[root@dev /var/log]# tail -f messages | grep qan
Nov 14 02:30:54 dev ssm-mysql-queries[2970254]: 2024/11/14 02:30:54.004858 WARNING qan-analyzer-mysql-9f9521f8-worker Timeout parsing /var/lib/mysql/slow.log 152433581-915701237: 72.2% 660966394/915701237 763267656 54.0s
Nov 14 02:32:00 dev ssm-mysql-queries[2970254]: 2024/11/14 02:32:00.007361 WARNING qan-analyzer-mysql-9f9521f8 Skipping interval '6 /var/lib/mysql/slow.log 2024-11-14 00:31:00 UTC to 2024-11-14 00:32:00 UTC (0-124364369)' because interval '5 /var/lib/mysql/slow.log 2024-11-14 00:30:00 UTC to 2024-11-14 00:31:00 UTC (915701237-1678061887)' is still being parsed
Nov 14 02:33:00 dev ssm-mysql-queries[2970254]: 2024/11/14 02:33:00.004998 WARNING qan-analyzer-mysql-9f9521f8 Skipping interval '7 /var/lib/mysql/slow.log 2024-11-14 00:32:00 UTC to 2024-11-14 00:33:00 UTC (124364369-891628602)' because interval '5 /var/lib/mysql/slow.log 2024-11-14 00:30:00 UTC to 2024-11-14 00:31:00 UTC (915701237-1678061887)' is still being parsed
Nov 14 02:34:00 dev ssm-mysql-queries[2970254]: 2024/11/14 02:34:00.003638 WARNING qan-analyzer-mysql-9f9521f8 Skipping interval '8 /var/lib/mysql/slow.log 2024-11-14 00:33:00 UTC to 2024-11-14 00:34:00 UTC (891628602-1662080490)' because interval '5 /var/lib/mysql/slow.log 2024-11-14 00:30:00 UTC to 2024-11-14 00:31:00 UTC (915701237-1678061887)' is still being parsed
Nov 14 02:35:00 dev ssm-mysql-queries[2970254]: 2024/11/14 02:35:00.007736 WARNING qan-analyzer-mysql-9f9521f8 Skipping interval '9 /var/lib/mysql/slow.log 2024-11-14 00:34:00 UTC to 2024-11-14 00:35:00 UTC (1662080490-2419821122)' because interval '5 /var/lib/mysql/slow.log 2024-11-14 00:30:00 UTC to 2024-11-14 00:31:00 UTC (915701237-1678061887)' is still being parsed
but it looks like when it times out, it doesn't upload any data. It looks like once it times out, it never uploads anything after that. Like a status check gets stuck, but the harvesting aborts. That "interval 5" stays there forever.
Under high load, log rotation goes wrong:
And that in turn breaks query harvesting because the base slow.log no longer exists (but rotated files do. Oddly,
flush logs
in mysql shell doesn't seem to get the slow.log re-created.This seems to happen under extremely high load, slow logs get significantly oversized, they should be rotated after each GB:
For testing, I generated the load using:
This seems to happen with MariaDB 10.5.27. With 10.6.20, it seems to produce a lot of this:
but it looks like when it times out, it doesn't upload any data. It looks like once it times out, it never uploads anything after that. Like a status check gets stuck, but the harvesting aborts. That "interval 5" stays there forever.
Tested with ssm-client 9.4.1-11
Note: Special handling of slow log rotation is required until this bug is fixed: https://jira.mariadb.org/browse/MDEV-35444