bitnami / vms

MariaDB Temp files are filling EBS drive and crashing the server #1557

Closed tooliedotter closed 3 hours ago

tooliedotter commented 1 month ago

Platform

AWS

bndiagnostic ID

no ID generated

bndiagnostic output

[Sun Jun 02 03:42:55.046587 2024] [proxy_fcgi:error] [pid 876:tid 139920932611840] (70007)The timeout specified has expired: [client ip_address:58232] AH01075: Error dispatching request to : (polling)
[Sun Jun 02 03:43:05.442584 2024] [proxy_fcgi:error] [pid 981:tid 139921385817856] (70007)The timeout specified has expired: [client ip_address:32424] AH01075: Error dispatching request to : (polling)
Press [Enter] to continue:

2024-06-02 3:01:14 3713 [Warning] Aborted connection 3713 to db: 'gcbjoomla' user: 'root' host: 'localhost' (Got an error writing communication packets)

The diagnostic bundle file was successfully created, but the automatic upload to Bitnami servers failed. You will need to upload it to your Bitnami Support ticket manually.

bndiagnostic was not useful. Could you please tell us why?

These are symptoms, not the cause.

Describe your issue as much as you can

My client has an R5a.large instance with 2 vCPUs, 16GB of memory, and 60GB of EBS space. Her site is currently using about 40GB. There is a query that's running which writes to temporary files on disk, and those temp files are filling up the remaining 20GB of space and crashing the server because the disk is full. Here are the applicable log entries leading up to the crash.

2024-06-02 0:59:40 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2024-06-02 0:59:40 0 [Note] InnoDB: Number of pools: 1
2024-06-02 0:59:40 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2024-06-02 0:59:40 0 [Note] InnoDB: Using Linux native AIO
2024-06-02 0:59:40 0 [Note] InnoDB: Initializing buffer pool, total size = 2147483648, chunk size = 134217728
2024-06-02 0:59:41 0 [Note] InnoDB: Completed initialization of buffer pool
2024-06-02 0:59:41 0 [Note] InnoDB: 128 rollback segments are active.
2024-06-02 0:59:41 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2024-06-02 0:59:41 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2024-06-02 0:59:41 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2024-06-02 0:59:41 0 [Note] InnoDB: 10.6.7 started; log sequence number 321018076631; transaction id 1391989636
2024-06-02 0:59:41 0 [Note] InnoDB: Loading buffer pool(s) from /bitnami/mariadb/data/ib_buffer_pool
2024-06-02 0:59:41 0 [Note] Plugin 'FEEDBACK' is disabled.
2024-06-02 0:59:41 0 [Note] Server socket created on IP: '127.0.0.1'.
2024-06-02 0:59:41 0 [Warning] 'proxies_priv' entry '@% root@ip-10-0-0-247' ignored in --skip-name-resolve mode.
2024-06-02 0:59:41 0 [Note] /opt/bitnami/mariadb/sbin/mysqld: ready for connections.
Version: '10.6.7-MariaDB' socket: '/opt/bitnami/mariadb/tmp/mysql.sock' port: 3306 Source distribution
2024-06-02 0:59:51 0 [Note] InnoDB: Buffer pool(s) load completed at 240602 0:59:51
2024-06-02 1:02:46 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2024-06-02 1:02:46 0 [Note] InnoDB: Number of pools: 1
2024-06-02 1:02:46 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2024-06-02 1:02:46 0 [Note] InnoDB: Using Linux native AIO
2024-06-02 1:02:46 0 [Note] InnoDB: Initializing buffer pool, total size = 2147483648, chunk size = 134217728
2024-06-02 1:02:46 0 [Note] InnoDB: Completed initialization of buffer pool
2024-06-02 1:02:46 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=321018076692,321018152991
2024-06-02 1:02:46 0 [Note] InnoDB: Starting final batch to recover 442 pages from redo log.
2024-06-02 1:02:46 0 [Note] InnoDB: 128 rollback segments are active.
2024-06-02 1:02:46 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2024-06-02 1:02:46 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2024-06-02 1:02:46 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2024-06-02 1:02:46 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2024-06-02 1:02:46 0 [Note] InnoDB: 10.6.7 started; log sequence number 321021321859; transaction id 1391996281
2024-06-02 1:02:46 0 [Note] InnoDB: Loading buffer pool(s) from /bitnami/mariadb/data/ib_buffer_pool
2024-06-02 1:02:46 0 [Note] Plugin 'FEEDBACK' is disabled.
2024-06-02 1:02:46 0 [Note] Server socket created on IP: '127.0.0.1'.
2024-06-02 1:02:46 0 [Warning] 'proxies_priv' entry '@% root@ip-10-0-0-247' ignored in --skip-name-resolve mode.
2024-06-02 1:02:46 0 [Note] /opt/bitnami/mariadb/sbin/mysqld: ready for connections.
Version: '10.6.7-MariaDB' socket: '/opt/bitnami/mariadb/tmp/mysql.sock' port: 3306 Source distribution
2024-06-02 1:02:57 0 [Note] InnoDB: Buffer pool(s) load completed at 240602 1:02:57
2024-06-02 2:31:38 12152 [Warning] mysqld: Disk is full writing '/opt/bitnami/mariadb/tmp/#sql-temptable-bb3-2fbe-1b.MAD' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
2024-06-02 2:31:38 12152 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs

The last two entries repeat several times. Notice that it takes only about 90 minutes after the 1:02 restart to fill the 20GB of free space. In CloudWatch I see the query write up to 4GB of files in anywhere from 5 to 15 minutes.
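As a rough sanity check on those figures, the write rates can be worked out directly. This is only arithmetic on the numbers quoted in this report (20GB free, about 90 minutes to fill, bursts of up to 4GB in 5 to 15 minutes); the variable and function names are made up for illustration.

```python
# Figures quoted in the report above; nothing here is measured by this script.
free_gb = 20         # free EBS space before the query starts writing
fill_minutes = 90    # observed time until "No space left on device"

# Average write rate over the whole run, in GB per minute.
avg_rate = free_gb / fill_minutes

# CloudWatch bursts: up to 4 GB written in 5 to 15 minutes.
burst_gb = 4
burst_fast = burst_gb / 5     # GB per minute during the fastest burst
burst_slow = burst_gb / 15    # GB per minute during the slowest burst


def minutes_to_full(free_space_gb, rate_gb_per_min):
    """Minutes until the free space is exhausted at a constant write rate."""
    return free_space_gb / rate_gb_per_min


print(f"average rate:      {avg_rate:.2f} GB/min")
print(f"burst rate range:  {burst_slow:.2f} to {burst_fast:.2f} GB/min")
print(f"time to fill at the slowest burst rate: {minutes_to_full(free_gb, burst_slow):.0f} min")
print(f"time to fill at the fastest burst rate: {minutes_to_full(free_gb, burst_fast):.0f} min")
```

The average works out to roughly 0.22 GB per minute, a little below the slowest burst rate, which suggests the query does not write at burst speed for the whole 90 minutes.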

Here is the query I also found in the log. We are running Joomla v3.x (soon to migrate to Joomla 5).

Sort aborted, host: localhost, user: root, thread: 14014, query: SELECT a.id, a.title, a.alias, a.introtext, a.fulltext, a.checked_out, a.checked_out_time, a.catid, a.created, a.created_by, a.created_by_alias, CASE WHEN c.published = 2 AND a.state > 0 THEN 2 WHEN c.published != 1 THEN 0 ELSE a.state END as state, CASE WHEN a.modified = '0000-00-00 00:00:00' THEN a.created ELSE a.modified END as modified, a.modified_by, uam.name as modified_by_name, CASE WHEN a.publish_up = '0000-00-00 00:00:00' THEN a.created ELSE a.publish_up END as publish_up,a.publish_down, a.images, a.urls, a.attribs, a.metadata, a.metakey, a.metadesc, a.access, a.hits, a.xreference, a.featured, a.language, LENGTH(a.fulltext) AS readmore, a.ordering,c.title AS category_title, c.path AS category_route, c.access AS category_access, c.alias AS category_alias,c.published, c.published AS parents_published, c.lft, CASE WHEN a.created_by_alias > ' ' THEN a.created_by_alias ELSE ua.name END AS author,ua.email AS author_email,parent.title as parent

I cannot tell where this query is coming from; I'm still researching that.

QUESTION: Is there a setting I can add to MariaDB to limit the on-disk size of temporary files so that it doesn't keep crashing the server?
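While that question is open, one stopgap is to watch the temporary files from outside the database. The sketch below is a minimal example, not a MariaDB setting: it sums the size of files whose names start with `#sql-temptable-` in `/opt/bitnami/mariadb/tmp`, both taken from the "Disk is full" log entry above. The function names and the 5GB threshold are hypothetical, and the script only reports; it does not stop the query.

```python
from pathlib import Path

# Directory and file-name prefix as they appear in the "Disk is full" log entry.
TMP_DIR = Path("/opt/bitnami/mariadb/tmp")
PREFIX = "#sql-temptable-"


def temp_table_bytes(tmp_dir=TMP_DIR, prefix=PREFIX):
    """Return the combined size in bytes of on-disk temporary-table files."""
    total = 0
    for path in Path(tmp_dir).iterdir():
        if path.is_file() and path.name.startswith(prefix):
            try:
                total += path.stat().st_size
            except FileNotFoundError:
                # The server removed the file between listing and stat.
                pass
    return total


def over_limit(limit_gb, tmp_dir=TMP_DIR, prefix=PREFIX):
    """True if the temporary-table files exceed limit_gb gibibytes in total."""
    return temp_table_bytes(tmp_dir, prefix) > limit_gb * 1024**3


if __name__ == "__main__":
    # 5 is an arbitrary example threshold, not a recommended value.
    if TMP_DIR.is_dir() and over_limit(5):
        print("MariaDB temporary-table files exceed 5 GB")
```

Run from cron or a systemd timer every minute or so, a check like this could raise an alert well before the remaining 20GB is gone.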

EDIT: gongomgra removed the link

tooliedotter commented 1 month ago

Please remove the zip file.

gongomgra commented 4 weeks ago

Hi @tooliedotter,

According to the bndiagnostic information you provided, the biggest directory in your installation is Apache (23 GB), while MariaDB is "only" 1.3 GB, but I don't know what is using that much space. We had issues in the past with MariaDB binary logs, but they are disabled in your server.

I'm afraid I don't know what the issue could be here. We recommend that you open a new question in the official Joomla forums, where people with more experience with the application may give you better tips on what is causing these issues (a plugin or a theme, for example).

-----------------------------------
Find biggest directories in installdir
-----------------------------------
Running: du -h . -d 1 | sort -h
In: /opt/bitnami

Output:

28K     ./apps
92K     ./var
564K    ./scripts
956K    ./peclapcu
3.0M    ./stats
4.3M    ./varnish
7.6M    ./gonit
9.2M    ./common
22M     ./git
55M     ./phpmyadmin
65M     ./letsencrypt
84M     ./bndiagnostic
106M    ./nami
121M    ./bncert
131M    ./php
1.3G    ./mariadb
23G     ./apache
25G     .
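
To narrow down what is using the space inside the largest directory, the same one-level listing can be repeated a level deeper. The following is a minimal Python sketch of that idea, assuming read access to the tree; `dir_size` and `biggest_subdirs` are illustrative names, and the totals are apparent file sizes, so they can differ somewhat from what `du` reports.

```python
import os


def dir_size(path):
    """Total size in bytes of regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # File vanished or is unreadable; ignore it.
                pass
    return total


def biggest_subdirs(base):
    """Return (size_bytes, name) for each immediate subdirectory, smallest first."""
    sizes = []
    with os.scandir(base) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((dir_size(entry.path), entry.name))
    return sorted(sizes)
```

For example, `biggest_subdirs("/opt/bitnami/apache")` would list that directory's children with the largest last, in the same order as the `sort -h` output above.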

tooliedotter commented 3 weeks ago

Thanks for the info. I'm getting a lot of blank stares at the Joomla forum so far, along with comments such as "A Date/Time value with all Zero is not a valid Date. It should have been NULL instead" and the suggestion that I could change the SQL_MODE to accept invalid dates. I'd rather not go down that path.

Did you see anything else in the Log entries above that caught your attention, such as:

I trust the way your company puts together your stacks so I haven't messed with these options. We're desperate to fix this issue; I've had to reboot the AWS server FIVE times today because the hard drive (50GB) filled up.

Thoughts?

gongomgra commented 3 weeks ago

Hi @tooliedotter,

I'm afraid I don't know what can be happening, but I can think of two suggestions here:

https://docs.bitnami.com/aws/apps/wordpress/troubleshooting/deny-connections-bots-apache/

github-actions[bot] commented 6 days ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

tooliedotter commented 6 days ago

I have tracked down the location of the query that is assembled and insists on running, but I do not know why it kicks off and continues to run until it crashes the server by filling the EBS. Here's where the query is created, and it's a core Joomla file.

/opt/bitnami/apache/htdocs/_[domain.com]_/components/com_content/models/article.php

You can see the construction of the query there. Any ideas as to what might be causing this query to run?

I'm not the only one who has experienced this issue; here are a couple of examples.

gongomgra commented 5 days ago

@tooliedotter I'm afraid this seems to be an internal issue in Joomla itself and its interaction with the database (according to the links you shared). We recommend that you seek help in a more specialized forum. Sorry for the inconvenience.

gongomgra commented 5 days ago

@tooliedotter I can't find the path you mentioned in the latest release of Joomla (version 4.4.5). Maybe your issue is caused by the version of Joomla you are using?

https://github.com/joomla/joomla-cms/tree/4.4.5/components/com_content

github-actions[bot] commented 4 hours ago

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.