MariaDB starts crash recovery after every restart

nupplaphil commented 1 year ago

I'm using the Docker image for MariaDB a long time. And I'm facing this issue for the last 1/2 year I guess. After unsuccessfully playing around with the Docker settings and the performance-settings of MariaDB itself for this range of time, I'm now here :-) tbh I still don't know if there's something wrong with my setup or with MariaDB itself.

The setup:

Hetzner root server (128 GB RAM, i7-8700, 2x 1t NVMe)
I'm using Docker for every service I provide, there's no other virtualization
There are a handful tiny services (cloud, mail, ...) for family business
And two services for ~100 active users per day each (Friendica)

And this issue occurs at the two "power-services". The DB size is 33GB and 108GB

My docker-compose file:

version: '2.1'
services:

  db:
    image: mariadb
    volumes:
      - mysql-vol-1:/var/lib/mysql/
      - ./config/db/performance.cnf:/etc/mysql/conf.d/performance.cnf
    environment:
      - MYSQL_USER_FILE=/run/secrets/mysql_user
      - MYSQL_PASSWORD_FILE=/run/secrets/mysql_password
      - MYSQL_DATABASE=/run/secrets/mysql_database
      - MYSQL_ROOT_PASSWORD_FILE=/run/secrets/mysql_root_password
    restart: always
    secrets:
      - mysql_database
      - mysql_user
      - mysql_password
      - mysql_root_password
    tmpfs:
      - /tmp
    mem_limit: 32g
    memswap_limit: 32g
    mem_reservation: 4g
    ulimits:
      nofile:
        soft: 102400
        hard: 102400
      nproc:
        soft: 102400
        hard: 102400
      memlock: -1

  exporter:
    image: prom/mysqld-exporter
    command:
      - --collect.info_schema.processlist
      - --collect.info_schema.innodb_metrics
      - --collect.info_schema.tablestats
      - --collect.info_schema.tables
      - --collect.info_schema.userstats
      - --collect.engine_innodb_status
    environment:
      - "DATA_SOURCE_NAME=XXXXX:XXXX@(db:3306)/"
    networks:
      default:
        aliases:
          - exporter
      prom:
        aliases:
          - friendicame-mariadb-exporter

My performance.cnf: for the 33GB database:

[mysqld]
performance_schema = ON

innodb_buffer_pool_size = 32G
#innodb_log_buffer_size = 32M
innodb_log_file_size = 8G
#innodb_log_files_in_group = 2
#innodb_io_capacity = 3000
innodb_read_io_threads = 6
innodb_write_io_threads = 6

innodb_file_per_table = ON
innodb_default_row_format = dynamic
#innodb_read_only_compressed = OFF

table_open_cache = 4200
table_definition_cache = 2500

join_buffer_size = 11M
sort_buffer_size = 5M

max_connections = 150
wait_timeout = 900
interactive_timeout = 4000

optimizer_use_condition_selectivity = 1
optimizer_switch = 'rowid_filter=off'

open_files_limit = 102400

log_bin
log_basename = friendicame

My issue: Every time, I restart the db service, it stops with exit code 137 (OOM) and restarts with a crash recovery. I already tried to adjust the log file size, the capacity, ...

I'm constantly using https://github.com/major/MySQLTuner-perl to find any hints about performance issues.

I was currently facing this issue after the last restart, here's the log:

friendicame-db-1  | 2023-02-25 19:10:01+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:10.11.2+maria~ubu2204 started.                                                                                                                                                    
friendicame-db-1  | 2023-02-25 19:10:02+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'                                                                                                                                                                                  
friendicame-db-1  | 2023-02-25 19:10:02+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:10.11.2+maria~ubu2204 started.                                                                                                                                                    

friendicame-db-1  | 2023-02-25 19:10:02+00:00 [Note] [Entrypoint]: MariaDB upgrade not required                                                                                                                                                                                             
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] Starting MariaDB 10.11.2-MariaDB-1:10.11.2+maria~ubu2204-log source revision cafba8761af55ae16cc69c9b53a341340a845b36 as process 1                                                                                                         
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] mariadbd: Aria engine: starting recovery                                                                                                                                                                                                   
friendicame-db-1  | tables to flush: 3 2 1 0                                                                                                                                                                                                                                                
friendicame-db-1  |  (0.0 seconds);                                                                                                                                                                                                                                                         
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] mariadbd: Aria engine: recovery done                                                                                                                                                                                                       
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] InnoDB: Compressed tables use zlib 1.2.11                                                                                                                                                                                                  
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] InnoDB: Using transactional memory                                                                                                                                                                                                         
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] InnoDB: Number of transaction pools: 1                                                                                                                                                                                                     
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions                                                                                                                                                                                               
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] InnoDB: Using liburing                                                                                                                                                                                                                     
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] InnoDB: Initializing buffer pool, total size = 32.000GiB, chunk size = 512.000MiB                                                                                                                                                          
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] InnoDB: Completed initialization of buffer pool                                                                                                                                                                                            
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)                                                                                                                                                                        
friendicame-db-1  | 2023-02-25 19:10:02 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=193182549942                                                                                                                                                                           
friendicame-db-1  | 2023-02-25 19:10:17 0 [Note] InnoDB: Read redo log up to LSN=194715677211                                                                                                                                                                                               
friendicame-db-1  | 2023-02-25 19:10:32 0 [Note] InnoDB: Read redo log up to LSN=196074274122                                                                                                                                                                                               
friendicame-db-1  | 2023-02-25 19:10:47 0 [Note] InnoDB: Read redo log up to LSN=197349099172                                                                                                                                                                                               
friendicame-db-1  | 2023-02-25 19:11:04 0 [Note] InnoDB: Read redo log up to LSN=193182549942                                                                                                                                                                                               
friendicame-db-1  | 2023-02-25 19:11:19 0 [Note] InnoDB: Read redo log up to LSN=194967290477                                                                                                                                                                                               
friendicame-db-1  | 2023-02-25 19:11:34 0 [Note] InnoDB: Read redo log up to LSN=196627844278                                                                                                                                                                                               
friendicame-db-1  | 2023-02-25 19:11:49 0 [Note] InnoDB: Read redo log up to LSN=198355460189                                                                                                                                                                                               
friendicame-db-1  | 2023-02-25 19:11:53 0 [Note] InnoDB: Starting final batch to recover 437904 pages from redo log.                                                                                                                                                                        
friendicame-db-1  | 2023-02-25 19:12:02 0 [Note] InnoDB: Last binlog file './friendicame-bin.000012', position 10653628                                                                                                                                                                     
friendicame-db-1  | 2023-02-25 19:12:02 0 [Note] InnoDB: 128 rollback segments are active.                                                                                                                                                                                                  
friendicame-db-1  | 2023-02-25 19:12:02 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"                                                                                                                                                                                 
friendicame-db-1  | 2023-02-25 19:12:02 0 [Note] InnoDB: Setting file './ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ...                                                                                                                                       
friendicame-db-1  | 2023-02-25 19:12:02 0 [Note] InnoDB: File './ibtmp1' size is now 12.000MiB.                                                                                                                                                                                             
friendicame-db-1  | 2023-02-25 19:12:02 0 [Note] InnoDB: log sequence number 198552641210; transaction id 368715865                                                                                                                                                                         
friendicame-db-1  | 2023-02-25 19:12:02 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool                                                                                                                                                                          
friendicame-db-1  | 2023-02-25 19:12:02 0 [Note] Plugin 'FEEDBACK' is disabled.                                                                                                                                                                                                             
friendicame-db-1  | 2023-02-25 19:12:02 0 [Note] Server socket created on IP: '0.0.0.0'.                                                                                                                                                                                                    
friendicame-db-1  | 2023-02-25 19:12:02 0 [Note] Server socket created on IP: '::'.                                                                                                                                                                                                         
friendicame-db-1  | 2023-02-25 19:12:02 0 [Note] mariadbd: ready for connections.                                                                                                                                                                                                           
friendicame-db-1  | Version: '10.11.2-MariaDB-1:10.11.2+maria~ubu2204-log'  socket: '/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
friendicame-db-1  | 2023-02-25 19:12:02 6 [Warning] Aborted connection 6 to db: 'unconnected' user: 'unauthenticated' host: '192.168.32.7' (This connection closed normally without authentication)
friendicame-db-1  | 2023-02-25 19:12:05 0 [Note] InnoDB: Buffer pool(s) load completed at 230225 19:12:05

Do you have any idea what's wrong with my config ?

grooverdan commented 1 year ago

Can you tell how much memory MariaDB is using when it OOMs? mem_reservation: 4g seems low, but maybe I misunderstand what it does.

Do you know which version your troubles began at?

grooverdan commented 1 year ago

Could also try jemalloc as a memory manager: env LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2

nupplaphil commented 1 year ago

Can you tell how much memory MariaDB is using when it OOMs? mem_reservation: 4g seems low, but maybe I misunderstand what it does.

Do you know which version your troubles began at?

tbh I don't know, I'm always using the latest version of the Docker images.

mem_reservation: Allows you to specify a soft limit smaller than --memory which is activated when Docker detects contention or low memory on the host machine. If you use --memory-reservation, it must be set lower than --memory for it to take precedence. Because it is a soft limit, it does not guarantee that the container doesn’t exceed the limit. https://docs.docker.com/config/containers/resource_constraints/#limit-a-containers-access-to-memory

nupplaphil commented 1 year ago

Could also try jemalloc as a memory manager: env LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2

I now switched, see:

+------------------------+------------------------------------------------------------+
| Variable_name          | Value                                                      |
+------------------------+------------------------------------------------------------+
| version_malloc_library | jemalloc 5.2.1-0-gea6b3e973b477b8061e0076bb257dbd7f3faa756 |
+------------------------+------------------------------------------------------------+

Shall I wait some time for a restart or what do you suggest?

This is some minutes after switching to jemalloc:

friendicame-db-1  | 2023-02-26 13:03:17 0 [Note] mariadbd (initiated by: unknown): Normal shutdown                                                                                                                                                                                          
friendicame-db-1  | 2023-02-26 13:03:17 0 [Note] InnoDB: FTS optimize thread exiting.                                                                                                                                                                                                       
friendicame-db-1  | 2023-02-26 13:03:17 0 [Note] InnoDB: Starting shutdown...                                                                                                                                                                                                               
friendicame-db-1  | 2023-02-26 13:03:17 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool                                                                                                                                                                            
friendicame-db-1  | 2023-02-26 13:03:17 0 [Note] InnoDB: Restricted to 519200 pages due to innodb_buf_pool_dump_pct=25                                                                                                                                                                      
friendicame-db-1  | 2023-02-26 13:03:17 0 [Note] InnoDB: Buffer pool(s) dump completed at 230226 13:03:17                                                                                                                                                                                   
friendicame-db-1 exited with code 0
friendicame-db-1  | 2023-02-26 13:03:28+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:10.11.2+maria~ubu2204 started.
friendicame-db-1  | 2023-02-26 13:03:28+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
friendicame-db-1  | 2023-02-26 13:03:28+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:10.11.2+maria~ubu2204 started.
friendicame-db-1  | 2023-02-26 13:03:29+00:00 [Note] [Entrypoint]: MariaDB upgrade not required
friendicame-db-1  | 2023-02-26 13:03:29 0 [Note] Starting MariaDB 10.11.2-MariaDB-1:10.11.2+maria~ubu2204-log source revision cafba8761af55ae16cc69c9b53a341340a845b36 as process 1
friendicame-db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
friendicame-db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: Using transactional memory
friendicame-db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: Number of transaction pools: 1
friendicame-db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
friendicame-db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: Using liburing
friendicame-db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: Initializing buffer pool, total size = 32.000GiB, chunk size = 512.000MiB
friendicame-db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: Completed initialization of buffer pool
friendicame-db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
friendicame-db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=193796878816
friendicame-db-1  | 2023-02-26 13:03:44 0 [Note] InnoDB: Read redo log up to LSN=195420200063
friendicame-db-1  | 2023-02-26 13:03:59 0 [Note] InnoDB: Read redo log up to LSN=197030392720
friendicame-db-1  | 2023-02-26 13:04:14 0 [Note] InnoDB: Read redo log up to LSN=198389002919
friendicame-db-1  | 2023-02-26 13:04:29 0 [Note] InnoDB: Read redo log up to LSN=194182667856
friendicame-db-1  | 2023-02-26 13:04:44 0 [Note] InnoDB: Read redo log up to LSN=196107827853
friendicame-db-1  | 2023-02-26 13:04:59 0 [Note] InnoDB: Read redo log up to LSN=197986456996
friendicame-db-1  | 2023-02-26 13:05:13 0 [Note] InnoDB: Starting final batch to recover 435667 pages from redo log.
friendicame-db-1  | 2023-02-26 13:05:14 0 [Note] InnoDB: To recover: 333526 pages from log
friendicame-db-1  | 2023-02-26 13:05:21 0 [Note] InnoDB: Last binlog file './friendicame-bin.000017', position 3081397
friendicame-db-1  | 2023-02-26 13:05:21 0 [Note] InnoDB: 128 rollback segments are active.
friendicame-db-1  | 2023-02-26 13:05:21 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
friendicame-db-1  | 2023-02-26 13:05:21 0 [Note] InnoDB: Setting file './ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ...
friendicame-db-1  | 2023-02-26 13:05:21 0 [Note] InnoDB: File './ibtmp1' size is now 12.000MiB.
friendicame-db-1  | 2023-02-26 13:05:21 0 [Note] InnoDB: log sequence number 199330293772; transaction id 369778667
friendicame-db-1  | 2023-02-26 13:05:21 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
friendicame-db-1  | 2023-02-26 13:05:21 0 [Note] Plugin 'FEEDBACK' is disabled.
friendicame-db-1  | 2023-02-26 13:05:21 0 [Note] Server socket created on IP: '0.0.0.0'.
friendicame-db-1  | 2023-02-26 13:05:21 0 [Note] Server socket created on IP: '::'.
friendicame-db-1  | 2023-02-26 13:05:21 0 [Note] mariadbd: ready for connections.
friendicame-db-1  | Version: '10.11.2-MariaDB-1:10.11.2+maria~ubu2204-log'  socket: '/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
friendicame-db-1  | 2023-02-26 13:05:22 6 [Warning] Aborted connection 6 to db: 'unconnected' user: 'unauthenticated' host: '192.168.32.7' (This connection closed normally without authentication)
friendicame-db-1  | 2023-02-26 13:05:24 0 [Note] InnoDB: Buffer pool(s) load completed at 230226 13:05:24

At least it seems like it's now exiting with code 0. This is the internal memory overview for this period:

grooverdan commented 1 year ago

Thanks for the mem_reservation link. I've got a similar works within MariaDB in development (MDEV-24670). Is there docker logs to know when this contention period is detected.

Shall I wait some time for a restart or what do you suggest?

It looks like maybe docker saw the change and did the restart.

Just leave it running. jemalloc is reportedly better at preventing memory fragmentation. After a while of uptime you'll see if its leaking memory in the same way.

At least it seems like it's now exiting with code 0.

This is interesting. Not so much the 'existing with code' but the messages in the log. A normal stop on the container should look like:

2023-02-26 23:26:43 0 [Note] mariadbd (initiated by: unknown): Normal shutdown
2023-02-26 23:26:43 0 [Note] InnoDB: FTS optimize thread exiting.
2023-02-26 23:26:43 0 [Note] InnoDB: Starting shutdown...
2023-02-26 23:26:43 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
2023-02-26 23:26:43 0 [Note] InnoDB: Buffer pool(s) dump completed at 230226 23:26:43
2023-02-26 23:26:43 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2023-02-26 23:26:43 0 [Note] InnoDB: Shutdown completed; log sequence number 46590; transaction id 15
2023-02-26 23:26:43 0 [Note] mariadbd: Shutdown complete

Notably the last InnoDB message there, and the last shutdown are the bits that would ensure the next startup isn't a crash recovery.

So the lack of messages makes it look like hard kill during shutdown.

The resolution of the 137 error code. From bash(2) an error code above 128 is a indication that the process was killed with a signal. The difference between 128, and 137 is 9, which by kill -9 is SIGKILL. While OOM can trigger this, a docker kill {container} also does a SIGKILL (quite soon after SIGTERM by the looks of the shutdown in the last log).

How are you shutting down your containers? Because a stop mechanism will give them enough time to shutdown cleanly and avoid crash recovery.

nupplaphil commented 1 year ago

I never saw this kind of lines since I started analyzing the crash recovery issue

[...]
2023-02-26 23:26:43 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2023-02-26 23:26:43 0 [Note] InnoDB: Shutdown completed; log sequence number 46590; transaction id 15
2023-02-26 23:26:43 0 [Note] mariadbd: Shutdown complete

So it seems like Docker is killing the container ..

Normally, I use one of the following commands to start/stop/restart container:

https://docs.docker.com/engine/reference/commandline/compose_restart/ : Restarts all stopped and running services, or the specified services only.
https://docs.docker.com/engine/reference/commandline/compose_stop/ : Stops running containers without removing them. They can be started again with docker compose start.
https://docs.docker.com/engine/reference/commandline/compose_up/ : Builds, (re)creates, starts, and attaches to containers for a service. Unless they are already running, this command also starts any linked services.

I don't use any additional settings, like --force or --wait, ...

Normally, it works fine for every other container.

nupplaphil commented 1 year ago

Shall I wait some time for a restart or what do you suggest?

It looks like maybe docker saw the change and did the restart.

Just to clarify, I restarted the container manually just to check if the jemalloc maybe fixed it :)

grooverdan commented 1 year ago

The docker compose stop/restart lists a 10 second timeout. After which I assume it does a sigkill. The previous logs seem to match exactly to this 10 second timeout.

Quite simply, MariaDB takes longer than 10 seconds to shutdown cleanly. Increase the timeout.

grooverdan commented 1 year ago

And jemalloc probably wasn't necessary. Please confirm even though this is closed.

nupplaphil commented 1 year ago

Confirmed :-)

docker-compose up -d -t 120 was enough

It took 1.5 minutes to shutdown, now everything is working as expected.

Sorry for this "newbie" error ..

PS: I switched back to system malloc, since I don't know what benefits I would have with jemalloic for my setup

grooverdan commented 1 year ago

Good. 1.5 minutes of shutdown is better than 2 minutes of crash recovery (and looks a lot better as "normal operation").

Not that much of a newbie question, or I would have seen it a few days ago. Certainly worth adding to a FAQ when I get time to write one.

jemalloc is worth trying if you see something like a memory leak. Differencing between that an memory fragmentation is hard. On general benefits, not much idea beyond it exists dedicated as an alternate malloc library.

Thanks for choosing MariaDB. Bug reports/feature requests welcome, even if they come off as a newbie error, someone else will hit it, so we'll try to document/prevent it if possible.

grooverdan commented 5 months ago

And have done FAQ now - https://mariadb.com/kb/en/docker-official-image-frequently-asked-questions/#every-mariadb-start-is-a-crash-recovery

MariaDB / mariadb-docker

MariaDB starts crash recovery after every restart #496