Closed julianbrost closed 2 years ago
I'm able to reproduce this somewhat reliably with the following steps:
XLEN icinga:history:stream:state
to be about 1000, should contain enough hard state changes so that it writes to the SLA table)retention:
sla-days: 400
interval: 1ns
@julianbrost Can you please run SHOW ENGINE INNODB STATUS
to get more information about the deadlock?
MariaDB [icingadb]> SHOW ENGINE INNODB STATUS \G
*************************** 1. row ***************************
Type: InnoDB
Name:
Status:
=====================================
2022-06-13 12:12:26 0x7fb92c4e5700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 6 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 0 srv_active, 0 srv_shutdown, 1754 srv_idle
srv_master_thread log flush and writes: 1754
----------
SEMAPHORES
----------
------------------------
LATEST DETECTED DEADLOCK
------------------------
2022-06-13 12:12:25 0x7fb918152700
*** (1) TRANSACTION:
TRANSACTION 99495, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1128, 2 row lock(s), undo log entries 41
MariaDB thread id 47, OS thread handle 140432949716736, query id 297591 172.24.0.1 icingadb-test Update
INSERT INTO "sla_history_state" ("environment_id", "id", "event_time", "previous_hard_state", "endpoint_id", "object_type", "host_id", "service_id", "hard_state") VALUES (?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,?,?,?,?,?),(?,?,?,?,
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 63 page no 5 n bits 128 index PRIMARY of table "icingadb"."sla_history_state" trx id 99495 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
0: len 20; hex 008806e58c5b9e40660e0db068ac610055207d58; asc [ @f h a U }X;;
1: len 6; hex 000000000000; asc ;;
2: len 7; hex 80000000000000; asc ;;
3: len 20; hex 6cc4013ece4ffdbb275a4a763460507d17203146; asc l > O 'ZJv4`P} 1F;;
4: len 20; hex 2bea57085609682477bde443bff3e4081558cd15; asc + W V h$w C X ;;
5: len 1; hex 02; asc ;;
6: len 20; hex 1b74f41cfbd98af27a2c59e544c54b30814837aa; asc t z,Y D K0 H7 ;;
7: len 20; hex fce7d8b6f0543ae57219c8df5dfbf18d481a49ec; asc T: r ] H I ;;
8: len 8; hex 000001815ce0cf20; asc \ ;;
9: len 1; hex 00; asc ;;
10: len 1; hex 02; asc ;;
*** CONFLICTING WITH:
RECORD LOCKS space id 63 page no 5 n bits 128 index PRIMARY of table "icingadb"."sla_history_state" trx id 99497 lock_mode X
Record lock, heap no 2 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
0: len 20; hex 0027dc5609266f29318a613799f518f99cbe13ce; asc ' V &o)1 a7 ;;
1: len 6; hex 000000000000; asc ;;
2: len 7; hex 80000000000000; asc ;;
3: len 20; hex 6cc4013ece4ffdbb275a4a763460507d17203146; asc l > O 'ZJv4`P} 1F;;
4: len 20; hex 2bea57085609682477bde443bff3e4081558cd15; asc + W V h$w C X ;;
5: len 1; hex 02; asc ;;
6: len 20; hex 1b74f41cfbd98af27a2c59e544c54b30814837aa; asc t z,Y D K0 H7 ;;
7: len 20; hex 9fcfc38de65faf92c2db15f571ff58a747f86d0f; asc _ q X G m ;;
8: len 8; hex 000001815ce0b917; asc \ ;;
9: len 1; hex 00; asc ;;
10: len 1; hex 02; asc ;;
Record lock, heap no 3 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
0: len 20; hex 008806e58c5b9e40660e0db068ac610055207d58; asc [ @f h a U }X;;
1: len 6; hex 000000000000; asc ;;
2: len 7; hex 80000000000000; asc ;;
3: len 20; hex 6cc4013ece4ffdbb275a4a763460507d17203146; asc l > O 'ZJv4`P} 1F;;
4: len 20; hex 2bea57085609682477bde443bff3e4081558cd15; asc + W V h$w C X ;;
5: len 1; hex 02; asc ;;
6: len 20; hex 1b74f41cfbd98af27a2c59e544c54b30814837aa; asc t z,Y D K0 H7 ;;
7: len 20; hex fce7d8b6f0543ae57219c8df5dfbf18d481a49ec; asc T: r ] H I ;;
8: len 8; hex 000001815ce0cf20; asc \ ;;
9: len 1; hex 00; asc ;;
10: len 1; hex 02; asc ;;
Record lock, heap no 4 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
0: len 20; hex 01854f209355f7974779acc437c3f2d8879258d7; asc O U Gy 7 X ;;
1: len 6; hex 000000000000; asc ;;
2: len 7; hex 80000000000000; asc ;;
3: len 20; hex 6cc4013ece4ffdbb275a4a763460507d17203146; asc l > O 'ZJv4`P} 1F;;
4: len 20; hex 2bea57085609682477bde443bff3e4081558cd15; asc + W V h$w C X ;;
5: len 1; hex 02; asc ;;
6: len 20; hex 1b74f41cfbd98af27a2c59e544c54b30814837aa; asc t z,Y D K0 H7 ;;
7: len 20; hex 12ec27868959d05a000f5942bbcc5a66f49f7f6f; asc ' Y Z YB Zf o;;
8: len 8; hex 000001815ce0cab5; asc \ ;;
9: len 1; hex 00; asc ;;
10: len 1; hex 02; asc ;;
*** (2) TRANSACTION:
TRANSACTION 99497, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1128, 4 row lock(s)
MariaDB thread id 45, OS thread handle 140433288701696, query id 297631 172.24.0.1 icingadb-test init for update
DELETE FROM sla_history_state WHERE environment_id = ? AND event_time < ?
ORDER BY event_time LIMIT 5000
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 63 page no 5 n bits 128 index PRIMARY of table "icingadb"."sla_history_state" trx id 99497 lock_mode X waiting
Record lock, heap no 59 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
0: len 20; hex 04b6a7f3357101a31a7bb8fd366c73a421b012fa; asc 5q { 6ls ! ;;
1: len 6; hex 0000000184a7; asc ;;
2: len 7; hex 9e000001530523; asc S #;;
3: len 20; hex 6cc4013ece4ffdbb275a4a763460507d17203146; asc l > O 'ZJv4`P} 1F;;
4: len 20; hex 2bea57085609682477bde443bff3e4081558cd15; asc + W V h$w C X ;;
5: len 1; hex 02; asc ;;
6: len 20; hex 1b74f41cfbd98af27a2c59e544c54b30814837aa; asc t z,Y D K0 H7 ;;
7: len 20; hex 16017521506035ace937bfed360a9186d9490f5c; asc u!P`5 7 6 I \;;
8: len 8; hex 000001815ce1da81; asc \ ;;
9: len 1; hex 02; asc ;;
10: len 1; hex 00; asc ;;
*** CONFLICTING WITH:
RECORD LOCKS space id 63 page no 5 n bits 128 index PRIMARY of table "icingadb"."sla_history_state" trx id 99495 lock_mode X locks rec but not gap
Record lock, heap no 59 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
0: len 20; hex 04b6a7f3357101a31a7bb8fd366c73a421b012fa; asc 5q { 6ls ! ;;
1: len 6; hex 0000000184a7; asc ;;
2: len 7; hex 9e000001530523; asc S #;;
3: len 20; hex 6cc4013ece4ffdbb275a4a763460507d17203146; asc l > O 'ZJv4`P} 1F;;
4: len 20; hex 2bea57085609682477bde443bff3e4081558cd15; asc + W V h$w C X ;;
5: len 1; hex 02; asc ;;
6: len 20; hex 1b74f41cfbd98af27a2c59e544c54b30814837aa; asc t z,Y D K0 H7 ;;
7: len 20; hex 16017521506035ace937bfed360a9186d9490f5c; asc u!P`5 7 6 I \;;
8: len 8; hex 000001815ce1da81; asc \ ;;
9: len 1; hex 02; asc ;;
10: len 1; hex 00; asc ;;
*** WE ROLL BACK TRANSACTION (0)
------------
TRANSACTIONS
------------
Trx id counter 99523
Purge done for trx's n:o < 99521 undo n:o < 255 state: running
History list length 32
LIST OF TRANSACTIONS FOR EACH SESSION:
--------
FILE I/O
--------
Pending flushes (fsync) log: 0; buffer pool: 0
172 OS file reads, 291 OS file writes, 1289 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 3.33 writes/s, 45.66 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
insert 0, delete mark 0, delete 0
discarded operations:
insert 0, delete mark 0, delete 0
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 18737471
Log flushed up to 18737471
Pages flushed up to 42197
Last checkpoint at 42185
0 pending log flushes, 0 pending chkp writes
293 log i/o's done, 3.33 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 167772160
Dictionary memory allocated 994744
Buffer pool size 8112
Free buffers 6615
Database pages 1497
Old database pages 532
Modified db pages 1357
Percent of dirty pages(LRU & free pages): 16.726
Max dirty pages percent: 90.000
Pending reads 0
Pending writes: LRU 0, flush list 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 158, created 1339, written 0
0.00 reads/s, 74.49 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 1497, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 read views open inside InnoDB
Process ID=0, Main thread ID=0, state: sleeping
Number of rows inserted 26572, updated 526, deleted 1, read 8044688
2468.92 inserts/s, 86.32 updates/s, 0.00 deletes/s, 50662.72 reads/s
Number of system rows inserted 0, updated 0, deleted 0, read 0
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
1 row in set (0.000 sec)
While testing #484 (I was using 2532d06e0ae92547f1019a49e3fc47558a9f53c3 to be precise), I was randomly greeted by the following crash. As that PR does nothing with retention at all, I presume the same may happen on the current master as well.
Not sure why that happened or how to reproduce it, but that shouldn't bring down the process for sure.