Open Crited opened 1 year ago
Looks like it's a problem in icinga::RedisConnection::FireAndForgetQueries()
:
(gdb) info line *0x0000000000f6146c
Line 59 of "/usr/include/boost/asio/executor_work_guard.hpp"
starts at address 0xf61463 <icinga::RedisConnection::FireAndForgetQueries(std::vector<std::vector<icinga::String, std::allocator<icinga::String> >, std::allocator<std::vector<icinga::String, std::allocator<icinga::String> > > >, icinga::RedisConnection::QueryPriority, icinga::RedisConnection::QueryAffects)+307>
and ends at 0xf61490 <icinga::RedisConnection::FireAndForgetQueries(std::vector<std::vector<icinga::String, std::allocator<icinga::String> >, std::allocator<std::vector<icinga::String, std::allocator<icinga::String> > > >, icinga::RedisConnection::QueryPriority, icinga::RedisConnection::QueryAffects)+352>.
Can you please share what was logged regarding Icinga DB, i.e. something like grep -i icingadb /var/log/icinga2/icinga2.log
?
Sadly, the logs are not really helpful in that regard:
[2023-08-23 11:54:27 +0200] information/IcingaDB: 'icingadb' started.
[2023-08-23 11:54:27 +0200] information/IcingaDB: Trying to connect to Redis server (async) on host '10.0.64.81:6380'
[2023-08-23 11:54:27 +0200] information/IcingaDB: Connected to Redis server
[2023-08-23 11:54:27 +0200] information/IcingaDB: Starting initial config/status dump
[2023-08-23 11:54:30 +0200] information/IcingaDB: Initial config/status dump finished in 3.07589 seconds.
[2023-08-23 11:54:37 +0200] information/IcingaDB: Pending queries: 0 (Input: 5109/s; Output: 5210/s)
[2023-08-23 11:58:59 +0200] information/IcingaDB: Flushing history data buffer to Redis.
[2023-08-23 11:58:59 +0200] information/IcingaDB: 'icingadb' stopped.
[2023-08-23 11:59:05 +0200] information/IcingaDB: 'icingadb' started.
[2023-08-23 11:59:05 +0200] information/IcingaDB: Trying to connect to Redis server (async) on host '10.0.64.81:6380'
[2023-08-23 11:59:05 +0200] information/IcingaDB: Connected to Redis server
[2023-08-23 11:59:05 +0200] information/IcingaDB: Starting initial config/status dump
[2023-08-23 11:59:08 +0200] information/IcingaDB: Initial config/status dump finished in 2.62474 seconds.
[2023-08-23 11:59:15 +0200] information/IcingaDB: Pending queries: 0 (Input: 1290/s; Output: 1298/s)
[2023-08-23 12:01:02 +0200] information/IcingaDB: Flushing history data buffer to Redis.
[2023-08-23 12:01:02 +0200] information/IcingaDB: 'icingadb' stopped.
[2023-08-23 12:01:08 +0200] information/IcingaDB: 'icingadb' started.
[2023-08-23 12:01:08 +0200] information/IcingaDB: Trying to connect to Redis server (async) on host '10.0.64.81:6380'
[2023-08-23 12:01:08 +0200] information/IcingaDB: Connected to Redis server
[2023-08-23 12:01:09 +0200] information/IcingaDB: Starting initial config/status dump
[2023-08-23 12:01:11 +0200] information/IcingaDB: Initial config/status dump finished in 2.60669 seconds.
[2023-08-23 12:01:18 +0200] information/IcingaDB: Pending queries: 0 (Input: 1283/s; Output: 1283/s)
[2023-08-23 12:06:18 +0200] information/IcingaDB: Pending queries: 0 (Input: 641/s; Output: 641/s)
[2023-08-23 12:11:18 +0200] information/IcingaDB: Pending queries: 0 (Input: 629/s; Output: 629/s)
Looks like it's a problem in
icinga::RedisConnection::FireAndForgetQueries()
:(gdb) info line *0x0000000000f6146c Line 59 of "/usr/include/boost/asio/executor_work_guard.hpp" starts at address 0xf61463 <icinga::RedisConnection::FireAndForgetQueries(std::vector<std::vector<icinga::String, std::allocator<icinga::String> >, std::allocator<std::vector<icinga::String, std::allocator<icinga::String> > > >, icinga::RedisConnection::QueryPriority, icinga::RedisConnection::QueryAffects)+307> and ends at 0xf61490 <icinga::RedisConnection::FireAndForgetQueries(std::vector<std::vector<icinga::String, std::allocator<icinga::String> >, std::allocator<std::vector<icinga::String, std::allocator<icinga::String> > > >, icinga::RedisConnection::QueryPriority, icinga::RedisConnection::QueryAffects)+352>.
Are you sure?
$ git diff v2.13.8..v2.14.0 -- lib/icingadb/redisconnection.cpp
$
Neither git diff v2.13.8..v2.14.0 -- lib/icingadb
shows me anything suspicious.
Anyway I'm afraid we can't do much without a core dump: https://icinga.com/docs/icinga-2/latest/doc/21-development/#core-dump
Are you sure?
I mean that's what GDB says for icinga2-bin-debuginfo-2.14.0 from our RHEL 8 repo.
Double-checking:
# objdump -dC --start-address 0xf61442 --stop-address 0xf61481 /usr/lib64/icinga2/sbin/icinga2
/usr/lib64/icinga2/sbin/icinga2: file format elf64-x86-64
Disassembly of section .text:
0000000000f61442 <icinga::RedisConnection::FireAndForgetQueries(std::vector<std::vector<icinga::String, std::allocator<icinga::String> >, std::allocator<std::vector<icinga::String, std::allocator<icinga::String> > > >, icinga::RedisConnection::QueryPriority, icinga::RedisConnection::QueryAffects)@@Base+0x112>:
f61442: 00 48 89 add %cl,-0x77(%rax)
f61445: 43 10 48 c7 rex.XB adc %cl,-0x39(%r8)
f61449: 43 18 00 rex.XB sbb %al,(%r8)
f6144c: 00 00 add %al,(%rax)
f6144e: 00 0f add %cl,(%rdi)
f61450: ae scas %es:(%rdi),%al
f61451: f0 f0 48 83 45 00 01 lock lock addq $0x1,0x0(%rbp)
f61458: e8 83 37 a6 ff callq 9c4be0 <icinga::Utility::GetTime()@@Base>
f6145d: f0 48 83 45 00 01 lock addq $0x1,0x0(%rbp)
f61463: 0f b6 44 24 0c movzbl 0xc(%rsp),%eax
f61468: 48 8b 0c 24 mov (%rsp),%rcx
f6146c: 4c 8b a1 b0 01 00 00 mov 0x1b0(%rcx),%r12
f61473: 4c 8d a9 b8 01 00 00 lea 0x1b8(%rcx),%r13
f6147a: 88 44 24 28 mov %al,0x28(%rsp)
f6147e: 48 rex.W
f6147f: 8b .byte 0x8b
f61480: 84 .byte 0x84
Yup, matches the binary code excerpt logged by the kernel:
Aug 23 11:07:00 **** kernel: Code: 00 48 89 43 10 48 c7 43 18 00 00 00 00 0f ae f0 f0 48 83 45 00 01 e8 83 37 a6 ff f0 48 83 45 00 01 0f b6 44 24 0c 48 8b 0c 24 <4c> 8b a1 b0 01 00 00 4c 8d a9 b8 01 00 00 88 44 24 28 48 8b 84 24
So pretty sure, I'd say.
Describe the bug
Icinga 2 crashes immediatly after start/restart.
After upgrading Icinga 2 from 2.13.7 to 2.14.0 the configuration master ran into a segfault.
To Reproduce
Unknown. First occurence of the error after deploying new configuration -> a reload of the service.
Expected behavior
Icinga 2 should not crash.
Your Environment
Include as many relevant details about the environment you experienced the problem in
icinga2 --version
): 2.14.0icinga2 feature list
):Additional context
Worth mentioning:
As well as the segfault itself.
We've attempted to produce a core dump, but were not able to. Due to the fact that the system is in production the environment was downgraded to 2.13.8.