Closed bengelberth-shadowsoft closed 6 years ago
Confirmed, even without a running ES.
mbmif /usr/local/icinga2/etc/icinga2/tests (master *) # cat check_es_crash
#!/bin/bash
echo "Output | crash=0cs"
exit 0
mbmif /usr/local/icinga2/etc/icinga2/tests (master *) # cat es.conf
object CheckCommand "es-crash" {
  command = [ SysconfDir + "/icinga2/tests/check_es_crash" ]
}
object Host "es-crash" {
  check_command = "es-crash"
  check_interval = 1s
}
[2018-04-03 14:33:09 +0200] warning/ElasticsearchWriter: Ignoring invalid perfdata value: 'crash=0cs' for object 'es-crash'.
Context:
(0) Elasticwriter processing check result for 'es-crash'
Assertion failed: (px != 0), function operator->, file /usr/local/include/boost/smart_ptr/intrusive_ptr.hpp, line 199.
Process 42418 stopped
* thread #12, stop reason = signal SIGABRT
frame #0: 0x00007fff5a6eae3e libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
-> 0x7fff5a6eae3e <+10>: jae 0x7fff5a6eae48 ; <+20>
0x7fff5a6eae40 <+12>: movq %rax, %rdi
0x7fff5a6eae43 <+15>: jmp 0x7fff5a6e20b8 ; cerror_nocancel
0x7fff5a6eae48 <+20>: retq
Target 0: (icinga2) stopped.
(lldb) up
frame #1: 0x00007fff5a829150 libsystem_pthread.dylib`pthread_kill + 333
libsystem_pthread.dylib`pthread_kill:
0x7fff5a829150 <+333>: movl %eax, %r15d
0x7fff5a829153 <+336>: cmpl $-0x1, %r15d
0x7fff5a829157 <+340>: jne 0x7fff5a829161 ; <+350>
0x7fff5a829159 <+342>: callq 0x7fff5a82c19c ; symbol stub for: __error
(lldb)
frame #2: 0x00007fff5a647312 libsystem_c.dylib`abort + 127
libsystem_c.dylib`abort:
0x7fff5a647312 <+127>: movl $0x2710, %edi ; imm = 0x2710
0x7fff5a647317 <+132>: callq 0x7fff5a6198e4 ; usleep$NOCANCEL
0x7fff5a64731c <+137>: callq 0x7fff5a647321 ; __abort
libsystem_c.dylib`__abort:
0x7fff5a647321 <+0>: cmpq $0x0, 0x38f78e1f(%rip) ; gCRAnnotations + 7
(lldb)
frame #3: 0x00007fff5a60f368 libsystem_c.dylib`__assert_rtn + 320
libsystem_c.dylib`basename_r:
0x7fff5a60f368 <+0>: pushq %rbp
0x7fff5a60f369 <+1>: movq %rsp, %rbp
0x7fff5a60f36c <+4>: pushq %r15
0x7fff5a60f36e <+6>: pushq %r14
(lldb)
frame #4: 0x0000000100a28ce9 icinga2`boost::intrusive_ptr<icinga::PerfdataValue>::operator->(this=0x000070000430c800) const at intrusive_ptr.hpp:199
196
197 T * operator->() const BOOST_SP_NOEXCEPT_WITH_ASSERT
198 {
-> 199 BOOST_ASSERT( px != 0 );
200 return px;
201 }
202
(lldb)
frame #5: 0x0000000100e67a63 icinga2`icinga::ElasticsearchWriter::AddCheckResult(this=0x0000000104806c00, fields=0x000070000430d060, checkable=0x000000010559e330, cr=0x000000010559e338) at elasticsearchwriter.cpp:151
148 }
149 }
150
-> 151 String escapedKey = pdv->GetLabel();
152 boost::replace_all(escapedKey, " ", "_");
153 boost::replace_all(escapedKey, ".", "_");
154 boost::replace_all(escapedKey, "\\", "_");
(lldb) p pdv
(Ptr) $0 = {
px = 0x0000000000000000
}
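The trace shows the warning being logged and then a null `pdv` being dereferenced at elasticsearchwriter.cpp:151. Below is a minimal sketch of the guard pattern that would avoid this. It assumes the parse failure is caught, logged, and then falls through to the code that uses the pointer. It is not the actual Icinga 2 code or fix: `PerfValue`, `Parse`, and `CollectLabels` are hypothetical stand-ins, and `std::shared_ptr` is used in place of `boost::intrusive_ptr`.

```cpp
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for icinga::PerfdataValue.
struct PerfValue {
    std::string label;
};

// Hypothetical parser: throws when the input is not of the form "label=...".
std::shared_ptr<PerfValue> Parse(const std::string& raw) {
    const auto eq = raw.find('=');
    if (eq == std::string::npos || eq == 0)
        throw std::invalid_argument("invalid perfdata: " + raw);
    return std::make_shared<PerfValue>(PerfValue{raw.substr(0, eq)});
}

// Collects the labels of all parsable items and skips the rest.
std::vector<std::string> CollectLabels(const std::vector<std::string>& items) {
    std::vector<std::string> labels;
    for (const auto& raw : items) {
        std::shared_ptr<PerfValue> pdv;
        try {
            pdv = Parse(raw);
        } catch (const std::exception&) {
            std::cerr << "Ignoring invalid perfdata value: '" << raw << "'\n";
            // Without this 'continue', pdv stays null and the
            // dereference below would crash.
            continue;
        }
        labels.push_back(pdv->label);
    }
    return labels;
}
```

With this guard, an invalid item only produces the warning, and the remaining items are still processed.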
Performance data that contains an invalid unit of measure causes a segmentation fault in the Elasticsearch feature when enable_send_perfdata is true.
Expected Behavior
I would expect Icinga 2 to log an error to the icinga2 log file and continue to operate without a segmentation fault when perfdata contains incorrect units.
Current Behavior
When a check returns incorrectly formatted performance data, Icinga 2 can crash with a segmentation fault if the Elasticsearch feature is enabled and enable_send_perfdata is true.
Possible Solution
Steps to Reproduce (for bugs)
This can be reproduced with the following configuration. I also saw this with checks that received an error back and put a string in place of a number in the performance data. It does not require a valid Elasticsearch connection; testing was performed before standing up an Elasticsearch ingest process.
3. elasticsearch.conf
Context
Not all third-party check commands produce valid performance data. When they return incorrectly formatted data, Icinga 2 crashes.
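To illustrate why a value such as `crash=0cs` is rejected, here is a rough sketch of a unit check. The list of accepted units is an assumption based on the Monitoring Plugins guidelines (no unit, `us`/`ms`/`s`, `%`, byte units, and `c` for counters); the set Icinga 2 actually accepts may differ between versions. `HasKnownUnit` is a hypothetical helper, not part of Icinga 2.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Returns true if a single perfdata item ("label=value[unit][;warn;crit...]")
// has a numeric value followed by a unit from the assumed list.
bool HasKnownUnit(const std::string& perfdata) {
    // Assumed unit list; the empty string means "no unit".
    static const std::set<std::string> known = {
        "", "us", "ms", "s", "%", "b", "kb", "mb", "gb", "tb", "c"};

    const auto eq = perfdata.find('=');
    if (eq == std::string::npos)
        return false;

    // Keep only the value field, dropping any thresholds after ';'.
    std::string value = perfdata.substr(eq + 1);
    const auto semi = value.find(';');
    if (semi != std::string::npos)
        value.erase(semi);

    // Skip over the numeric prefix (sign, digits, decimal point).
    std::string::size_type i = 0;
    while (i < value.size() &&
           (std::isdigit(static_cast<unsigned char>(value[i])) ||
            value[i] == '.' || value[i] == '-' || value[i] == '+'))
        ++i;
    if (i == 0)
        return false; // no number at all, e.g. a string in place of a value

    // Whatever follows the number is treated as the unit.
    std::string unit = value.substr(i);
    std::transform(unit.begin(), unit.end(), unit.begin(),
                   [](unsigned char ch) { return static_cast<char>(std::tolower(ch)); });
    return known.count(unit) > 0;
}
```

Under these assumptions, `crash=0c` passes while `crash=0cs` and a string in place of a number do not.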
Your Environment
icinga2 --version:
icinga2 feature list:
icinga2 daemon -C:
zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes: This can be reproduced on a single master.