Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

icinga 2.11 exits without errors on FreeBSD 11.3 #7539

Open nielsk opened 5 years ago

nielsk commented 5 years ago

Describe the bug

After upgrading from Icinga 2.10.5 to 2.11 on FreeBSD 11.3-p3, icinga2 daemon -C reports that the configuration is valid, but the daemon starts and immediately exits when the api feature is enabled. It works with the api feature disabled.

After re-running the api setup I got it working, but it crashed when I tried to send a notification.

output from running truss icinga2 daemon -x debug before 'api setup'

[2019-09-25 08:50:29 +0200] information/DbConnection: 'ido-mysql' started.                                                                                                                                                                    
[2019-09-25 08:50:29 +0200] information/ExternalCommandListener: 'command' started.                                                                                                                                                           
[2019-09-25 08:50:29 +0200] warning/ExternalCommandListener: This feature is DEPRECATED and will be removed in future releases. Check the roadmap at https://github.com/Icinga/icinga2/milestones                                             
Context:                                                                                                                                                                                                                                      
        (0) Activating object 'command' of type 'ExternalCommandListener'                                                                                                                                                                     

[2019-09-25 08:50:29 +0200] information/NotificationComponent: 'notification' started.                                                                                                                                                        
[2019-09-25 08:50:29 +0200] information/CheckerComponent: 'checker' started.                                                                                                                                                                  
[2019-09-25 08:50:29 +0200] information/ConfigItem: Activated all objects.
[2019-09-25 08:50:29 +0200] notice/WorkQueue: Stopped WorkQueue threads for 'DaemonCommand::Run'
nanosleep({ 0.200000000 })                       = 0 (0x0)
wait4(60659,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })                       ERR#4 'Interrupted system call'
SIGNAL 20 (SIGCHLD) code=CLD_KILLED pid=60659 uid=183 status=11
sigprocmask(SIG_SETMASK,{ SIGCHLD },0x0)         = 0 (0x0)
sigreturn(0x7fffffffca80)                        ERR#4 'Interrupted system call'
wait4(60659,{ SIGNALED,sig=SIGSEGV },WNOHANG,0x0) = 60659 (0xecf3)
[2019-09-25 08:50:29 +0200] notice/cli: Seemless worker (PID 60659) stopped, stopping as well
write(1,"[2019-09-25 08:50:29 +0200] \^[["...,103) = 103 (0x67)
unlink("/var/run/icinga2/icinga2.pid")           = 0 (0x0)
close(11)                                        = 0 (0x0)
_umtx_op(0x8010b2038,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2098,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b20e0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b20f8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2110,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2140,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2068,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2128,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8054182b8,UMTX_OP_NWAKE_PRIVATE,0x18,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2230,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2170,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2158,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2200,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2188,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b20c8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21a0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21e8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2008,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21d0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2080,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2218,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2050,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b20b0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2020,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21b8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x80552b5e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
…
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c0000,4096)                         = 0 (0x0)                                                             
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102033 exited>                                                                                                 
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010be000,4096)                         = 0 (0x0)                                                             
<thread 102025 exited>                                                                                                 
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010bf000,4096)                         = 0 (0x0)                                                             
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102032 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c3000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102096 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c5000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102213 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c9000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102241 exited>
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c8000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102240 exited>
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010ca000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x805483e00,UMTX_OP_WAIT,0x18f63,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102243 exited>
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b9000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 101153 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b4000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 100701 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010bd000,4096)                         = 0 (0x0)
munmap(0x8010bc000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102024 exited>
munmap(0x8010bb000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 101278 exited>
<thread 101274 exited>
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b8000,4096)                         = 0 (0x0)
<thread 101144 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c7000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c6000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 102239 exited>
_umtx_op(0x805484d00,UMTX_OP_WAIT,0x18f5f,0x0,0x0) = 0 (0x0)
munmap(0x8010ba000,4096)                         = 0 (0x0)
<thread 102236 exited>
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
<thread 101163 exited>
munmap(0x8010b7000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b3000,4096)                         = 0 (0x0)
<thread 100918 exited>
<thread 100489 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c1000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b6000,4096)                         = 0 (0x0)
<thread 102035 exited>
<thread 100805 exited>
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010c2000,4096)                         = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_UNLOCK,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x804020800,UMTX_OP_RW_WRLOCK,0x0,0x0,0x0) = 0 (0x0)
munmap(0x8010b5000,4096)                         = 0 (0x0)
<thread 102036 exited>
<thread 100799 exited>
munmap(0x8010c4000,4096)                         = 0 (0x0)
<thread 102158 exited>
_umtx_op(0x805485c00,UMTX_OP_WAIT,0x18f0e,0x0,0x0) = 0 (0x0)
exit(0x8b)
process exit, rval = 139
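
The exit value at the end of the trace can be read together with the SIGCHLD line near the top. The following is a minimal sketch, not part of Icinga or truss: the helper names `parse_sigchld` and `decode_exit` are hypothetical, and it assumes the common convention that an exit value above 128 encodes 128 plus a signal number. Under that assumption, rval = 139 is consistent with the worker having been killed by signal 11.

```python
import re
import signal

# SIGCHLD line copied verbatim from the truss output above.
TRUSS_LINE = "SIGNAL 20 (SIGCHLD) code=CLD_KILLED pid=60659 uid=183 status=11"


def parse_sigchld(line):
    """Extract code, pid and status from a truss SIGCHLD line (hypothetical helper)."""
    m = re.search(
        r"SIGNAL \d+ \(SIGCHLD\) code=(\w+) pid=(\d+) uid=\d+ status=(\d+)", line
    )
    if m is None:
        return None
    return {"code": m.group(1), "pid": int(m.group(2)), "status": int(m.group(3))}


def decode_exit(rval):
    """Assumption: values above 128 mean 'terminated by signal (rval - 128)'."""
    if rval > 128:
        return True, rval - 128
    return False, rval


info = parse_sigchld(TRUSS_LINE)
killed, signo = decode_exit(0x8B)  # 0x8b is the argument of exit() in the trace

# For CLD_KILLED, the status field carries the signal number of the child.
print(info["pid"], info["code"], info["status"])
print(killed, signo, signal.Signals(signo).name)
```

Both readings point at the same signal: the child's status field and the parent's exit value minus 128.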

crash

  Application version: r2.11.0-1

System information:
  Platform: Unknown
  Platform version: Unknown
  Kernel: FreeBSD
  Kernel version: 11.3-RELEASE-p3
  Architecture: amd64

Build information:
  Compiler: Clang 8.0.0
  Build host: ic-11_3-RELEASE-HEAD-job-03

Application information:

General paths:
  Config directory: /usr/local/etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /var/run/icinga2

Old paths (deprecated):
  Installation root: /usr/local
  Sysconf directory: /usr/local/etc
  Run directory (base): /var/run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/local/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /var/run/icinga2/icinga2.pid

Error: Function call 'send' failed with error code 32, 'Broken pipe'

***
* This would indicate a runtime problem or configuration error. If you believe this is a bug in Icinga 2
* please submit a bug report at https://github.com/Icinga/icinga2 and include this stack trace as well as any other
* information that might be useful in order to reproduce this problem.
***
quit: No such file or directory.
ptrace: Operation not permitted.
//65707: No such file or directory.

Your Environment

Include as many relevant details about the environment you experienced the problem in

Version 2.7.1
Git Commit f98f988aff19fd797531e4a0555e872ae3155142
PHP Version 7.2.22
cube 1.0.1
doc 2.7.1
iframe 0.0.0
ipl v0.1.1
monitoring 2.7.1
reactbundle v0.4.1
setup 2.7.1
unicorn 1.0.2
x509 1.0.0

additional context

I opened a thread on the community Discourse where I may have written more: https://community.icinga.com/t/problems-with-upgrading-icinga-2-10-5-to-2-11-on-freebsd/2325

dnsmichi commented 5 years ago

@bsdlme can you confirm that behaviour please? I'm not sure how FreeBSD handles the umbrella process and reloads here. Or maybe it is a problem with boost asio & context on BSD specifically.

bsdlme commented 5 years ago

@nielsk: does dmesg(1) show a SIGBUS error for the icinga2 process?

mat813 commented 5 years ago

I have the same problem; I described it in more detail in FreeBSD #240812. After running dmesg, I discovered that icinga2 was dying with a SIGBUS.

dnsmichi commented 5 years ago

Is there a difference if you omit -d during that run?

bsdlme commented 5 years ago

@nielsk is running FreeBSD 11.3 / amd64 and @mat813 is running 11.2 / i386, both with the API feature enabled.

I was successfully running 2.11.0 on FreeBSD 12.0 / amd64 with API feature.

So the problematic case seems to be API on 11.x

nielsk commented 5 years ago

Nope. SIGSEGV in my case (at least I see a lot of signal 11 entries in my dmesg, which should be from my attempts to get it working).

On 25. Sep 2019, at 14:20, Lars E notifications@github.com wrote:

 @nielsk: does dmesg(1) show a SIGBUS error for the icinga2 process?


dnsmichi commented 5 years ago

@bsdlme which boost versions are provided with 11.2 & 3?

nielsk commented 5 years ago

1.71 is in ports

dnsmichi commented 5 years ago

I'm not a FreeBSD user, what else differs between 11.3 and 12 in terms of compiler versions, cmake, build flags, openssl versions, etc. in specific regard to Icinga dependencies?

bsdlme commented 5 years ago

FreeBSD 11.2 (@mat813): clang 6.0.0, OpenSSL 1.0.2o
FreeBSD 11.3 (@nielsk): clang 8.0.0, OpenSSL 1.0.2s
FreeBSD 12.0 (@bsdlme): clang 6.0.1, OpenSSL 1.1.1a

11.3 was released after 12.0, which is why it has a newer clang version.

Cmake is not in base but installed from ports. Ports have the same version for all FreeBSD versions. The latest cmake in ports is cmake-3.15.3, probably used by all of us.

CFLAGS are:

-DBOOST_COROUTINES_NO_DEPRECATION_WARNING -DBOOST_FILESYSTEM_NO_DEPRECATED -Ithird-party/nlohmann_json -Ithird-party/utf8cpp/source -I. -Ilib -O2 -pipe  -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing -Qunused-arguments -fcolor-diagnostics -pthread -Winvalid-pch -O2 -pipe  -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing -MD -MT

dnsmichi commented 5 years ago

Thanks. As far as I can see, 11.x is still supported. https://www.freebsd.org/security/#sup

@bsdlme How difficult is it for you to spin up 11.3 and test this?

bsdlme commented 5 years ago

I can create a VM at Azure with 11.3. If you like I can give you the login credentials, so you can play around yourself.

dnsmichi commented 5 years ago

I would, but unfortunately I have no time atm. I'm mainly interested in whether you can reproduce this yourself and do the backtrace dance. I don't remember whether FreeBSD has gdb or lldb though.

bsdlme commented 5 years ago

Now I set up an 11.3 amd64 VM and installed 2.11 using packages. I needed to change permissions on /usr/local/etc/icinga2 so that the icinga group has write permissions to it (this has changed in 2.11). After that I set a ticket salt, ran "icinga2 api setup" and "icinga2 feature enable api", and could start Icinga using the rc script. It does not crash for me, and I am able to use curl to connect to the API port.

nielsk commented 5 years ago

So, what can I do to debug this further? I had made the directory writable as well, because otherwise it wouldn't start in the first place. I just tried the upgrade again, chowned everything in /usr/local/etc/icinga2 to icinga, and I still get a signal 11. According to truss, it happens right after the Icinga satellites and an agent start connecting to Icinga 2.11:

[2019-09-30 21:30:01 +0200] information/ApiListener: Started new listener on '[::]:5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat1.fqdn' via host 'sat1.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat2.fqdn' via host 'sat2.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat3.fqdn' via host 'sat3.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat4.fqdn' via host 'sat4.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat5.fqdn' via host 'sat5.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'sat6.fqdn' via host 'sat6.fqdn' and port '5665'
[2019-09-30 21:30:01 +0200] information/ApiListener: Reconnecting to endpoint 'icinga-agent1' via host 'icinga-agent1' and port '5665'
nanosleep({ 0.200000000 })                       ERR#4 'Interrupted system call'
SIGNAL 20 (SIGCHLD) code=CLD_KILLED pid=38116 uid=183 status=11
sigprocmask(SIG_SETMASK,{ SIGCHLD },0x0)         = 0 (0x0)
sigreturn(0x7fffffffcac0)                        ERR#4 'Interrupted system call'
wait4(38116,{ SIGNALED,sig=SIGSEGV },WNOHANG,0x0) = 38116 (0x94e4)
unlink("/var/run/icinga2/icinga2.pid")           = 0 (0x0)
close(11)                                        = 0 (0x0)
_umtx_op(0x8010b2020,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2098,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2170,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2140,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b20f8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21d0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2158,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8055c09e0,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2128,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2110,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b21e8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
_umtx_op(0x8054182b8,UMTX_OP_NWAKE_PRIVATE,0x18,0x0,0x0) = 0 (0x0)
_umtx_op(0x8010b2080,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)

nielsk commented 5 years ago

I have now tried switching icinga2 back to the FreeBSD pkg repo instead of our own, and it still crashes with the same output as above.

bsdlme commented 5 years ago

You could try starting with the sample config and adding more and more of your own config to see when it starts to crash.
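
A minimal sketch of that idea, with assumptions stated up front: the file names are hypothetical, and `crashes` is a stand-in callback that would, in practice, start icinga2 with the given files added to the sample config and report whether the worker dies. Instead of adding files one at a time, this variant uses a binary search over prefixes of the list, which only works if the crash persists once the triggering file is included.

```python
def first_crashing_file(config_files, crashes):
    """Return the index of the first file whose addition triggers the crash.

    Assumes the sample config alone (empty prefix) does not crash, and that
    a crashing prefix keeps crashing when more files are appended.
    Returns None if even the full list does not crash.
    """
    if not crashes(config_files):
        return None
    # Invariant: the prefix of length `lo` is fine, the prefix of length `hi` crashes.
    lo, hi = 0, len(config_files)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(config_files[:mid]):
            hi = mid
        else:
            lo = mid
    return hi - 1


# Hypothetical file list and a fake predicate standing in for a real test run.
files = ["hosts.conf", "services.conf", "zones.conf", "notifications.conf"]


def fake_crashes(subset):
    return "zones.conf" in subset


culprit = first_crashing_file(files, fake_crashes)
print(files[culprit])
```

With N files this needs roughly log2(N) daemon starts rather than N, at the cost of the monotonicity assumption above.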

nielsk commented 5 years ago

It would be easier to set up a new server with an officially supported Linux distribution and migrate the config than to do this. I have to speak with my team about it.

bsdlme commented 5 years ago

Or upgrade to 12.0-RELEASE.

nielsk commented 5 years ago

How? I am using 11.3. You cannot upgrade from 11.3 to 12.0 because of the new zfs-features in 11.3.

On 1. Oct 2019, at 16:32, Lars E notifications@github.com wrote:

 Or upgrade to 12.0-RELEASE.


dnsmichi commented 5 years ago

I don't have much time atm. One thought is how the Boost libraries are compiled on your system. There could be specific hardening compiler flags that cause trouble here, or specific stack guard patches that conflict with the way Boost Coroutine and Context work. See my analysis of the Nessus scan crashes in #7431.

Since it always crashes on TLS connection start, this would be the place where I'd start debugging. Maybe also the OpenSSL version/linkage on FreeBSD causes trouble here.

bsdlme commented 5 years ago

How? I am using 11.3. You cannot upgrade from 11.3 to 12.0 because of the new zfs-features in 11.3.

Oh, I see. Then you could upgrade to 12.1-BETA2, or wait for 12.1-RC1, which will be released on Oct 11. Or install 12.0-RELEASE and migrate the data. Unfortunately I don't know C++ at all, so I can't debug this any further...

davehayes commented 5 years ago

Chiming in with the same problem on FreeBSD 11.3. I just upgraded from 2.10.5 to 2.11.0, and now I'm also getting this SIGSEGV (11). I get the same thing from truss:

[2019-10-13 16:47:09 -0700] notice/JsonRpcConnection: Received 'log::SetLogPosition' message from identity 'teraraid.dream-tech.com'.
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           = 0 (0x0)
wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)
nanosleep({ 0.200000000 })           ERR#4 'Interrupted system call'
SIGNAL 20 (SIGCHLD) code=CLD_DUMPED pid=62587 uid=183 status=11
sigprocmask(SIG_SETMASK,{ SIGCHLD },0x0)     = 0 (0x0)
sigreturn(0x7fffffffc6c0)            ERR#4 'Interrupted system call'
wait4(62587,{ SIGNALED,sig=SIGSEGV,cored },WNOHANG,0x0) = 62587 (0xf47b)
[2019-10-13 16:47:12 -0700] notice/cli: Seemless worker (PID 62587) stopped, stopping as well
write(1,"[2019-10-13 16:47:12 -0700] \^[["...,103) = 103 (0x67)
unlink("/var/run/icinga2/icinga2.pid")       = 0 (0x0)
close(11)                    = 0 (0x0)

I'm using clang 8.0.0 but LibreSSL 2.9.2. This might at least point away from SSL being the culprit. I can confirm that this crash is only related to being a master, since one of my satellites is running the exact same build but hasn't crashed yet.

@bsdlme - did you have any satellites connected in your test? I suspect that might be necessary to see the crash.

dnsmichi commented 5 years ago

LibreSSL is something we don't support as the syscalls/APIs may behave differently. We only test OpenSSL. Is this a thing on FreeBSD to set via the ports package?

I'm not sure how to interpret truss, but given that CLD_DUMPED leads to the real error here, is there a possibility to follow child forks? https://vegdave.wordpress.com/2006/10/23/an-example-on-running-truss/ says so.

It may also help to attach gdb/lldb and follow the fork.
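
For anyone else unsure how to read the truss excerpt above, here is a minimal Python sketch that pulls the relevant lines out of a saved trace. It assumes the trace looks like the output pasted in this thread; the helper name `find_child_crashes` and the regular expression are illustrative and not part of truss or icinga2. It reports each `wait4` call that actually reaped a child killed by a signal, and whether truss marked it as `cored`.

```python
import re

# Matches truss lines such as:
#   wait4(62587,{ SIGNALED,sig=SIGSEGV,cored },WNOHANG,0x0) = 62587 (0xf47b)
WAIT4_RE = re.compile(
    r"wait4\((?P<pid>\d+),\{ SIGNALED,sig=(?P<sig>SIG[A-Z0-9]+)(?P<cored>,cored)? \}"
    r".*\)\s*=\s*(?P<ret>\d+)"
)


def find_child_crashes(truss_lines):
    """Return (pid, signal, core_dumped) for each reaped, signalled child."""
    crashes = []
    for line in truss_lines:
        m = WAIT4_RE.search(line)
        # With WNOHANG, a return value of 0 means no child has exited yet,
        # so only non-zero returns describe a real exit status.
        if m and int(m.group("ret")) != 0:
            crashes.append(
                (int(m.group("pid")), m.group("sig"), m.group("cored") is not None)
            )
    return crashes


# Lines copied from the traces posted in this issue.
sample = [
    "wait4(62587,{ SIGNALED,sig=64 },WNOHANG,0x0)     = 0 (0x0)",
    "SIGNAL 20 (SIGCHLD) code=CLD_DUMPED pid=62587 uid=183 status=11",
    "wait4(62587,{ SIGNALED,sig=SIGSEGV,cored },WNOHANG,0x0) = 62587 (0xf47b)",
    "wait4(7238,{ SIGNALED,sig=SIGSEGV },WNOHANG,0x0) = 7238 (0x1c46)",
]
print(find_child_crashes(sample))
```

The repeated `wait4(...) = 0` lines are just the parent polling the worker every 200 ms; the interesting line is the last `wait4`, which shows the worker process dying with SIGSEGV.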

bsdlme commented 5 years ago

You can trace child processes with "truss -f".

davehayes commented 5 years ago

LibreSSL is usually a drop-in replacement for OpenSSL. We can set a knob when building packages to use it instead of OpenSSL; I can provide more gory details on request. Note that I do not use the normal ports methodology of make; make install, as I build far too many packages for too many people. Instead I use poudriere. It is pretty much the same idea with respect to the knob mentioned above.

LibreSSL works 98% of the time; I've built 100s of packages with LibreSSL that work just fine with it including perl, php, nginx, and icinga2. Specific to icinga2, I have it running just fine with LibreSSL at two different sites for the past two years. That being said, there are a few edge case packages that do not build correctly with LibreSSL and these issues are (to my knowledge) handled by the ports system.

I don't think the issue is the LibreSSL API, because other users are using the stock OpenSSL API and seeing the same crash.

So I just backed out to 2.10.5 because I needed it working. I can make some time to try 2.11 again if you are patient with me. :)

bsdlme commented 5 years ago

I can make some time to try 2.11 again if you are patient with me. :)

Yes please. I'm not able to fix it, @dnsmichi is ENOTIME and we should really try to find the cause.

Thanks in advance!

mat813 commented 4 years ago

So, I updated a 11.2 / i386 box to 12.0, and icinga crashes in the same way :(

dnsmichi commented 4 years ago

Is there a way to generate a core dump, or to see a full crash stack trace? The exception with send would indicate it happens between the communication of the main process & process spawn helper. Maybe the last process is gone for some reason.
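
To illustrate why a vanished peer process would produce exactly this error, here is a small Python sketch. It is not icinga2 code, and the assumption that the main process and the spawn helper talk over a Unix socket pair is mine; the function name `send_to_closed_peer` is made up for the example. Sending on a stream socket whose other end has been closed fails with EPIPE, which is error code 32 ('Broken pipe') on FreeBSD and Linux.

```python
import errno
import socket


def send_to_closed_peer():
    """Send on a Unix socket pair after the peer end has been closed."""
    parent_end, helper_end = socket.socketpair()
    helper_end.close()  # stands in for a helper process that has exited
    try:
        parent_end.send(b"spawn request")
    except OSError as exc:
        return exc.errno
    finally:
        parent_end.close()
    return None


err = send_to_closed_peer()
print(err == errno.EPIPE, errno.errorcode.get(err))
```

So a 'Broken pipe' from `send` would be consistent with the helper side having already gone away, and the earlier crash of that process is what needs the stack trace.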

davehayes commented 4 years ago

Ok, so I've set up my live (but personal) monitoring system so I can switch back and forth between the crashing binary and the non-crashing one. I now have a core file, a binary file (I do not know if symbols are in it) and a truss -aefH which on FreeBSD means it shows argument strings, environment strings (in hindsight, I should not have done this one lol), and includes the thread ID.

Here's the shared library rundown:

# ldd icinga2.bin 
icinga2.bin:
    libexecinfo.so.1 => /usr/local/lib/libexecinfo.so.1 (0x801256000)
    libboost_context.so.1.71.0 => /usr/local/lib/libboost_context.so.1.71.0 (0x801465000)
    libboost_coroutine.so.1.71.0 => /usr/local/lib/libboost_coroutine.so.1.71.0 (0x801667000)
    libboost_date_time.so.1.71.0 => /usr/local/lib/libboost_date_time.so.1.71.0 (0x80186e000)
    libboost_filesystem.so.1.71.0 => /usr/local/lib/libboost_filesystem.so.1.71.0 (0x801a78000)
    libboost_thread.so.1.71.0 => /usr/local/lib/libboost_thread.so.1.71.0 (0x801c91000)
    libboost_system.so.1.71.0 => /usr/local/lib/libboost_system.so.1.71.0 (0x801ea9000)
    libboost_program_options.so.1.71.0 => /usr/local/lib/libboost_program_options.so.1.71.0 (0x8020aa000)
    libboost_regex.so.1.71.0 => /usr/local/lib/libboost_regex.so.1.71.0 (0x802308000)
    libboost_chrono.so.1.71.0 => /usr/local/lib/libboost_chrono.so.1.71.0 (0x8025b9000)
    libboost_atomic.so.1.71.0 => /usr/local/lib/libboost_atomic.so.1.71.0 (0x8027c1000)
    libssl.so.47 => /usr/local/lib/libssl.so.47 (0x8029c3000)
    libcrypto.so.45 => /usr/local/lib/libcrypto.so.45 (0x802c1f000)
    libedit.so.0 => /usr/local/lib/libedit.so.0 (0x80300f000)
    libncurses.so.8 => /lib/libncurses.so.8 (0x803246000)
    libc++.so.1 => /usr/lib/libc++.so.1 (0x80349b000)
    libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x80376a000)
    libm.so.5 => /lib/libm.so.5 (0x803989000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x803bb9000)
    libthr.so.3 => /lib/libthr.so.3 (0x803dcc000)
    libc.so.7 => /lib/libc.so.7 (0x803ff4000)
    libicudata.so.64 => /usr/local/lib/libicudata.so.64 (0x8043af000)
    libicui18n.so.64 => /usr/local/lib/libicui18n.so.64 (0x804600000)
    libicuuc.so.64 => /usr/local/lib/libicuuc.so.64 (0x804b20000)
    librt.so.1 => /usr/lib/librt.so.1 (0x804f0f000)
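
Since the TLS library and the Boost Context/Coroutine builds were both raised as suspects, a quick way to compare machines is to pull just those entries out of the `ldd` output. This is a rough Python sketch that assumes the output format shown above; `linked_libs` and its default prefix list are illustrative choices, not an existing tool.

```python
import re

# Matches ldd lines such as:
#   libssl.so.47 => /usr/local/lib/libssl.so.47 (0x8029c3000)
LDD_RE = re.compile(r"^\s*(?P<name>\S+) => (?P<path>\S+) \(0x[0-9a-f]+\)\s*$")


def linked_libs(ldd_output, prefixes=("libssl", "libcrypto",
                                      "libboost_context", "libboost_coroutine")):
    """Map library name to resolved path for libraries with the given prefixes."""
    found = {}
    for line in ldd_output.splitlines():
        m = LDD_RE.match(line)
        if m and m.group("name").startswith(prefixes):
            found[m.group("name")] = m.group("path")
    return found


# A few lines copied from the listing above.
sample = """icinga2.bin:
    libboost_context.so.1.71.0 => /usr/local/lib/libboost_context.so.1.71.0 (0x801465000)
    libssl.so.47 => /usr/local/lib/libssl.so.47 (0x8029c3000)
    libcrypto.so.45 => /usr/local/lib/libcrypto.so.45 (0x802c1f000)
    libc.so.7 => /lib/libc.so.7 (0x803ff4000)
"""
for name, path in linked_libs(sample).items():
    print(name, "->", path)
```

Running the same filter on a crashing master and on a working satellite would show whether they resolve to different library versions.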

Interestingly enough, THIS time I ran it, an error got produced from icinga2:

[2019-10-28 13:24:27 -0700] information/cli: Icinga application loader (version: r2.11.0-1)
[2019-10-28 13:24:27 -0700] information/cli: Closing console log.
critical/Application: Error: Function call 'send' failed with error code 32, 'Broken pipe'

Additional information is available in '/var/log/icinga2/crash/report.1572294280.564512'

That crash report appears to be produced by a Linux-oriented script. I've put it and the binary on my webserver. Truss output is available on request, since I forgot to sanitize it.

https://www.jetcafe.org/icinga2/icinga2.bin https://www.jetcafe.org/icinga2/icinga2.crashreport

I was able to do an lldb, but I've no idea if this is correct usage. I'm going off of old gdb knowledge and google here:

# lldb icinga2.bin --core icinga2.core
(lldb) target create "icinga2.bin" --core "icinga2.core"
Core file '/tmp/icinga2.core' (x86_64) was loaded.
(lldb) thread backtrace all
* thread #1, name = 'icinga2', stop reason = signal SIGABRT
  * frame #0: 0x00000008040bb9ba libc.so.7`thr_kill + 10
    frame #1: 0x00000008040bb984 libc.so.7`__raise(s=6) at raise.c:52:10
    frame #2: 0x00000008040bb8f9 libc.so.7`abort at abort.c:65:8
    frame #3: 0x00000000004332f7 icinga2.bin`___lldb_unnamed_symbol444$$icinga2.bin + 1127
    frame #4: 0x000000080377e459 libcxxrt.so.1`report_failure(err=<unavailable>, thrown_exception=0x00000008054299c8) at exception.cc:719:5
    frame #5: 0x0000000000467132 icinga2.bin`__cxa_throw + 450
    frame #6: 0x0000000000512104 icinga2.bin`___lldb_unnamed_symbol3994$$icinga2.bin + 52
    frame #7: 0x00000000004cea53 icinga2.bin`___lldb_unnamed_symbol2486$$icinga2.bin + 115
    frame #8: 0x00000000004bc2ca icinga2.bin`___lldb_unnamed_symbol1896$$icinga2.bin + 5498
    frame #9: 0x0000000000482787 icinga2.bin`___lldb_unnamed_symbol1392$$icinga2.bin + 423
    frame #10: 0x000000000041ac1c icinga2.bin`___lldb_unnamed_symbol5$$icinga2.bin + 13436
    frame #11: 0x000000000041773a icinga2.bin`___lldb_unnamed_symbol4$$icinga2.bin + 202
    frame #12: 0x000000000041749d icinga2.bin`___lldb_unnamed_symbol1$$icinga2.bin + 141

I hope this has the information you seek. Feel free to request more detailed information and I will attempt to turn it around as quick as I can (which may be glacial). Thanks in advance for looking at this.

nielsk commented 4 years ago

Is there any news yet? Recently my old install broke because boost-libs got updated and my old package wasn't compiled against it. It really seems that icinga2 breaks the moment a satellite tries to connect.

phiten commented 4 years ago

Are there any news yet? Recently my old install broke because boost-libs got updated and my old package wasn't compiled against it. It really seems that icinga2 breaks the moment a satellite tries to connect.

Both of our installations on Jessie broke with the same behaviour after upgrading to r2.11.2-1. Absolutely no errors on both machines in the HA-Cluster.

davehayes commented 4 years ago

One more datapoint here. I recently upgraded a satellite to 2.11.2_1 (from recent 2020Q1 quarterly). I had to restart the master node (which is at 2.10), but it works. I think this bug only happens on a master node.

nielsk commented 4 years ago

My problem was that the master node dies when a satellite tries to connect. When an agent is checked through a 2.11 satellite but the master is 2.10.5, the check results weren't handed over to the master node. I could see this only because the last check date didn't change (great when you notice after three days that your host hasn't been checked). I also have one agent on 2.11 that works, even though its satellites and masters are only on 2.10.5.

mat813 commented 4 years ago

I do not have a master node running 32-bit FreeBSD, but it happens on all the satellites I have that are running on i386. I will try 2.11.2_1.

davehayes commented 4 years ago

Just to be clear, all my sites are 64 bit FreeBSD.

buzzdeee commented 4 years ago

I see much the same on OpenBSD i386: a fairly current node runs icinga2 2.11.3, and a somewhat older node runs 2.11.2.

I've an OpenBSD arm64 running 2.11.2, and a couple of amd64 running 2.11.3 and 2.11.2 without issues. My master runs 2.11.2 on amd64.

bsdlme commented 4 years ago

This should be fixed now when all nodes run 2.11.3. I just updated the FreeBSD port to 2.11.3, so please test it. :)

lippserd commented 4 years ago

Thanks for the work @bsdlme 👍 Looking forward to the test feedback here.

nielsk commented 4 years ago

I will probably do the update next week. The last time I tried it I spent two hours downgrading packages, thus I have to set a bit of time aside.

nielsk commented 4 years ago

Thanks to boot environments I decided to do the test today. The problem still exists.

[2020-04-30 10:15:50 +0200] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/icinga2-global-templates//_etc/users/users/raido.conf
[2020-04-30 10:15:50 +0200] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/icinga2-global-templates//_etc/users/users/roehl.conf
[2020-04-30 10:15:50 +0200] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/icinga2-global-templates//_etc/users/users/tis.conf.DISABLED
[2020-04-30 10:15:50 +0200] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/icinga2-global-templates//_etc/users/users/unverricht.conf
[2020-04-30 10:15:50 +0200] notice/ApiListener: Updated meta data for cluster config sync. Checksum: '/var/lib/icinga2/api/zones/icinga2-global-templates/.checksums', timestamp: '/var/lib/icinga2/api/zones/icinga2-global-templates/.timestamp', auth: '/var/lib/icinga2/api/zones/icinga2-global-templates/.authoritative'.
[2020-04-30 10:15:50 +0200] information/ApiListener: Started new listener on '[::]:5665'                         
[2020-04-30 10:15:50 +0200] debug/ApiListener: Not connecting to Endpoint 'ic.ops.eusc.inter.net' because that's us.
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'shaker.ops.eusc.inter.net' via host '195.201.164.235' and port '5665'
[2020-04-30 10:15:50 +0200] debug/ApiListener: Not connecting to Zone 'gin.inx.de' because it's not in the same zone, a parent or a child zone.
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'ic2-cloud-probe' via host 'ic2-cloud-probe' and port '5665'
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'saltmaster.interdotnet.de' via host 'saltmaster.interdotnet.de' and port '5665'
[2020-04-30 10:15:50 +0200] debug/ApiListener: Not connecting to Zone 'kitsune.psychedelicpirate.com' because it's not in the same zone, a parent or a child zone.
[2020-04-30 10:15:50 +0200] debug/ApiListener: Not connecting to Zone 'jake.psychedelicpirate.com' because it's not in the same zone, a parent or a child zone.
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'striper.psychedelicpirate.com' via host 'striper.psychedelicpirate.com' and port '5665'
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'kham.psychedelicpirate.com' via host 'kham.psychedelicpirate.com' and port '5665'
[2020-04-30 10:15:50 +0200] debug/ApiListener: Not connecting to Zone 'sally.psychedelicpirate.com' because it's not in the same zone, a parent or a child zone.
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'shopsatellite.ber.inx.de' via host 'shopsatellite.ber.inx.de' and port '5665'
[2020-04-30 10:15:50 +0200] notice/ApiListener: Current zone master: ic.ops.eusc.inter.net                       
[2020-04-30 10:15:50 +0200] information/ApiListener: Reconnecting to endpoint 'n113h071.cloud.de.inter.net' via host '213.73.113.71' and port '5665'
[2020-04-30 10:15:50 +0200] notice/ApiListener: Connected endpoints:                                             
nanosleep({ 0.200000000 })                       ERR#4 'Interrupted system call'                                 
SIGNAL 20 (SIGCHLD) code=CLD_KILLED pid=7238 uid=183 status=11                                                   
sigprocmask(SIG_SETMASK,{ SIGCHLD },0x0)         = 0 (0x0)                                                       
sigreturn(0x7fffffffca80)                        ERR#4 'Interrupted system call'                                 
wait4(7238,{ SIGNALED,sig=SIGSEGV },WNOHANG,0x0) = 7238 (0x1c46)                                                 
[2020-04-30 10:15:50 +0200] notice/cli: Seemless worker (PID 7238) stopped, stopping as well                     
write(1,"[2020-04-30 10:15:50 +0200] \^[["...,102) = 102 (0x66)                                                  
unlink("/var/run/icinga2/icinga2.pid")           = 0 (0x0)                                                       
close(11)                                        = 0 (0x0)                                                       
_umtx_op(0x8010d0050,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)                                            
_umtx_op(0x8010d00c8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)                                            
_umtx_op(0x8010d0188,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)                                            
_umtx_op(0x8010d01a0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)                                            
_umtx_op(0x8010d01e8,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)         
...                                   
icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: r2.11.3-1)

Copyright (c) 2012-2020 Icinga GmbH (https://icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

System information:
  Platform: Unknown
  Platform version: Unknown
  Kernel: FreeBSD
  Kernel version: 11.3-RELEASE-p6
  Architecture: amd64

Build information:
  Compiler: Clang 8.0.0
  Build host: ic-11_3-RELEASE-HEAD-job-03

Application information:

General paths:
  Config directory: /usr/local/etc/icinga2
  Data directory: /var/lib/icinga2
  Log directory: /var/log/icinga2
  Cache directory: /var/cache/icinga2
  Spool directory: /var/spool/icinga2
  Run directory: /var/run/icinga2

Old paths (deprecated):
  Installation root: /usr/local
  Sysconf directory: /usr/local/etc
  Run directory (base): /var/run
  Local state directory: /var

Internal paths:
  Package data directory: /usr/local/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /var/run/icinga2/icinga2.pid

lippserd commented 4 years ago

Do the endpoints run 2.11.3 as well?

nielsk commented 4 years ago

No. They run icinga 2.10.5. According to the compatibility list it is master > satellite > client. When I update the satellites to 2.11 and the master is at 2.10, check delivery stops working. That is why I had to downgrade my satellites (interestingly, clients can be at 2.10): when icinga 2.11 was released I updated the satellites and suddenly got no check results anymore...

nielsk commented 4 years ago

master (2.11) >= satellite (2.10) >= agent (2.9) https://icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/#versions-and-upgrade
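
The rule quoted above can be written down as a small check. This is only a sketch of the ordering master >= satellite >= agent; the helper names are made up, and comparing only the major.minor part is my assumption about how the rule is meant, not something the docs link states.

```python
def parse_version(text):
    """Turn '2.11.3', 'r2.11.3-1' or '2.11.2_1' into a tuple like (2, 11, 3)."""
    core = text.lstrip("r").split("-")[0].split("_")[0]
    return tuple(int(part) for part in core.split("."))


def hierarchy_ok(master, satellite, agent):
    """True if master >= satellite >= agent, comparing major.minor only (assumption)."""
    m, s, a = (parse_version(v)[:2] for v in (master, satellite, agent))
    return m >= s >= a


# Setup tested in this thread: 2.11.3 master, everything else on 2.10.5.
print(hierarchy_ok("r2.11.3-1", "2.10.5", "2.10.5"))
# The combination that lost check results: 2.11 satellite below a 2.10 master.
print(hierarchy_ok("2.10.5", "2.11.2_1", "2.10.5"))
```

By this rule the first layout is allowed and the second is not, which matches the missing check results described above.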

lippserd commented 4 years ago

As far as I understand @bsdlme correctly, all nodes must run 2.11.3.

nielsk commented 4 years ago

As far as I understand @bsdlme correctly, all nodes must run 2.11.3.

Where did he write this? What is the reasoning? I ask because I do not want to run into this test blindly: it means a lot more work if I update everywhere, the crash happens again, and I have to downgrade everywhere again.

lippserd commented 4 years ago

But we also had a follow-up bug with our big JSON-RPC issue which could have an influence here too. Is there any chance that you have nodes that initiate connections to other nodes, but those nodes don't have endpoint configuration for them?

lippserd commented 4 years ago

Refs #7532

nielsk commented 4 years ago

But we also had a follow-up bug with our big JSON-RPC issue which could have an influence here too. Is there any chance that you have nodes that initiate connections to other nodes, but those nodes don't have endpoint configuration for them?

I have one master, this master has configurations for its satellites, the satellites have configurations for the master. I have one pair of satellites that actually connect to endpoints with icinga2 instead of nrpe but there all endpoints are configured for the satellites as well.

lippserd commented 4 years ago

As far as I understand @bsdlme correctly, all nodes must run 2.11.3.

Where did he write this? What is the reasoning? I ask because I do not want to blindly run into this test because it means loads of more work if I update everywhere, the crash happens again and I have to downgrade everywhere again.

I think that this is strongly related to the JSON-RPC bug or its follow-up. So upgrading just the nodes that crash should be sufficient.

@bsdlme Is there any chance to build some sort of snapshot packages? We could prepare a branch with the 2.11.3 as base and some patches on top.