Closed icinga-migration closed 8 years ago
Updated by mfriedrich on 2016-02-12 09:30:59 +00:00
Which OS are you on?
Updated by seferovic on 2016-02-12 11:26:06 +00:00
RHEL7.2
Updated by seferovic on 2016-02-15 14:57:58 +00:00
The agent that is restarting is running on Red Hat Enterprise Linux Server release 6.7 (Santiago). Master server is running RHEL7.2 Sorry for wrong information.
Updated by mfriedrich on 2016-02-24 20:03:52 +00:00
Does it still happen with 2.4.3?
Updated by haasn on 2016-02-26 04:59:35 +00:00
Same problem here.
It affects versions 2.4.2 and 2.4.3 on the satellite. The only solution I have found is to downgrade to 2.4.1 on the satellite, which works fine.
Updated by mfriedrich on 2016-02-26 08:59:52 +00:00
I wasn't able to reproduce it thus far.
Can you please provide the following information:
Updated by mfriedrich on 2016-02-26 09:00:10 +00:00
Updated by ricardo on 2016-03-04 12:24:14 +00:00
Hi,
having the same problem. All Icinga instances are permanently restarting.
There seems to be a problem in transferring zone config data from the master to the checker. I'm seeing corrupted config files on the checker node.
Updated by arcade on 2016-03-04 12:27:09 +00:00
I think I have the same issue here: https://dev.icinga.org/issues/11304
Updated by mfriedrich on 2016-03-04 13:07:20 +00:00
ricardo wrote:
Hi,
having the same problem. All Icinga instances are permanently restarting.
There seems to be a problem in transferring zone config data from the master to the checker. I'm seeing corrupted config files on the checker node.
Can you please post an example (what it looks like on the master and on the client)?
Updated by mfriedrich on 2016-03-04 13:07:40 +00:00
Updated by ricardo on 2016-03-04 14:40:10 +00:00
I'm sorry. I was stupid and had different zones.conf files across my instances.
After I fixed the zones.conf on all nodes, the problem was gone.
Basically, there were 2 master instances with different configs pushing to the satellites.
My problem is solved for now.
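A mismatch like this can be caught by comparing checksums of the zones.conf copies collected from each node. The following is a minimal sketch, assuming the files have already been copied to one machine. The function names and the example paths are hypothetical and not part of Icinga 2.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_checksum(paths):
    """Group files by the SHA-256 digest of their content."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    return dict(groups)


def configs_match(paths):
    """Return True if all given files have byte-identical content."""
    return len(group_by_checksum(paths)) <= 1
```

For example, `configs_match(["master1/zones.conf", "master2/zones.conf"])` (hypothetical paths) returns False as soon as one copy differs, and `group_by_checksum` shows which nodes share the same version. Note that this compares raw bytes, so a whitespace-only difference also counts as a mismatch.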
Updated by mfriedrich on 2016-03-04 15:54:21 +00:00
Updated by seferovic on 2016-03-07 10:11:33 +00:00
Hi! Sorry for the late feedback. I just tried to install the latest stable version on RHEL7 and I ran into the same problem. Additional info: I configured the satellite manually for the PKI.
zones.conf on the master
object Endpoint "master.monitoring" { }
object Zone "master" {
  // This is the local master zone = "master"
  endpoints = [ "master.monitoring" ]
}
object Zone "global-templates" {
  global = true
}
zones.d on the master
ls -alR zones.d*
zones.d:
total 8
drwxr-xr-x. 3 icinga icinga   42 Jan  5 13:31 .
drwxr-xr-x. 9 root   icinga 4096 Jan 19 17:34 ..
drwxr-xr-x. 2 icinga icinga   38 Jan 12 12:58 global-templates
-rw-r--r--. 1 icinga icinga  133 Nov 26 13:37 README

zones.d/global-templates:
total 4
drwxr-xr-x. 2 icinga icinga  38 Jan 12 12:58 .
drwxr-xr-x. 3 icinga icinga  42 Jan  5 13:31 ..
-rw-r--r--. 1 root   root   569 Jan 12 12:58 command_check_uptime.conf
zone definition for the satellite on the master
object Host "s129" {
  import "generic-host"
  address = "x.129"
  ...
  vars.remote_client = "s129"
}
object Endpoint "s129" { }
object Zone "s129" {
  endpoints = [ "s129" ]
  parent = "master"
}
/var/lib/icinga/api/zones on the satellite -> /var/lib/icinga2/api/zones
ls -alR /var/lib/icinga2/api/zones
/var/lib/icinga2/api/zones:
total 12
drwxr-x---. 3 icinga icinga 4096 Mar  7 10:14 .
drwxr-x---. 5 icinga icinga 4096 Mar  7 09:50 ..
drwx------. 3 icinga icinga 4096 Mar  7 10:25 global-templates

/var/lib/icinga2/api/zones/global-templates:
total 16
drwx------. 3 icinga icinga 4096 Mar  7 10:25 .
drwxr-x---. 3 icinga icinga 4096 Mar  7 10:14 ..
drwxr-xr-x. 2 icinga icinga 4096 Mar  7 10:14 _etc
-rw-r--r--. 1 icinga icinga   17 Mar  7 10:25 .timestamp

/var/lib/icinga2/api/zones/global-templates/_etc:
total 12
drwxr-xr-x. 2 icinga icinga 4096 Mar  7 10:14 .
drwx------. 3 icinga icinga 4096 Mar  7 10:25 ..
-rw-r--r--. 1 icinga icinga  569 Mar  7 10:14 command_check_uptime.conf
content of /var/lib/icinga2/modified-attributes.conf on both the master and the satellite ... BOTH ARE EMPTY !!
[master] root@master:/etc/icinga2/conf.d/hosts # ls -al /var/lib/icinga2/modified-attributes.conf
-rw-r--r--. 1 icinga icinga 0 Mar 7 10:45 /var/lib/icinga2/modified-attributes.conf
[client] root@s129:~ # ls -al /var/lib/icinga2/modified-attributes.conf
-rw-r--r--. 1 icinga icinga 0 Mar 7 10:25 /var/lib/icinga2/modified-attributes.conf
debug from both the master and the client (remove the IdoMysql lines, not interested in them, e.g. less debug.log | grep -v IdoMysql)
CLIENT
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344483
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344485
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344488
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344491
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344495
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344497
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344500
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344503
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344505
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344506
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344508
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344511
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344514
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344516
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344519
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344522
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344524
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344527
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344528
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344530
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344533
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344536
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344538
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344541
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344544
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344546
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344547
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344549
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344552
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344555
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344557
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344560
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344563
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344565
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344568
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344571
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344573
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344574
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344576
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344579
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344582
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344584
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344587
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344590
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344592
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344593
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344595
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344598
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344601
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344603
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344606
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344609
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344611
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344614
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457344617
[2016-03-07 10:56:56 +0100] notice/ApiListener: Replayed 0 messages.
[2016-03-07 10:56:56 +0100] information/ApiListener: Finished sending replay log for endpoint 'master.monitoring'.
[2016-03-07 10:56:56 +0100] notice/JsonRpcConnection: Received 'config::Update' message from 'master.monitoring'
[2016-03-07 10:56:56 +0100] notice/ApiListener: Creating config update for file '/var/lib/icinga2/api/zones/global-templates/.timestamp'
[2016-03-07 10:56:56 +0100] notice/ApiListener: Creating config update for file '/var/lib/icinga2/api/zones/global-templates/_etc/command_check_uptime.conf'
[2016-03-07 10:56:56 +0100] information/ApiListener: Restarting after configuration change.
[2016-03-07 10:56:56 +0100] notice/ThreadPool: Pool #1: Pending tasks: 0; Average latency: 0ms; Threads: 4; Pool utilization: 0.00609729%
All the logs are empty and they are "being" replayed to the master.
MASTER
[2016-03-07 11:03:32 +0100] debug/ApiListener: Not connecting to Endpoint 'master.monitoring' because that's us.
[2016-03-07 11:03:32 +0100] debug/ApiListener: Not connecting to Endpoint 's0101.vie01.local' because the host/port attributes are missing.
[2016-03-07 11:03:32 +0100] debug/ApiListener: Not connecting to Endpoint 's0103.vie01.local' because the host/port attributes are missing.
[2016-03-07 11:03:32 +0100] debug/ApiListener: Not connecting to Endpoint 's0104.vie01.local' because the host/port attributes are missing.
[2016-03-07 11:03:32 +0100] debug/ApiListener: Not connecting to Endpoint 's0108.vie01.local' because the host/port attributes are missing.
[2016-03-07 11:03:32 +0100] debug/ApiListener: Not connecting to Endpoint 's0111.vie01.local' because the host/port attributes are missing.
[2016-03-07 11:03:32 +0100] notice/ApiListener: Skipping sync for 's130.vie01.local'. Not a child of zone 's0117.vie01.local'.
[2016-03-07 11:03:32 +0100] notice/ApiListener: Skipping sync for 's164.vie01.local'. Not a child of zone 's0117.vie01.local'.
[2016-03-07 11:03:32 +0100] information/ApiListener: Syncing runtime objects to endpoint 's0117.vie01.local'.
[2016-03-07 11:03:32 +0100] information/ApiListener: Finished sending updates for endpoint 's0117.vie01.local'.
[2016-03-07 11:03:32 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457258610
[2016-03-07 11:03:32 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457258612
[2016-03-07 11:03:32 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457258613
[2016-03-07 11:03:32 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457258615
[2016-03-07 11:03:32 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457258618
[2016-03-07 11:03:32 +0100] notice/ApiListener: Replaying log: /var/lib/icinga2/api/log/1457258620
... and this goes on forever. I got about 2 GB of logs within a few minutes.
Updated by mfriedrich on 2016-03-07 16:42:10 +00:00
Ok, so not necessarily related to modified attributes. What do you mean exactly by "goes on forever"? How often does the client restart? Is there a pattern?
Are these instances using the same version?
Is there a specific reason why the master configuration for the endpoint "object Endpoint "s129" { }" does not specify a host attribute?
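To answer the "how often" and "is there a pattern" questions, the restart lines can be pulled out of the log and the gaps between them measured. This is a minimal sketch, assuming the log line format shown in the excerpts in this thread; the function names are hypothetical, and the UTC offset in each line is ignored on the assumption that it does not change within one log.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2016-03-07 10:56:56 +0100] information/ApiListener: Restarting after configuration change.
RESTART_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) [+-]\d{4}\] "
    r"information/ApiListener: Restarting after configuration change\."
)


def restart_times(lines):
    """Return the timestamps of all config-change restarts, in log order."""
    times = []
    for line in lines:
        match = RESTART_RE.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return times


def restart_intervals(lines):
    """Return the seconds elapsed between consecutive restarts."""
    times = restart_times(lines)
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
```

Passing an open log file (for example `open("/var/log/icinga2/icinga2.log")`, a path that may differ per installation) to `restart_intervals` gives a list of gaps in seconds. A list of near-identical small values would point to a restart on every reconnect.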
Updated by mfriedrich on 2016-03-09 11:39:05 +00:00
One thing which also comes to mind: the config sync depends on the timestamp. Are all your instances using NTP to ensure that their time is fully in sync? I could think of a mismatch there, causing the client to always receive "newer" configuration files than the master, which then results in a restart on each connection attempt that receives the "updated" configuration.
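One way to check for such a mismatch is to compare the `.timestamp` values from the master and the client against a trusted clock. The sketch below assumes that a `.timestamp` file holds a Unix time as a decimal string; this format is an assumption and has not been verified against the Icinga 2 source. The function names and the tolerance are hypothetical, and the sketch does not reproduce how Icinga 2 itself compares the values.

```python
import time


def parse_timestamp(text):
    """Parse `.timestamp` content, assumed to be Unix time as a decimal string."""
    return float(text.strip())


def compare_timestamps(master_text, client_text, now=None, tolerance=2.0):
    """Compare two timestamp values and flag any that lie in the future.

    A timestamp later than `now + tolerance` suggests that the clock of the
    node which wrote it is running ahead of the reference clock.
    """
    now = time.time() if now is None else now
    master_ts = parse_timestamp(master_text)
    client_ts = parse_timestamp(client_text)
    return {
        "difference": master_ts - client_ts,
        "master_in_future": master_ts > now + tolerance,
        "client_in_future": client_ts > now + tolerance,
    }
```

A positive `difference` means the master's copy carries the later timestamp. If either "in future" flag is set, checking NTP on that node would be the first step.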
Updated by mfriedrich on 2016-03-09 11:39:23 +00:00
Updated by akrus on 2016-03-14 12:44:43 +00:00
I have the same problem here; the master is 2.4.1. After upgrading the slave to 2.4.3, it ends up in a restart loop:
[2016-03-14 14:39:40 +0200] information/ConfigItem: Activated all objects.
[2016-03-14 14:39:40 +0200] information/ConfigCompiler: Compiling config file: /var/lib/icinga2/modified-attributes.conf
[2016-03-14 14:39:40 +0200] information/ApiListener: New client connection for identity 'icinga1'
[2016-03-14 14:39:40 +0200] information/ApiListener: Sending config updates for endpoint 'icinga1'.
[2016-03-14 14:39:40 +0200] information/ApiListener: Syncing runtime objects to endpoint 'icinga1'.
[2016-03-14 14:39:40 +0200] information/ApiListener: Finished sending config updates for endpoint 'icinga1'.
[2016-03-14 14:39:40 +0200] information/ApiListener: Sending replay log for endpoint 'icinga1'.
[2016-03-14 14:39:40 +0200] information/ApiListener: Finished sending replay log for endpoint 'icinga1'.
[2016-03-14 14:39:40 +0200] information/ApiListener: Restarting after configuration change.
[2016-03-14 14:39:43 +0200] information/Application: Got reload command: Starting new instance.
[2016-03-14 14:39:43 +0200] information/Application: Received request to shut down.
[2016-03-14 14:39:43 +0200] information/Application: Shutting down...
[2016-03-14 14:39:43 +0200] information/CheckerComponent: Checker stopped.
[2016-03-14 14:39:43 +0200] information/ConfigItem: Activated all objects.
[2016-03-14 14:39:43 +0200] information/ConfigCompiler: Compiling config file: /var/lib/icinga2/modified-attributes.conf
[2016-03-14 14:39:43 +0200] information/ApiListener: New client connection for identity 'icinga1'
[2016-03-14 14:39:43 +0200] information/ApiListener: Sending config updates for endpoint 'icinga1'.
[2016-03-14 14:39:43 +0200] information/ApiListener: Syncing runtime objects to endpoint 'icinga1'.
[2016-03-14 14:39:43 +0200] information/ApiListener: Finished sending config updates for endpoint 'icinga1'.
[2016-03-14 14:39:43 +0200] information/ApiListener: Sending replay log for endpoint 'icinga1'.
[2016-03-14 14:39:43 +0200] information/ApiListener: Finished sending replay log for endpoint 'icinga1'.
[2016-03-14 14:39:43 +0200] information/ApiListener: Restarting after configuration change.
Master node has this in logs:
[2016-03-14 14:42:33 +0200] warning/TlsStream: TLS stream was disconnected.
[2016-03-14 14:42:33 +0200] warning/JsonRpcConnection: Error while reading JSON-RPC message for identity 'icinga-slave1': Error: std::exception
(0) libbase.so: void boost::throw_exception(icinga::openssl_error const&) (+0x97) [0x2b0fd24f6c67]
(1) libbase.so: void boost::exception_detail::throw_exception_(icinga::openssl_error const&, char const*, char const*, int) (+0x40) [0x2b0fd24f6d10]
(2) libbase.so: icinga::TlsStream::HandleError() const (+0xbc) [0x2b0fd2493d2c]
(3) libbase.so: icinga::TlsStream::Read(void*, unsigned long, bool) (+0x83) [0x2b0fd2493eb3]
(4) libbase.so: icinga::StreamReadContext::FillFromStream(boost::intrusive_ptr const&, bool) (+0x7c) [0x2b0fd24a39ac]
(5) libbase.so: icinga::NetString::ReadStringFromStream(boost::intrusive_ptr const&, icinga::String*, icinga::StreamReadContext&, bool) (+0xce) [0x2b0fd24b20ce]
(6) libremote.so: icinga::JsonRpc::ReadMessage(boost::intrusive_ptr const&, boost::intrusive_ptr*, icinga::StreamReadContext&, bool) (+0x3d) [0x2b0fd4f73cad]
(7) libremote.so: icinga::JsonRpcConnection::ProcessMessage() (+0x65) [0x2b0fd4f949f5]
(8) libremote.so: icinga::JsonRpcConnection::DataAvailableHandler() (+0x38) [0x2b0fd4fb1848]
(9) libbase.so: boost::signals2::detail::signal_impl const&), boost::signals2::optional_last_value, int, std::less, boost::function const&)>, boost::function const&)>, boost::signals2::mutex>::operator()(boost::intrusive_ptr const&) (+0x1cc) [0x2b0fd252af2c]
(10) libbase.so: icinga::Stream::SignalDataAvailable() (+0x30) [0x2b0fd24d9620]
(11) libbase.so: icinga::TlsStream::OnEvent(int) (+0x3a8) [0x2b0fd24d9b08]
(12) libbase.so: icinga::SocketEvents::ThreadProc() (+0x23a) [0x2b0fd24d66aa]
(13) libboost_thread.so.1.54.0: (+0xba4a) [0x2b0fd1b2aa4a]
(14) libpthread.so.0: (+0x8182) [0x2b0fd21af182]
(15) libc.so.6: clone (+0x6d) [0x2b0fd337f47d]
[2016-03-14 14:42:33 +0200] warning/JsonRpcConnection: API client disconnected for identity 'icinga-slave1'
[2016-03-14 14:42:33 +0200] warning/ApiListener: Removing API client for endpoint 'icinga-slave1'. 0 API clients left.
[2016-03-14 14:42:33 +0200] information/ApiListener: New client connection for identity 'icinga-slave1'
[2016-03-14 14:42:33 +0200] information/ApiListener: Sending updates for endpoint 'icinga-slave1'.
[2016-03-14 14:42:33 +0200] information/ApiListener: Syncing global zone 'global' to endpoint 'icinga-slave1'.
[2016-03-14 14:42:33 +0200] information/ApiListener: Syncing zone 'icinga-slave1' to endpoint 'icinga-slave1'.
[2016-03-14 14:42:33 +0200] information/ApiListener: Syncing runtime objects to endpoint 'icinga-slave1'.
[2016-03-14 14:42:33 +0200] information/ApiListener: Finished sending updates for endpoint 'icinga-slave1'.
This is related to: https://dev.icinga.org/issues/11288 Though it takes some time to find out what's wrong :) upgrading master to 2.4.3 fixed the problem as well.
Updated by mfriedrich on 2016-03-16 19:32:08 +00:00
@seferovic
Can you please try to upgrade both your master and your clients to 2.4.4 and check whether the issue is solved? The answer from @akrus sounds promising :)
Updated by mfriedrich on 2016-03-24 09:28:06 +00:00
We consider this fixed when upgrading the master and then the clients to 2.4.4.
Updated by mfriedrich on 2016-03-24 09:36:25 +00:00
This issue has been migrated from Redmine: https://dev.icinga.com/issues/11140
Created by seferovic on 2016-02-11 12:15:33 +00:00
Assignee: (none)
Status: Closed (closed on 2016-03-24 09:28:06 +00:00)
Target Version: (none)
Last Update: 2016-03-24 09:36:24 +00:00 (in Redmine)
I installed the latest snapshot in order to provide feedback for another bug, but my Icinga2 instance is running wild. It keeps reloading. I am using it as a command exec bridge, so no hosts or services are locally defined.