Closed raffis closed 4 years ago
Please see #6012.
Nice, fast response! This looks like #5205/#6927 (or the mentioned parent task).
So basically I need to trigger lots of restarts, since kube-icinga may add lots of objects (and also removes them first, since there is no way to trigger apply rules for changed objects...).
Is triggering a restart the only workaround?
Until the underlying problem inside the IDO feature is fixed, a restart is the only workaround, yes.
What's the difference between a service reload via init and a POST /v1/actions/restart-process?
After sending a POST /v1/actions/restart-process my object list in icingaweb is empty and adding the same object again ends in error 500:
"Cannot create object 'test10'. Configuration file '/var/lib/icinga2/api/packages/_api//conf.d/servicegroups/test10.conf' already exists."
(This is a different error from the one I get after a plain restart via systemd.)
Sending a POST /v1/actions/restart-process would be the only workaround for my app. Otherwise this is going to be impossible with the current version of the Icinga API.
[2019-02-19 15:59:11 +0000] information/HttpServerConnection: Request: POST /v1/actions/restart-process (from [172.19.0.1]:50396), user: icinga2-director)
[2019-02-19 15:59:11 +0000] information/HttpServerConnection: HTTP client disconnected (from [172.19.0.1]:50396)
[2019-02-19 15:59:12 +0000] information/Application: Got reload command: Starting new instance.
[2019-02-19 15:59:12 +0000] information/Application: Reload requested, letting new process take over.
[2019-02-19 15:59:12 +0000] information/ApiListener: 'api' stopped.
[2019-02-19 15:59:12 +0000] information/CheckerComponent: 'checker' stopped.
[2019-02-19 15:59:12 +0000] information/CompatLogger: 'compatlog' stopped.
[2019-02-19 15:59:12 +0000] information/ExternalCommandListener: 'command' stopped.
[2019-02-19 15:59:13 +0000] information/FileLogger: 'main-log' started.
[2019-02-19 15:59:13 +0000] information/ApiListener: 'api' started.
[2019-02-19 15:59:13 +0000] information/ApiListener: Copying 2 zone configuration files for zone 'director-global' to '/var/lib/icinga2/api/zones/director-global'.
[2019-02-19 15:59:13 +0000] information/ApiListener: Applying configuration file update for path '/var/lib/icinga2/api/zones/director-global' (0 Bytes). Received timestamp '2019-02-19 15:59:13 +0000' (1550591953.329891), Current timestamp '2019-02-19 15:52:47 +0000' (1550591567.355288).
[2019-02-19 15:59:13 +0000] information/ApiListener: Copying 1 zone configuration files for zone 'master' to '/var/lib/icinga2/api/zones/master'.
[2019-02-19 15:59:13 +0000] information/ApiListener: Applying configuration file update for path '/var/lib/icinga2/api/zones/master' (0 Bytes). Received timestamp '2019-02-19 15:59:13 +0000' (1550591953.330238), Current timestamp '2019-02-19 15:52:47 +0000' (1550591567.355010).
[2019-02-19 15:59:13 +0000] information/ApiListener: Started new listener on '[0.0.0.0]:5665'
[2019-02-19 15:59:13 +0000] information/ExternalCommandListener: 'command' started.
[2019-02-19 15:59:13 +0000] information/GraphiteWriter: 'graphite' started.
[2019-02-19 15:59:13 +0000] information/LivestatusListener: 'livestatus' started.
[2019-02-19 15:59:13 +0000] information/LivestatusListener: Created UNIX socket in '/run/icinga2/cmd/livestatus'.
[2019-02-19 15:59:13 +0000] information/CheckerComponent: 'checker' started.
[2019-02-19 15:59:13 +0000] information/NotificationComponent: 'notification' started.
[2019-02-19 15:59:13 +0000] information/DbConnection: 'ido-mysql' started.
[2019-02-19 15:59:13 +0000] information/CompatLogger: 'compatlog' started.
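For reference, the restart request discussed above can be sketched in Python as follows. This is a minimal sketch, not the way kube-icinga does it: the base URL and credentials are placeholders, and the function only builds the request so it can be inspected without touching a running instance.

```python
import base64
import urllib.request


def build_restart_request(base_url, user, password):
    """Build (but do not send) a POST /v1/actions/restart-process request.

    base_url, user and password are placeholders supplied by the caller.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/actions/restart-process",
        data=b"",  # the action takes no parameters here, so the body is empty
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical host and credentials, for illustration only.
req = build_restart_request("https://localhost:5665", "apiuser", "secret")
print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` with an SSL context that trusts the Icinga CA certificate.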
"Cannot create object 'test10'. Configuration file '/var/lib/icinga2/api/packages/_api//conf.d/servicegroups/test10.conf' already exists."
The stage name is missing after _api/, so it is highly likely that the API package got broken somehow in the process of restarting.
Argh, my fault: I removed the contents of /var/lib/icinga2/api/packages/_api manually during debugging and only just noticed that files like active-stage.conf and active.conf were missing after the restart. But when I create new objects via the API, they get created in a conf.d folder directly under _api (/var/lib/icinga2/api/packages/_api/conf.d/xxxx), and after restarting the service the added services are gone again (though the files are still there).
Maybe a check for that would be helpful (or a log entry somewhere saying that the stage folder is gone or not active). Probably the API should respond with a 500 error and not accept new objects in the first place.
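Until such a check exists in Icinga itself, an external sanity check could look roughly like the sketch below. It is only an illustration: the list of required file names is an assumption taken from the comment above and may not match what a given Icinga 2 version writes into the package directory.

```python
from pathlib import Path

# Assumption: file names as mentioned in the comment above; adjust them to
# whatever your Icinga 2 version actually keeps in the package directory.
REQUIRED_FILES = ("active-stage.conf", "active.conf")


def missing_package_files(package_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from package_dir."""
    root = Path(package_dir)
    return [name for name in required if not (root / name).is_file()]


def check_package(package_dir):
    """Return a one-line, human-readable verdict for the package directory."""
    missing = missing_package_files(package_dir)
    if missing:
        return f"package {package_dir} looks broken, missing: {', '.join(missing)}"
    return f"package {package_dir} looks intact"


print(check_package("/var/lib/icinga2/api/packages/_api"))
```

Running this before creating objects would at least make the broken package state visible instead of silently writing files that are never loaded.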
I've created #6959 as follow-up. I just don't have the time to code any further here, maybe you'd like to catch up on this.
:+1:, yes as soon as I have some spare time.
This will be superseded by IcingaDB; the old tracking issue for the IDO is #6012.
Newly created objects are not reflected in the web interface (and I'm pretty sure they do not get checked either).
The objects are visible after manually restarting icinga.
The problem looks quite random (see steps to reproduce):
Again only restarting icinga solves this problem.
Expected Behaviour
New objects (10 test servicegroups; this can be any object type) are visible in icingaweb.
Current Behaviour
Servicegroups are not visible in the servicegroup list in the icinga web UI. As far as I can see, the web UI fetches its information not from the API but from the MySQL DB directly.
GET https://localhost:5665/v1/objects/servicegroups lists all those objects, as does icinga2 object list.
As soon as I restart icinga the objects are visible in the web UI.
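The mismatch described here can be made explicit by comparing the names the API returns with the names the web UI shows. The sketch below assumes the usual response shape of /v1/objects/servicegroups (a "results" list whose entries carry a "name"); the sample body and the web-side list are made-up illustration data.

```python
import json


def names_from_api_response(body):
    """Extract object names from a /v1/objects/<type> JSON response body."""
    return {entry["name"] for entry in json.loads(body).get("results", [])}


def missing_in_web(api_names, web_names):
    """Names known to the core (API) but absent from the IDO-backed web list."""
    return sorted(set(api_names) - set(web_names))


# Made-up sample data: the API knows two groups, the web UI shows only one.
sample = (
    '{"results": ['
    '{"name": "test1", "type": "ServiceGroup"}, '
    '{"name": "test2", "type": "ServiceGroup"}]}'
)
print(missing_in_web(names_from_api_response(sample), ["test1"]))  # prints ['test2']
```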
Possible Solution
Not sure how this can happen but it looks like a major problem.
Steps to Reproduce (for bugs)
Create file /tmp/test:
cat /tmp/test | while read l; do sh -c "$l"; done
Objects are not visible in the web UI. (You may need to do this a couple of times, since it does not happen every time, but it does most of the time.)
Chances are higher to get some objects if waiting a short time between requests: cat /tmp/test | while read l; do sh -c "$l"; sleep 2; done
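The contents of /tmp/test are not shown above. A hypothetical equivalent that prepares ten servicegroup creations against the API could be sketched like this; the names test1 to test10, the display_name attribute, and the credentials are illustrative assumptions, and the requests are only built, not sent.

```python
import base64
import json
import urllib.request


def build_create_request(base_url, user, password, name):
    """Build (but do not send) a PUT request that creates one servicegroup."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    body = json.dumps({"attrs": {"display_name": name}}).encode()
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/objects/servicegroups/{name}",
        data=body,
        method="PUT",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def build_burst(base_url, user, password, count=10, prefix="test"):
    """Prepare `count` creation requests named <prefix>1 .. <prefix><count>."""
    return [
        build_create_request(base_url, user, password, f"{prefix}{i}")
        for i in range(1, count + 1)
    ]


# Hypothetical host and credentials, for illustration only.
for req in build_burst("https://localhost:5665", "apiuser", "secret"):
    print(req.get_method(), req.full_url)
```

Sending these back to back, without a pause in between, corresponds to the first loop above; adding a delay between the sends corresponds to the variant with sleep 2.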
Context
I have this issue in kube-icinga (https://github.com/gyselroth/kube-icinga). This app is asynchronous and issues many API calls within a short time, some of them concurrently.
Your Environment
Version used (icinga2 --version): r2.10.2-1
Enabled features (icinga2 feature list):
Config validation (icinga2 daemon -C):
zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes.