Closed geo-geek closed 6 years ago
Hello,
I have tried but failed to reproduce the problem you encountered. :-(
Could you try the latest version? I have improved logging a bit so the problem should be easier to see. Also, please apply deploy/rbac.yaml again, which should fix the RBAC DENY message you got.
When you create the deployment again, there should be a host object. kubectl describe host myapp -n mynamespace should show the object's details and all relevant events, for example a failure to create the Icinga host object.
Also, you can set LOG_LEVEL=debug and ICINGA_DEBUG=true to get even more debugging information and also a dump of all Icinga2 requests and responses.
Thanks for your quick response. I'll have a look at the new version later and see if it produces any more logs.
In the meantime I found another interesting behaviour. If you delete the deployment from Kubernetes dashboard it removes the host from icinga. This is what I did to cause the issue.
If you delete the deployment using kubectl delete -f mydeploymentfile.yml it doesn't remove the icinga host. But I've not had a chance to investigate further.
Can you confirm the best way to cleanly remove a k8s deployment that is being monitored by your project?
I've not updated to the latest version yet but here is the description of the host (Current State: Deployment has been recreated but the icinga host hasn't been created):
kubectl describe host deploy-mydeployment -n mynamespace
Name: deploy-mydeployment
Namespace: mynamespace
Labels: <none>
Annotations: <none>
API Version: icinga.nexinto.com/v1
Kind: Host
Metadata:
Cluster Name:
Creation Timestamp: 2018-09-11T21:56:57Z
Owner References:
API Version: v1beta1
Kind: Deployment
Name: mydeployment
UID: 9a0bf440-b60d-11e8-8b63-005056bd3b89
Resource Version: 687291
Self Link: /apis/icinga.nexinto.com/v1/namespaces/mynamespace/hosts/deploy-mydeployment
UID: 9a10bd06-b70d-11e8-af0e-005056bd1107
Spec:
Check _ Command: check_kubernetes
Hostgroups:
gitlab-ci
Name: mynamespace.deploy-mydeployment
Notes:
Notesurl:
Vars:
Kubernetes _ Cluster: kubernetes
Kubernetes _ Name: mydeployment
Kubernetes _ Namespace: mynamespace
Kubernetes _ Type: deployment
Status:
Events: <none>
Deleting the deployment should have been enough; however sometimes that does not work (that's a bug - haven't figured out when/why yet). If that happens to you, check if there still is the Host object in Icinga (that's the one you described above). Delete it and it should be deleted in Icinga as well. If that doesn't work, it's probably because the Icinga API call failed, but that should appear in the log.
It shouldn't matter if you delete the deployment from the dashboard or using kubectl because in the background the same thing happens.
So after going through the processes of adding/removing deployments a few times I noticed that: 1) after removing a deployment -> it removed the host almost straight away. 2) after creating a fresh host or re-creating an existing host -> it appeared in icinga but after about 7 or 8 mins.
So it is recreating the host!
I suppose the follow-up question: is it normal to have to wait 7 or 8 minutes between deployment creation and the icinga api being called for host creation?
Thanks for your help.
Here is the sequence of events from the logfile.
Now I've enabled all the extra logging options, the logfile was way too verbose, so I have filtered the log file based on my deployment name 'my-deployment-test'
As you can see there is approximately 7 minutes between 'creating' the host and 'processing' the host. At which point it calls the icinga api to create the host in icinga.
If this is expected behaviour it might be worth changing the log message from 'creating host' to 'creating internal host object' and adding a new info level log message when the host is created in icinga (e.g. 'creating icinga host object') as I totally misunderstood the logfile!
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=info msg="creating host 'my-namespace/deploy-my-deployment-test'"
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=debug msg="processing deployment 'my-namespace/my-deployment-test'"
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=debug msg="processing pod 'my-namespace/my-deployment-test-6b656cf96f-s888b'"
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=debug msg="processing deployment 'my-namespace/my-deployment-test'"
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=debug msg="processing deployment 'my-namespace/my-deployment-test'"
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=debug msg="processing deployment 'my-namespace/my-deployment-test'"
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=debug msg="processing replicaset 'my-namespace/my-deployment-test-6b656cf96f'"
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=debug msg="processing replicaset 'my-namespace/my-deployment-test-6b656cf96f'"
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=debug msg="processing pod 'my-namespace/my-deployment-test-6b656cf96f-s888b'"
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=debug msg="processing replicaset 'my-namespace/my-deployment-test-6b656cf96f'"
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=debug msg="processing pod 'my-namespace/my-deployment-test-6b656cf96f-s888b'"
September 20th 2018, 11:38:01.000 | time="2018-09-20T10:38:01Z" level=debug msg="processing replicaset 'my-namespace/my-deployment-test-6b656cf96f'"
September 20th 2018, 11:38:01.000 | time="2018-09-20T10:38:01Z" level=debug msg="processing pod 'my-namespace/my-deployment-test-6b656cf96f-s888b'"
September 20th 2018, 11:38:01.000 | time="2018-09-20T10:38:01Z" level=debug msg="processing deployment 'my-namespace/my-deployment-test'"
September 20th 2018, 11:39:59.000 | time="2018-09-20T10:39:59Z" level=debug msg="processing deployment 'my-namespace/my-deployment-test'"
September 20th 2018, 11:40:02.000 | time="2018-09-20T10:40:02Z" level=debug msg="processing replicaset 'my-namespace/my-deployment-test-6b656cf96f'"
September 20th 2018, 11:40:07.000 | time="2018-09-20T10:40:07Z" level=debug msg="processing pod 'my-namespace/my-deployment-test-6b656cf96f-s888b'"
September 20th 2018, 11:44:27.000 | time="2018-09-20T10:44:27Z" level=debug msg="processing host 'my-namespace/deploy-my-deployment-test'"
September 20th 2018, 11:44:27.000 | 2018/09/20 10:44:27 URL: https://icinga.myurl:5665/v1/objects/hosts/kubernetes.my-namespace.deploy-my-deployment-test
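The gap between the 'creating host' and 'processing host' entries can be measured directly from the filtered log. The following is a minimal sketch that assumes the time="..." level=... msg="..." layout of the lines quoted above; the helper names parse_line and delay_seconds are made up for illustration and are not part of kubernetes-icinga.

```python
import re
from datetime import datetime

# Matches the time="..." and msg="..." fields of the log lines quoted above.
LINE_RE = re.compile(r'time="(?P<time>[^"]+)".*?msg="(?P<msg>[^"]*)"')


def parse_line(line):
    """Return (timestamp, message) for a log line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("time"), "%Y-%m-%dT%H:%M:%SZ")
    return timestamp, match.group("msg")


def delay_seconds(lines, host):
    """Seconds between the first 'creating host' and first 'processing host' entry."""
    created = processed = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # e.g. the raw 'URL: ...' dump lines
        timestamp, msg = parsed
        if created is None and msg == f"creating host '{host}'":
            created = timestamp
        elif processed is None and msg == f"processing host '{host}'":
            processed = timestamp
    if created is None or processed is None:
        return None
    return (processed - created).total_seconds()


# The first and the second-to-last entry of the log above.
sample = [
    'time="2018-09-20T10:37:55Z" level=info msg="creating host '
    "'my-namespace/deploy-my-deployment-test'\"",
    'time="2018-09-20T10:44:27Z" level=debug msg="processing host '
    "'my-namespace/deploy-my-deployment-test'\"",
]
print(delay_seconds(sample, "my-namespace/deploy-my-deployment-test"))  # prints 392.0
```

For this log the result is 392 seconds, i.e. 6 minutes 32 seconds, which matches the "approximately 7 minutes" observed.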
I still had one host not appearing in Icinga even though the call had gone through the API ok.
The config was present here: /var/lib/icinga2/api/packages/_api/491bb0f8-f10c-436a-abac-30d571e0bf58/conf.d/hosts/kubernetes.my-namespace.deploy-my-deployment-test.conf
I restarted the main Icinga service and it picked it up fine. Very strange.
There also seems to be an incompatibility with Icinga Director. As soon as you deploy a config change there, all objects created from kubernetes-icinga vanish. They still exist in the api folder. If you clear "/var/lib/icinga2/api/packages/_api/conf.d" the deployment will recreate them and they also appear again in the UI until the next Director deployment.
Not sure how I can help with all internal Icinga weirdness here. kubernetes-icinga can only use what the Icinga API considers the truth. #4 might be a solution, I don't know if/when I'll get to that though...
@geo-geek creating the CR host object should immediately create the Icinga host (or try to). That should have happened immediately at the first log line:
September 20th 2018, 11:37:55.000 | time="2018-09-20T10:37:55Z" level=info msg="creating host 'my-namespace/deploy-my-deployment-test'"
Did you get any Icinga API traffic for that? It should at least do something like check if the host already exists in Icinga.
The 7 minute delay you see is the periodic rescan (every 5 minutes) of all relevant resources (both workloads and the custom resources for Hostgroups and Hosts); your change was picked up then, as it seems to have failed the first time.
I agree about the log message not being clear about the difference between working with the custom resource and the Icinga object. I'll try and find a way to improve that. Right now something like "creating icinga something" is logged when talking to the Icinga API.
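The "check if the host already exists in Icinga" step mentioned above can be pictured roughly as follows. This is only a sketch, not kubernetes-icinga's actual code: host_url and host_exists are hypothetical helpers, the URL layout is taken from the debug output quoted in this thread, and treating 404 as "not found" follows the "No objects found." response that appears in a later log.

```python
from urllib.parse import quote


def host_url(base_url, host_name):
    """Build the Icinga 2 object URL for a host, in the form seen in the debug log."""
    return f"{base_url.rstrip('/')}/v1/objects/hosts/{quote(host_name, safe='')}"


def host_exists(status_code):
    """Interpret the HTTP status of a GET request on the host URL."""
    if status_code == 200:
        return True
    if status_code == 404:
        return False  # Icinga answered "No objects found."
    raise RuntimeError(f"unexpected status {status_code} from the Icinga API")


print(host_url("https://icinga.myurl:5665",
               "kubernetes.my-namespace.deploy-my-deployment-test"))
```

If no GET on that URL shows up in the log right after the 'creating host' line, the first attempt never reached the Icinga API at all.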
So I have a bit more information which might help. This time the icinga host was only created when I happened to update the pod.
The deployment was created at 11:08 (icinga host not created). I made a change to the pod and redeployed at 11:17 (icinga host created):
September 25th 2018, 11:08:43.000 time="2018-09-25T10:08:43Z" level=info msg="creating host 'my-namespace/deploy-my-deployment'"
September 25th 2018, 11:08:43.000 time="2018-09-25T10:08:43Z" level=debug msg="processing deployment 'my-namespace/my-deployment'"
September 25th 2018, 11:08:43.000 time="2018-09-25T10:08:43Z" level=debug msg="processing replicaset 'my-namespace/my-deployment-78cf57485f'"
September 25th 2018, 11:08:43.000 time="2018-09-25T10:08:43Z" level=debug msg="processing pod 'my-namespace/my-deployment-78cf57485f-gslsp'"
September 25th 2018, 11:08:43.000 time="2018-09-25T10:08:43Z" level=debug msg="processing deployment 'my-namespace/my-deployment'"
September 25th 2018, 11:08:43.000 time="2018-09-25T10:08:43Z" level=debug msg="processing pod 'my-namespace/my-deployment-78cf57485f-gslsp'"
September 25th 2018, 11:08:43.000 time="2018-09-25T10:08:43Z" level=debug msg="processing replicaset 'my-namespace/my-deployment-78cf57485f'"
September 25th 2018, 11:08:43.000 time="2018-09-25T10:08:43Z" level=debug msg="processing deployment 'my-namespace/my-deployment'"
September 25th 2018, 11:08:43.000 time="2018-09-25T10:08:43Z" level=debug msg="processing replicaset 'my-namespace/my-deployment-78cf57485f'"
September 25th 2018, 11:08:43.000 time="2018-09-25T10:08:43Z" level=debug msg="processing pod 'my-namespace/my-deployment-78cf57485f-gslsp'"
September 25th 2018, 11:08:43.000 time="2018-09-25T10:08:43Z" level=debug msg="processing deployment 'my-namespace/my-deployment'"
September 25th 2018, 11:08:49.000 time="2018-09-25T10:08:49Z" level=debug msg="processing pod 'my-namespace/my-deployment-78cf57485f-gslsp'"
September 25th 2018, 11:08:49.000 time="2018-09-25T10:08:49Z" level=debug msg="processing replicaset 'my-namespace/my-deployment-78cf57485f'"
September 25th 2018, 11:08:49.000 time="2018-09-25T10:08:49Z" level=debug msg="processing deployment 'my-namespace/my-deployment'"
September 25th 2018, 11:11:21.000 time="2018-09-25T10:11:21Z" level=debug msg="processing deployment 'my-namespace/my-deployment'"
September 25th 2018, 11:11:31.000 time="2018-09-25T10:11:31Z" level=debug msg="processing pod 'my-namespace/my-deployment-78cf57485f-gslsp'"
September 25th 2018, 11:11:35.000 time="2018-09-25T10:11:35Z" level=debug msg="processing replicaset 'my-namespace/my-deployment-78cf57485f'"
September 25th 2018, 11:16:12.000 time="2018-09-25T10:16:12Z" level=debug msg="processing replicaset 'my-namespace/my-deployment-78cf57485f'"
September 25th 2018, 11:16:26.000 time="2018-09-25T10:16:26Z" level=debug msg="processing deployment 'my-namespace/my-deployment'"
September 25th 2018, 11:16:26.000 time="2018-09-25T10:16:26Z" level=debug msg="processing pod 'my-namespace/my-deployment-78cf57485f-gslsp'"
September 25th 2018, 11:16:53.000 time="2018-09-25T10:16:53Z" level=debug msg="processing host 'my-namespace/deploy-my-deployment'"
September 25th 2018, 11:17:08.000 time="2018-09-25T10:17:08Z" level=info msg="creating icinga host 'kubernetes.my-namespace.deploy-my-deployment'"
I've started looking at the code and it looks like the 'HostCreatedOrUpdated' function is not called, as we don't see a log entry for 'processing host' until I re-deployed. No error messages are being reported at debug level. I'm guessing the error handling would write something to the log on error?
What's also interesting and might be causing the delay is that 'updating icinga host' is being continuously called for each registered host. In total I'm seeing about 200 'updating icinga host' an hour appearing in the logs. Even though I'm not making any changes to these hosts. Is that normal behaviour?
I have found and fixed a bug where a deleted workload would cause the "Host" custom resource to be recreated immediately. I also added a regular process where such objects are cleaned up if any remain, and I have reduced the resync time to 60 seconds (down from 300) so the current state gets "repaired" sooner. Not sure that explains or fixes the weird behaviour you describe, but please try again with the latest version. If "your" bug does not go away, I hope it's easier to see. The 'updating icinga host' message should not appear all the time. It means that the Icinga Host as seen by the API is not what kubernetes-icinga expects, and it tries to fix it. If you still get those messages, could you enable Icinga logging and paste the log from immediately before the update message (GET icinga/.../hosts/HOST) and after it, where the object is POSTed back to Icinga?
I've updated again and still seeing the same 10 minute delay in adding a host. I've filtered the logfile based on 'mydeployment' name
September 27th 2018, 10:57:15.000 time="2018-09-27T09:57:15Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 10:57:16.000 time="2018-09-27T09:57:16Z" level=info msg="creating host cr 'mynamespace/deploy-mydeployment'"
September 27th 2018, 10:57:19.000 time="2018-09-27T09:57:19Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 10:57:23.000 time="2018-09-27T09:57:23Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 10:58:09.000 time="2018-09-27T09:58:09Z" level=debug msg="[crhousekeeping] checking host 'mynamespace/deploy-mydeployment'"
September 27th 2018, 10:58:15.000 time="2018-09-27T09:58:15Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 10:58:19.000 time="2018-09-27T09:58:19Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 10:58:24.000 time="2018-09-27T09:58:24Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 10:59:04.000 time="2018-09-27T09:59:04Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 10:59:13.000 time="2018-09-27T09:59:13Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 10:59:19.000 time="2018-09-27T09:59:19Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 10:59:35.000 time="2018-09-27T09:59:35Z" level=debug msg="[crhousekeeping] checking host 'mynamespace/deploy-mydeployment'"
September 27th 2018, 11:00:08.000 time="2018-09-27T10:00:08Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 11:00:12.000 time="2018-09-27T10:00:12Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 11:00:12.000 time="2018-09-27T10:00:12Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 11:00:45.000 time="2018-09-27T10:00:45Z" level=debug msg="[crhousekeeping] checking host 'mynamespace/deploy-mydeployment'"
September 27th 2018, 11:01:12.000 time="2018-09-27T10:01:12Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 11:01:14.000 time="2018-09-27T10:01:14Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 11:01:16.000 time="2018-09-27T10:01:16Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 11:01:54.000 time="2018-09-27T10:01:54Z" level=debug msg="[crhousekeeping] checking host 'mynamespace/deploy-mydeployment'"
September 27th 2018, 11:02:03.000 time="2018-09-27T10:02:03Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 11:02:06.000 time="2018-09-27T10:02:06Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 11:02:18.000 time="2018-09-27T10:02:18Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 11:03:05.000 time="2018-09-27T10:03:05Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 11:03:10.000 time="2018-09-27T10:03:10Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 11:03:12.000 time="2018-09-27T10:03:12Z" level=debug msg="[crhousekeeping] checking host 'mynamespace/deploy-mydeployment'"
September 27th 2018, 11:03:21.000 time="2018-09-27T10:03:21Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 11:04:02.000 time="2018-09-27T10:04:02Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 11:04:07.000 time="2018-09-27T10:04:07Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 11:04:07.000 time="2018-09-27T10:04:07Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 11:04:35.000 time="2018-09-27T10:04:35Z" level=debug msg="[crhousekeeping] checking host 'mynamespace/deploy-mydeployment'"
September 27th 2018, 11:05:04.000 time="2018-09-27T10:05:04Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 11:05:09.000 time="2018-09-27T10:05:09Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 11:05:12.000 time="2018-09-27T10:05:12Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 11:05:45.000 time="2018-09-27T10:05:45Z" level=debug msg="[crhousekeeping] checking host 'mynamespace/deploy-mydeployment'"
September 27th 2018, 11:06:09.000 time="2018-09-27T10:06:09Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 11:06:12.000 time="2018-09-27T10:06:12Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 11:06:16.000 time="2018-09-27T10:06:16Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 11:06:54.000 time="2018-09-27T10:06:54Z" level=debug msg="[crhousekeeping] checking host 'mynamespace/deploy-mydeployment'"
September 27th 2018, 11:07:04.000 time="2018-09-27T10:07:04Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 11:07:12.000 time="2018-09-27T10:07:12Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 11:07:18.000 time="2018-09-27T10:07:18Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 11:08:04.000 2018/09/27 10:08:04 URL: https://icinga.myhost.com:5665/v1/objects/hosts/kubernetes.mynamespace.deploy-mydeployment
September 27th 2018, 11:08:04.000 time="2018-09-27T10:08:04Z" level=debug msg="processing host 'mynamespace/deploy-mydeployment'"
September 27th 2018, 11:08:12.000 time="2018-09-27T10:08:12Z" level=debug msg="[crhousekeeping] checking host 'mynamespace/deploy-mydeployment'"
September 27th 2018, 11:08:12.000 time="2018-09-27T10:08:12Z" level=debug msg="processing replicaset 'mynamespace/mydeployment-7fdfb4497'"
September 27th 2018, 11:08:14.000 time="2018-09-27T10:08:14Z" level=debug msg="processing deployment 'mynamespace/mydeployment'"
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 URL: https://icinga.myhost.com:5665/v1/objects/hosts/kubernetes.mynamespace.deploy-mydeployment
September 27th 2018, 11:08:19.000 "kubernetes_owner": "mynamespace/deploy-mydeployment",
September 27th 2018, 11:08:19.000 time="2018-09-27T10:08:19Z" level=info msg="creating icinga host 'kubernetes.mynamespace.deploy-mydeployment'"
September 27th 2018, 11:08:19.000 "display_name": "kubernetes.mynamespace.deploy-mydeployment",
September 27th 2018, 11:08:19.000 "kubernetes_name": "mydeployment",
September 27th 2018, 11:08:21.000 time="2018-09-27T10:08:21Z" level=debug msg="processing pod 'mynamespace/mydeployment-7fdfb4497-p9hvk'"
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 Status: 404
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 Header:
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 {
September 27th 2018, 11:08:19.000 ],
September 27th 2018, 11:08:19.000 "status": "No objects found."
September 27th 2018, 11:08:19.000 }
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 REQUEST
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 URL: https://icinga.myhost.com:5665/v1/objects/hosts/kubernetes.mynamespace.deploy-mydeployment
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 Header: map[Content-Type:[application/json] Accept:[application/json] Authorization:[Basic dfsgdfgds]]
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 Form: map[]
September 27th 2018, 11:08:19.000 "notes_url": "",
September 27th 2018, 11:08:19.000 "kubernetes_owner": "mynamespace/deploy-mydeployment",
September 27th 2018, 11:08:19.000 "groups": [
September 27th 2018, 11:08:19.000 ]
September 27th 2018, 11:08:19.000 }
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 --------------------------------------------------------------------------------
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 Status: 200
September 27th 2018, 11:08:19.000 "Content-Type": [
September 27th 2018, 11:08:19.000 "Server": [
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 --------------------------------------------------------------------------------
September 27th 2018, 11:08:19.000 }
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 --------------------------------------------------------------------------------
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 --------------------------------------------------------------------------------
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 RESPONSE
September 27th 2018, 11:08:19.000 2018/09/27 10:08:19 {
September 27th 2018, 11:08:19.000 ],
September 27th 2018, 11:08:19.000 ]
With regards to the repeating updating hosts here are two examples and the associated full logs:
kubernetes.my-namespace.deploy-mydeployment
time="2018-09-27T09:37:24Z" level=debug msg="processing node 'k8s-01'"
2018/09/27 09:37:27 --------------------------------------------------------------------------------
2018/09/27 09:37:27 RESPONSE
2018/09/27 09:37:27 --------------------------------------------------------------------------------
2018/09/27 09:37:27 Status: 200
2018/09/27 09:37:27 Header:
2018/09/27 09:37:27 {
"Content-Type": [
"application/json"
],
"Server": [
"Icinga/r2.9.1-1"
]
}
2018/09/27 09:37:27 Body:
2018/09/27 09:37:27 {
"results": [
{
"attrs": {
"__name": "kubernetes.my-namespace.deploy-mydeployment",
"acknowledgement": 0.0,
"acknowledgement_expiry": 0.0,
"action_url": "",
"active": true,
"address": "",
"address6": "",
"check_attempt": 1.0,
"check_command": "check_kubernetes",
"check_interval": 60.0,
"check_period": "",
"check_timeout": null,
"command_endpoint": "",
"display_name": "kubernetes.my-namespace.deploy-mydeployment",
"downtime_depth": 0.0,
"enable_active_checks": true,
"enable_event_handler": true,
"enable_flapping": false,
"enable_notifications": true,
"enable_passive_checks": true,
"enable_perfdata": true,
"event_command": "",
"flapping": false,
"flapping_current": 0.0,
"flapping_last_change": 0.0,
"flapping_threshold": 0.0,
"flapping_threshold_high": 30.0,
"flapping_threshold_low": 25.0,
"force_next_check": false,
"force_next_notification": false,
"groups": [
"kubernetes.my-namespace",
"email-notify"
],
"ha_mode": 0.0,
"icon_image": "",
"icon_image_alt": "",
"last_check": 1538041017.9721460342,
"last_check_result": {
"active": true,
"check_source": "icinga.myhost.com",
"command": [
"/usr/lib/nagios/plugins/check_kubernetes",
"-kubeconfig",
"/usr/lib/nagios/plugins/config",
"-name",
"mydeployment",
"-namespace",
"my-namespace",
"-type",
"deployment"
],
"execution_end": 1538041017.9720079899,
"execution_start": 1538041017.8403611183,
"exit_status": 0.0,
"output": "OK: 1/1 replicas available, 1/1 replicas up to spec and running",
"performance_data": [],
"schedule_end": 1538041017.9721460342,
"schedule_start": 1538041017.8399999142,
"state": 0.0,
"ttl": 0.0,
"type": "CheckResult",
"vars_after": {
"attempt": 1.0,
"reachable": true,
"state": 0.0,
"state_type": 1.0
},
"vars_before": {
"attempt": 1.0,
"reachable": true,
"state": 0.0,
"state_type": 1.0
}
},
"last_hard_state": 0.0,
"last_hard_state_change": 1538039806.5801830292,
"last_reachable": true,
"last_state": 0.0,
"last_state_change": 1538039806.5801830292,
"last_state_down": 0.0,
"last_state_type": 1.0,
"last_state_unreachable": 0.0,
"last_state_up": 1538041017.9721870422,
"max_check_attempts": 3.0,
"name": "kubernetes.my-namespace.deploy-mydeployment",
"next_check": 1538041077.8399999142,
"notes": "",
"notes_url": "",
"original_attributes": null,
"package": "_api",
"paused": false,
"retry_interval": 30.0,
"severity": 8.0,
"source_location": {
"first_column": 0.0,
"first_line": 1.0,
"last_column": 65.0,
"last_line": 1.0,
"path": "/var/lib/icinga2/api/packages/_api/491bb0f8-f00c-436a-abac-30d571e0bf58/conf.d/hosts/kubernetes.my-namespace.deploy-mydeployment.conf"
},
"state": 0.0,
"state_type": 1.0,
"templates": [
"kubernetes.my-namespace.deploy-mydeployment",
"generic-host"
],
"type": "Host",
"vars": {
"kubernetes_cluster": "kubernetes",
"kubernetes_name": "mydeployment",
"kubernetes_namespace": "my-namespace",
"kubernetes_owner": "my-namespace/deploy-mydeployment",
"kubernetes_type": "deployment"
},
"version": 1538039805.0731899738,
"volatile": false,
"zone": "master"
},
"joins": {},
"meta": {},
"name": "kubernetes.my-namespace.deploy-mydeployment",
"type": "Host"
}
]
}
time="2018-09-27T09:37:27Z" level=info msg="updating icinga host 'kubernetes.my-namespace.deploy-mydeployment'"
2018/09/27 09:37:27 --------------------------------------------------------------------------------
2018/09/27 09:37:27 REQUEST
2018/09/27 09:37:27 --------------------------------------------------------------------------------
2018/09/27 09:37:27 Method: POST
2018/09/27 09:37:27 URL: https://icinga.myhost.com:5665/v1/objects/hosts/kubernetes.my-namespace.deploy-mydeployment
2018/09/27 09:37:27 Header: map[Content-Type:[application/json] Accept:[application/json] Authorization:[Basic jkdsjghkdfhgjkdfs]]
2018/09/27 09:37:27 Form: map[]
2018/09/27 09:37:27 Payload:
2018/09/27 09:37:27 {
"templates": null,
"attrs": {
"display_name": "kubernetes.my-namespace.deploy-mydeployment",
"check_command": "check_kubernetes",
"notes": "",
"notes_url": "",
"vars": {
"kubernetes_cluster": "kubernetes",
"kubernetes_name": "mydeployment",
"kubernetes_namespace": "my-namespace",
"kubernetes_owner": "my-namespace/deploy-mydeployment",
"kubernetes_type": "deployment"
}
}
}
2018/09/27 09:37:27 --------------------------------------------------------------------------------
2018/09/27 09:37:27 RESPONSE
2018/09/27 09:37:27 --------------------------------------------------------------------------------
2018/09/27 09:37:27 Status: 200
2018/09/27 09:37:27 Header:
2018/09/27 09:37:27 {
"Content-Type": [
"application/json"
],
"Server": [
"Icinga/r2.9.1-1"
]
}
2018/09/27 09:37:27 Body:
2018/09/27 09:37:27 {
"results": [
{
"code": 200.0,
"name": "kubernetes.my-namespace.deploy-mydeployment",
"status": "Attributes updated.",
"type": "Host"
}
]
}
and kubernetes.nodes.k8s-03
time="2018-09-27T09:47:23Z" level=debug msg="processing node 'k8s-02'"
time="2018-09-27T09:47:23Z" level=debug msg="processing node 'k8s-03'"
time="2018-09-27T09:47:26Z" level=debug msg="processing node 'k8s-01'"
2018/09/27 09:47:29 --------------------------------------------------------------------------------
2018/09/27 09:47:29 RESPONSE
2018/09/27 09:47:29 --------------------------------------------------------------------------------
2018/09/27 09:47:29 Status: 200
2018/09/27 09:47:29 Header:
2018/09/27 09:47:29 {
"Content-Type": [
"application/json"
],
"Server": [
"Icinga/r2.9.1-1"
]
}
2018/09/27 09:47:29 Body:
2018/09/27 09:47:29 {
"results": [
{
"attrs": {
"__name": "kubernetes.nodes.k8s-03",
"acknowledgement": 0.0,
"acknowledgement_expiry": 0.0,
"action_url": "",
"active": true,
"address": "",
"address6": "",
"check_attempt": 1.0,
"check_command": "check_kubernetes",
"check_interval": 60.0,
"check_period": "",
"check_timeout": null,
"command_endpoint": "",
"display_name": "kubernetes.nodes.k8s-03",
"downtime_depth": 0.0,
"enable_active_checks": true,
"enable_event_handler": true,
"enable_flapping": false,
"enable_notifications": true,
"enable_passive_checks": true,
"enable_perfdata": true,
"event_command": "",
"flapping": false,
"flapping_current": 0.0,
"flapping_last_change": 0.0,
"flapping_threshold": 0.0,
"flapping_threshold_high": 30.0,
"flapping_threshold_low": 25.0,
"force_next_check": false,
"force_next_notification": false,
"groups": [
"kubernetes.nodes",
"email-notify"
],
"ha_mode": 0.0,
"icon_image": "",
"icon_image_alt": "",
"last_check": 1538041622.9742019176,
"last_check_result": {
"active": true,
"check_source": "icinga.myhost.com",
"command": [
"/usr/lib/nagios/plugins/check_kubernetes",
"-kubeconfig",
"/usr/lib/nagios/plugins/config",
"-name",
"k8s-03",
"-namespace",
"",
"-type",
"node"
],
"execution_end": 1538041622.9740879536,
"execution_start": 1538041622.9004039764,
"exit_status": 0.0,
"output": "OK: node ready",
"performance_data": [],
"schedule_end": 1538041622.9742019176,
"schedule_start": 1538041622.9000000954,
"state": 0.0,
"ttl": 0.0,
"type": "CheckResult",
"vars_after": {
"attempt": 1.0,
"reachable": true,
"state": 0.0,
"state_type": 1.0
},
"vars_before": {
"attempt": 1.0,
"reachable": true,
"state": 0.0,
"state_type": 1.0
}
},
"last_hard_state": 0.0,
"last_hard_state_change": 1537966654.1813340187,
"last_reachable": true,
"last_state": 0.0,
"last_state_change": 1537966654.1813340187,
"last_state_down": 1537272574.5687580109,
"last_state_type": 1.0,
"last_state_unreachable": 0.0,
"last_state_up": 1538041622.9742360115,
"max_check_attempts": 3.0,
"name": "kubernetes.nodes.k8s-03",
"next_check": 1538041682.8999998569,
"notes": "",
"notes_url": "",
"original_attributes": {
"check_command": "check_kubernetes",
"display_name": "kubernetes.nodes.k8s-03",
"notes": "",
"notes_url": "",
"vars": {
"kubernetes_cluster": "kubernetes",
"kubernetes_name": "k8s-03",
"kubernetes_namespace": "",
"kubernetes_owner": "kube-system/k8s-03",
"kubernetes_type": "node"
}
},
"package": "_api",
"paused": false,
"retry_interval": 30.0,
"severity": 8.0,
"source_location": {
"first_column": 0.0,
"first_line": 1.0,
"last_column": 36.0,
"last_line": 1.0,
"path": "/var/lib/icinga2/api/packages/_api/491bb0f8-f00c-436a-abac-30d571e0bf58/conf.d/hosts/kubernetes.nodes.k8s-03.conf"
},
"state": 0.0,
"state_type": 1.0,
"templates": [
"kubernetes.nodes.k8s-03",
"generic-host"
],
"type": "Host",
"vars": {
"kubernetes_cluster": "kubernetes",
"kubernetes_name": "k8s-03",
"kubernetes_namespace": "",
"kubernetes_owner": "kube-system/k8s-03",
"kubernetes_type": "node"
},
"version": 1538040987.1073510647,
"volatile": false,
"zone": "master"
},
"joins": {},
"meta": {},
"name": "kubernetes.nodes.k8s-03",
"type": "Host"
}
]
}
time="2018-09-27T09:47:29Z" level=info msg="updating icinga host 'kubernetes.nodes.k8s-03'"
2018/09/27 09:47:29 --------------------------------------------------------------------------------
2018/09/27 09:47:29 REQUEST
2018/09/27 09:47:29 --------------------------------------------------------------------------------
2018/09/27 09:47:29 Method: POST
2018/09/27 09:47:29 URL: https://icinga.myhost.com:5665/v1/objects/hosts/kubernetes.nodes.k8s-03
2018/09/27 09:47:29 Header: map[Accept:[application/json] Authorization:[Basic dfgdsfg] Content-Type:[application/json]]
2018/09/27 09:47:29 Form: map[]
2018/09/27 09:47:29 Payload:
2018/09/27 09:47:29 {
"templates": null,
"attrs": {
"display_name": "kubernetes.nodes.k8s-03",
"check_command": "check_kubernetes",
"notes": "",
"notes_url": "",
"vars": {
"kubernetes_cluster": "kubernetes",
"kubernetes_name": "k8s-03",
"kubernetes_namespace": "",
"kubernetes_owner": "kube-system/k8s-03",
"kubernetes_type": "node"
}
}
}
2018/09/27 09:47:29 --------------------------------------------------------------------------------
2018/09/27 09:47:29 RESPONSE
2018/09/27 09:47:29 --------------------------------------------------------------------------------
2018/09/27 09:47:29 Status: 200
2018/09/27 09:47:29 Header:
2018/09/27 09:47:29 {
"Content-Type": [
"application/json"
],
"Server": [
"Icinga/r2.9.1-1"
]
}
2018/09/27 09:47:29 Body:
2018/09/27 09:47:29 {
"results": [
{
"code": 200.0,
"name": "kubernetes.nodes.k8s-03",
"status": "Attributes updated.",
"type": "Host"
}
]
}
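One way to read the two dumps above is to compare each POST payload with the attrs Icinga returned just before it. The sketch below does that for kubernetes.nodes.k8s-03, with the values copied from the log; differing_keys is a hypothetical helper, not a function from kubernetes-icinga.

```python
def differing_keys(expected, actual):
    """Return the keys of `expected` whose values differ from those in `actual`."""
    return sorted(key for key, value in expected.items() if actual.get(key) != value)


node_vars = {
    "kubernetes_cluster": "kubernetes",
    "kubernetes_name": "k8s-03",
    "kubernetes_namespace": "",
    "kubernetes_owner": "kube-system/k8s-03",
    "kubernetes_type": "node",
}

# "attrs" from the POST payload in the log above.
payload_attrs = {
    "display_name": "kubernetes.nodes.k8s-03",
    "check_command": "check_kubernetes",
    "notes": "",
    "notes_url": "",
    "vars": node_vars,
}

# The relevant subset of the attrs returned by the preceding GET.
icinga_attrs = {
    "display_name": "kubernetes.nodes.k8s-03",
    "check_command": "check_kubernetes",
    "notes": "",
    "notes_url": "",
    "vars": dict(node_vars),
    "groups": ["kubernetes.nodes", "email-notify"],
}

print(differing_keys(payload_attrs, icinga_attrs))  # prints []
```

Every attribute that is POSTed already matches what Icinga holds, so the mismatch that triggers the update has to be in a field that is not part of the payload, such as groups.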
Ah, I think I know what the problem is. Did you manually add your hosts to the "email-notify" group in Icinga?
I have a hostgroup assign rule setup to include all hosts into the 'email-notify' hostgroup. So I'm guessing they get assigned automatically as soon as they are added via the api.
I only did this for testing something, so I could remove the rule.
After removing the hostgroup assign rule, it is now adding the hosts instantly.
It may be useful to have the functionality but I can live without it for now.
Great project. Thanks for your help
Great. ;-) In the meantime, I have added a little fix that ignores those additional groups. We cannot change those without removing and recreating the host itself, which is not going to happen, so your workflow with the additional group should be possible now.
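The difference between the old and the new behaviour can be sketched like this. It is an illustration of the idea only, not the actual fix: both function names are hypothetical, and the expected group list is an assumption (the thread only shows the groups Icinga reported).

```python
def groups_differ_strict(expected_groups, actual_groups):
    """Old behaviour (assumed): any difference in the group list counts as drift."""
    return sorted(expected_groups) != sorted(actual_groups)


def groups_differ_ignoring_extra(expected_groups, actual_groups):
    """New behaviour (assumed): only a missing expected group counts as drift."""
    return not set(expected_groups).issubset(actual_groups)


# Assumption: kubernetes-icinga expects only the namespace hostgroup.
expected = ["kubernetes.my-namespace"]
# Groups reported by Icinga in the log, including the one added by the assign rule.
actual = ["kubernetes.my-namespace", "email-notify"]

print(groups_differ_strict(expected, actual))          # prints True
print(groups_differ_ignoring_extra(expected, actual))  # prints False
```

With the strict comparison every resync sees a difference and logs 'updating icinga host' again; with the tolerant one a group added by an assign rule is left alone.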
Hi there,
Really interesting project but just hit this issue:
When you remove a k8s deployment that is being monitored, the deployment (host) gets removed from icinga. However, recreating the deployment with the same name doesn't seem to recreate the deployment (host) in icinga.
Creating new deployments (hosts) with a different name works fine.
Digging through all the aggregated log files of all the containers I've come up with a few interesting entries (RBAC DENY, http 403, 404) but not sure if these are correlated:
Many thanks, Wayne