Closed. elondaits closed this 6 years ago
I really need to know where this data comes from (which query is used to get it).
There are two checks/services that provide APP_ID: dockapp_top1 and dockapp_heartbeat.
Where do state and state_type come from?
What I copied above is the merge of two queries. First I do
.get('hosts')
.columns(['name', 'state', 'state_type'])
.asColumns(['id', 'state', 'state_type'])
and then
.get('services')
.columns(['host_name', 'state', 'state_type', 'plugin_output'])
.asColumns(['id', 'app_state', 'app_state_type', 'app_id'])
.filter('description = dockapp_top1')
so state and state_type come from the HOSTS query. The syntax above is that of my wrapper functions, but it maps directly to the CheckMK query syntax: get('hosts') gets translated to GET hosts, etc.
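For readers unfamiliar with the query syntax, here is a minimal Python sketch of what the two queries and the merge could look like. It is not the wrapper used above: build_query, rename and merge_by_id are hypothetical helper names, the sample rows (including the app_id value) are invented for illustration, and the sketch only builds the query text without sending it to a Livestatus socket.

```python
# Hypothetical helpers, not the actual wrapper discussed in this thread.

def build_query(table, columns, filters=()):
    """Build the text of a Livestatus query (GET / Columns / Filter headers)."""
    lines = [f"GET {table}", "Columns: " + " ".join(columns)]
    lines += [f"Filter: {f}" for f in filters]
    return "\n".join(lines) + "\n"

def rename(rows, aliases):
    """Turn positional result rows into dicts keyed by the aliased column names."""
    return [dict(zip(aliases, row)) for row in rows]

def merge_by_id(*row_sets):
    """Merge several row sets into one record per 'id'."""
    merged = {}
    for rows in row_sets:
        for row in rows:
            merged.setdefault(row["id"], {}).update(row)
    return merged

hosts_q = build_query("hosts", ["name", "state", "state_type"])
services_q = build_query(
    "services",
    ["host_name", "state", "state_type", "plugin_output"],
    filters=["description = dockapp_top1"],
)

# Invented sample results; real rows would come back from the monitoring site.
host_rows = rename([["bigfoot80", 0, 1]], ["id", "state", "state_type"])
service_rows = rename(
    [["bigfoot80", 0, 1, "some-app"]],
    ["id", "app_state", "app_state_type", "app_id"],
)

merged = merge_by_id(host_rows, service_rows)
print(hosts_q)
print(services_q)
print(merged)
```

In the merged record, state and state_type are the host's values, while app_state and app_state_type are the renamed values of the dockapp_top1 service.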
@elondaits Thanks. I will try to recover the monitoring history. Please let me know ASAP if you see this effect again!
I won't test on the HITS server again until I have something new to test, so I probably won't see it soon. I assume that if you try it yourself you'll see the same thing, since I didn't do anything special: I just stopped the server with the dashboard and waited.
OK, the problem is that Intel AMT was configured to answer PINGs on those hosts, but it sometimes missed a few. OMD was therefore able to PING them, but with some PINGs lost. Apparently OMD/CheckMK determines host state by whether the host can be PINGed or not, so the hosts kept switching between the UP and DOWN states...
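As a toy illustration of why a few lost PINGs show up as repeated UP/DOWN changes, here is a small Python sketch. It assumes, as a deliberate simplification, that the host state simply follows the most recent ping result; the real check logic in OMD/CheckMK (retries, the soft/hard state_type) is not modelled, and the ping sequence is invented.

```python
def host_states(ping_ok):
    """Map each ping result to a host state (simplified: latest ping decides)."""
    return ["UP" if ok else "DOWN" for ok in ping_ok]

def count_flaps(states):
    """Count how many times the state changes between consecutive checks."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)

# Invented sequence: the host mostly answers, but two pings are lost.
pings = [True, True, False, True, False, True, True]
states = host_states(pings)
flaps = count_flaps(states)

print(states)
print(flaps)
```

Under this simplification each lost ping produces two state changes (UP to DOWN and back), which matches the intermittent "station down" / "station up" notifications described in this thread.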
Easy solution: disable AMT pings via http://HOST_NAME_OR_IP:16992/ip.htm (the user is usually admin, and a pre-configured password is required).
NOTE that the URL uses http!
PS: https://www.symantec.com/connect/articles/who-responding-ping-intel-amt-or-os
After stopping bigfoot80 I received "station down" and "station up" via CheckMK intermittently... it would change roughly every minute.
The state would change from:
to: