kuhn-ruess / Checkmk-Checks

Checks and Stuff for Check_MK
MIT License

get_cluster_status off by one error #68

Closed archer31 closed 1 month ago

archer31 commented 1 year ago

The get_cluster_status function checks that the services are running and not dead. However, the check appears to require at least two PIDs for each service rather than at least one. Is there a reason for requiring two instead of one? The plugin is reporting failures when there are none.
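To illustrate the suspected off-by-one, here is a minimal sketch. It is not the plugin's actual code: the function name, the service_pids mapping, and the min_pids parameter are all hypothetical, chosen only to show how a threshold of two instead of one would flag healthy single-process services.

```python
def find_failed_services(service_pids, min_pids=1):
    """Return the names of services that have fewer than min_pids PIDs.

    service_pids is a hypothetical mapping of service name -> list of PIDs;
    the real plugin may structure its data differently.
    """
    return sorted(
        name for name, pids in service_pids.items() if len(pids) < min_pids
    )


# Hypothetical sample data: two services with one process each, one with none.
example = {"nexus": [4211], "logwatcher": [4302], "yoda": []}

# Requiring at least one PID flags only the service with no process.
print(find_failed_services(example))              # ['yoda']

# Requiring at least two PIDs also flags the healthy single-process services.
print(find_failed_services(example, min_pids=2))  # ['logwatcher', 'nexus', 'yoda']
```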

Bastian-Kuhn commented 1 year ago

Hello, which check are you talking about?

archer31 commented 1 year ago

This is for the Node Status check. My systems report these four services as failed: logwatcher, nexus, statscollector, tricorder. But when I check on the actual system, systemctl reports that the services are running.

Bastian-Kuhn commented 1 year ago

Hello,

There is no check in this repo with "node" in its name. I would need the plugin name.

archer31 commented 1 year ago

I am talking about this line here

Bastian-Kuhn commented 1 year ago

Thank you, it is fixed. Could you please test with the newest version?

archer31 commented 1 year ago

It looks like that does not fix the issue. I am getting a parse error, "Parsing of section cohesity_node_status failed" (WARN). Here is the output of that section. The problem may be that the failed line does not list any services.

<<<cohesity_node_status>>>
host4 failed 
host4 ok alerts,apollo,athena,atom,bifrost_broker,bridge,bridge_proxy,eagle_agent,gandalf,groot,iris,iris_proxy,keychain,librarian,logwatcher,magneto,newscribe,nexus,nexus_proxy,patch,rtclient,smb2_proxy,smb_proxy,stats,statscollector,storage_proxy,tricorder,vault_proxy,yoda
host3 failed 
host3 ok alerts,apollo,athena,atom,bifrost_broker,bridge,bridge_proxy,eagle_agent,gandalf,groot,iris,iris_proxy,keychain,librarian,logwatcher,magneto,newscribe,nexus,nexus_proxy,patch,rtclient,smb2_proxy,smb_proxy,stats,statscollector,storage_proxy,tricorder,vault_proxy,yoda
host2 failed 
host2 ok alerts,apollo,athena,atom,bifrost_broker,bridge,bridge_proxy,eagle_agent,gandalf,groot,iris,iris_proxy,keychain,librarian,logwatcher,magneto,newscribe,nexus,nexus_proxy,patch,rtclient,smb2_proxy,smb_proxy,stats,statscollector,storage_proxy,tricorder,vault_proxy,yoda
host1 failed 
host1 ok alerts,apollo,athena,atom,bifrost_broker,bridge,bridge_proxy,eagle_agent,gandalf,groot,iris,iris_proxy,keychain,librarian,logwatcher,magneto,newscribe,nexus,nexus_proxy,patch,rtclient,smb2_proxy,smb_proxy,stats,statscollector,storage_proxy,tricorder,vault_proxy,yoda

This could probably be fixed just by adding some conditions to these two lines here
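As a rough sketch of the kind of condition I mean (this is not the plugin's code; the function name and the returned structure are hypothetical), a parse function could treat a missing third column as an empty service list. It assumes the section arrives as a list of whitespace-split lines, so a line like "host4 failed " becomes just two fields.

```python
def parse_node_status_sketch(string_table):
    """Parse lines of the form [node, state] or [node, state, "svc1,svc2,..."].

    Returns a hypothetical structure: {node: {state: [service, ...]}}.
    A line without a service column yields an empty list for that state.
    """
    parsed = {}
    for line in string_table:
        if len(line) < 2:
            # Too short to carry a node name and a state; skip it.
            continue
        node, state = line[0], line[1]
        # The service column is absent when no service is in this state.
        services = line[2].split(",") if len(line) > 2 and line[2] else []
        parsed.setdefault(node, {})[state] = services
    return parsed


# Shortened sample based on the agent output above.
sample = [
    ["host4", "failed"],
    ["host4", "ok", "alerts,apollo,athena"],
]
print(parse_node_status_sketch(sample))
# {'host4': {'failed': [], 'ok': ['alerts', 'apollo', 'athena']}}
```

Read this way, an empty list after failed would mean that no services have failed on that node, so the check could report OK for it and still alert whenever the list is non-empty.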

Bastian-Kuhn commented 11 months ago

But how should the check handle this failed state? Wouldn't adding conditions just ignore the error?

Bastian-Kuhn commented 1 month ago

Hello @archer31, that should now finally be fixed.