krkn-chaos / cerberus

Guardian of Kubernetes clusters. Tool to monitor clusters health and signal/alert on failures.
Apache License 2.0

Erroneous collection of "clusterversion" in case of openshift distribution #133

Open ratsuf opened 3 years ago

ratsuf commented 3 years ago

Good day! I tried to deploy Cerberus with the distribution config set to "openshift" and hit the following issue:

2021-05-05 14:24:45,635 [INFO] Starting ceberus
2021-05-05 14:24:45,680 [INFO] Initializing client to talk to the Kubernetes cluster
2021-05-05 14:24:46,831 [INFO] Fetching cluster info
error: the server doesn't have a resource type "clusterversion"
2021-05-05 14:24:49,757 [ERROR] Failed to run kubectl get clusterversion
               _                         
  ___ ___ _ __| |__   ___ _ __ _   _ ___ 
 / __/ _ \ '__| '_ \ / _ \ '__| | | / __|
| (_|  __/ |  | |_) |  __/ |  | |_| \__ \
 \___\___|_|  |_.__/ \___|_|   \__,_|___/

Traceback (most recent call last):
  File "start_cerberus.py", line 468, in <module>
    main(options.cfg)
  File "start_cerberus.py", line 106, in main
    cluster_version = runcommand.invoke("kubectl get clusterversion")
  File "/root/cerberus/cerberus/invoke/command.py", line 12, in invoke
    return output
UnboundLocalError: local variable 'output' referenced before assignment

Could you please clarify why start_cerberus.py uses kubectl to get a resource named clusterversion? https://github.com/cloud-bulldozer/cerberus/blob/eb449aae83f9b331d76c4413d0b6f8ab020e0ee7/start_cerberus.py#L107-L109

Unfortunately, I cannot find any details about the command kubectl get clusterversion and the resource clusterversion.

As a temporary solution, I just switched off this if statement.
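The traceback above suggests that `invoke` only assigns `output` when the command succeeds, then returns it unconditionally. The actual contents of command.py are not shown in this thread, so the following is only a minimal sketch of that pattern with the name bound up front; the `subprocess` usage and the choice to return `None` on failure are assumptions, not the project's code.

```python
import logging
import subprocess


def invoke(command):
    """Run a shell command and return its output, or None if it failed.

    Illustrative sketch only: the real cerberus/invoke/command.py may differ.
    """
    output = None  # bound before the try, so the return cannot raise UnboundLocalError
    try:
        output = subprocess.check_output(
            command,
            shell=True,
            universal_newlines=True,
            stderr=subprocess.STDOUT,
        )
    except subprocess.CalledProcessError as err:
        # Log the failure and fall through; callers must handle None.
        logging.error("Failed to run %s: %s", command, err.output.strip())
    return output
```

With this shape, a failing `kubectl get clusterversion` would log an error and hand `None` back to the caller instead of crashing inside `invoke`.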

chaitanyaenr commented 3 years ago

Hi @ratsuf, thanks for reporting the issue. I think older kubectl versions don't support getting the details of the clusterversion resource. We might want to update the kubectl version and see if that fixes it. On the other hand, I think we can switch to oc version, since this operation is specific to the openshift distribution.

ratsuf commented 3 years ago

Hi @chaitanyaenr! Thank you for the reply. Switching to oc version makes more sense in the context of openshift distribution from my point of view.

chaitanyaenr commented 3 years ago

@ratsuf Do you want to submit a patch for it?

seanogor commented 3 years ago

As above, I tested the commands on a Mac: https://github.com/cloud-bulldozer/cerberus/pull/135

ratsuf commented 3 years ago

Dear @chaitanyaenr and @seanogor, I'm afraid 'clusterversion' is not present in OKD 3.11 and lower. Please consider #137 as an extended version of #135 to fix the issue.
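One way to tolerate clusters without the clusterversion resource is to try it first and fall back to `oc version`. The sketch below is not the content of #135 or #137, which are not shown in this thread; `fetch_cluster_version` and the `run` callable are hypothetical names, and `run` is assumed to return the command output or `None` on failure.

```python
def fetch_cluster_version(run, distribution):
    """Return version text for an openshift cluster, or None if unavailable.

    `run` is a hypothetical callable: it takes a command string and returns
    the command's output, or None when the command failed.
    """
    if distribution != "openshift":
        return None  # the lookup is specific to the openshift distribution
    # Per the thread, clusterversion is missing on OKD 3.11 and lower,
    # so `oc version` is tried as a fallback.
    for command in ("kubectl get clusterversion", "oc version"):
        output = run(command)
        if output:
            return output
    return None
```

A caller can then treat `None` as "version unknown" and continue monitoring rather than aborting at startup.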