runwhen-contrib / rw-public-codecollection

RunWhen Public Codecollection Repository - Open Source troubleshooting runbook library for Kubernetes and cloud infrastructure components.
https://registry.runwhen.com
Apache License 2.0

namespace healthcheck via kubectl #66

Closed: stewartshea closed this 1 year ago

stewartshea commented 1 year ago

As part of https://github.com/runwhen-contrib/rw-public-codecollection/issues/53, this update does two things. It renames the codebundle path to make explicit that it uses kubectl, and it adds an entirely new SLI with configurable thresholds and time windows and support for multiple namespaces, all rolled up into a single aggregate health score.

What's not done?

I fully expect some requested changes or questions :)

stewartshea commented 1 year ago

Ahh, thanks for all the feedback. The point about ordering is very helpful ... I'll address these tomorrow.


On Tue, Feb 14, 2023, 12:58 p.m. jon-funk @.***> wrote:

@.**** commented on this pull request.

In codebundles/k8s-kubectl-namespace-healthcheck/sli.robot https://github.com/runwhen-contrib/rw-public-codecollection/pull/66#discussion_r1106169067 :

+    Log    ${container_restart_count} total container restarts found in the last ${CONTAINER_RESTART_AGE}
+    ${container_restart_score}=    Evaluate    1 if ${container_restart_count} <= ${CONTAINER_RESTART_THRESHOLD} else 0
+    Set Global Variable    ${container_restart_score}
+
+Get NotReady Pods
+    ${pods_notready_count}=    RW.K8s.Count Notready Pods
+    ...    namespace=${NAMESPACE}
+    ...    context=${CONTEXT}
+    ...    kubeconfig=${kubeconfig}
+    ...    target_service=${kubectl}
+    ...    binary_name=${binary_name}
+    Log    ${pods_notready_count} total unready pods
+    ${pods_notready_score}=    Evaluate    1 if ${pods_notready_count} == 0 else 0
+    Set Global Variable    ${pods_notready_score}
+
+Generate Namspace Score

To clarify: you probably need to combine all of these into a single task.
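For illustration, here is a rough Python sketch of what computing the sub-scores together in one step could look like. The two binary checks mirror the `Evaluate` expressions in the diff above. Everything else is an assumption made for the example and is not taken from the PR: the `NamespaceCounts` type, the function names, and the choice to average the sub-scores (the thread only says they feed an aggregate health score).

```python
from dataclasses import dataclass


@dataclass
class NamespaceCounts:
    """Raw counts for one namespace (hypothetical container type)."""
    container_restarts: int
    pods_notready: int


def namespace_score(counts: NamespaceCounts, container_restart_threshold: int) -> float:
    """Compute every sub-score in a single step, then average them.

    Averaging is an assumption; the PR does not state how sub-scores combine.
    """
    # Mirrors: 1 if ${container_restart_count} <= ${CONTAINER_RESTART_THRESHOLD} else 0
    container_restart_score = 1 if counts.container_restarts <= container_restart_threshold else 0
    # Mirrors: 1 if ${pods_notready_count} == 0 else 0
    pods_notready_score = 1 if counts.pods_notready == 0 else 0
    scores = [container_restart_score, pods_notready_score]
    return sum(scores) / len(scores)


def aggregate_score(per_namespace: dict, container_restart_threshold: int) -> float:
    """Average the per-namespace scores into one value between 0 and 1."""
    if not per_namespace:
        raise ValueError("at least one namespace is required")
    values = [
        namespace_score(counts, container_restart_threshold)
        for counts in per_namespace.values()
    ]
    return sum(values) / len(values)
```

Keeping the counts and the final score in one function means the score never depends on the order in which separate steps ran or on state shared between them, which seems to be the concern behind the suggestion.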
