supereagle / experiences

Summary of practical experience in work.

Monitoring Kubernetes components health #53

Closed supereagle closed 5 years ago

supereagle commented 6 years ago

Kubernetes provides health check APIs to monitor its components' status.

Health check API list:

        "/healthz",
        "/healthz/autoregister-completion",
        "/healthz/etcd",
        "/healthz/ping",
        "/healthz/poststarthook/apiservice-openapi-controller",
        "/healthz/poststarthook/apiservice-registration-controller",
        "/healthz/poststarthook/apiservice-status-available-controller",
        "/healthz/poststarthook/bootstrap-controller",
        "/healthz/poststarthook/ca-registration",
        "/healthz/poststarthook/generic-apiserver-start-informers",
        "/healthz/poststarthook/kube-apiserver-autoregistration",
        "/healthz/poststarthook/rbac/bootstrap-roles",
        "/healthz/poststarthook/start-apiextensions-controllers",
        "/healthz/poststarthook/start-apiextensions-informers",
        "/healthz/poststarthook/start-kube-aggregator-informers",
        "/healthz/poststarthook/start-kube-apiserver-informers",
        "/api/v1/componentstatuses"
GET /healthz

ok
GET /healthz/etcd

ok
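A minimal sketch of polling a few of the paths listed above and treating a `200` response with body `ok` as healthy. The base URL `http://127.0.0.1:8001` is an assumption (it presumes `kubectl proxy` is running locally on its default port), and the helper names `http_fetch` and `check_health` are hypothetical, not part of any Kubernetes client.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Tuple

# A subset of the health check paths listed above.
HEALTH_PATHS = ["/healthz", "/healthz/etcd", "/healthz/ping"]


def http_fetch(url: str, timeout: float = 5.0) -> Tuple[int, str]:
    """Plain GET returning (status_code, body); HTTP error responses are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace").strip()
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace").strip()


def check_health(
    base_url: str,
    paths: Iterable[str] = HEALTH_PATHS,
    fetch: Callable[[str], Tuple[int, str]] = http_fetch,
) -> Dict[str, bool]:
    """Map each path to True only if it answers 200 with body 'ok'."""
    results = {}
    for path in paths:
        try:
            status, body = fetch(base_url.rstrip("/") + path)
            results[path] = status == 200 and body == "ok"
        except OSError:
            # Connection refused, timeout, DNS failure, etc. count as unhealthy.
            results[path] = False
    return results


if __name__ == "__main__":
    # Assumption: `kubectl proxy` is exposing the apiserver on 127.0.0.1:8001.
    for path, healthy in check_health("http://127.0.0.1:8001").items():
        print(f"{path}: {'ok' if healthy else 'FAILED'}")
```

Passing `fetch` as a parameter keeps the check logic separate from the transport, so authentication (for example a bearer token and the cluster CA) can be added in one place if the apiserver is queried directly rather than through a proxy.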
GET /api/v1/componentstatuses

{
    "kind": "ComponentStatusList",
    "apiVersion": "v1",
    "metadata": {
        "selfLink": "/api/v1/componentstatuses"
    },
    "items": [
        {
            "metadata": {
                "name": "controller-manager",
                "selfLink": "/api/v1/componentstatuses/controller-manager",
                "creationTimestamp": null
            },
            "conditions": [
                {
                    "type": "Healthy",
                    "status": "True",
                    "message": "ok"
                }
            ]
        },
        {
            "metadata": {
                "name": "scheduler",
                "selfLink": "/api/v1/componentstatuses/scheduler",
                "creationTimestamp": null
            },
            "conditions": [
                {
                    "type": "Healthy",
                    "status": "True",
                    "message": "ok"
                }
            ]
        },
        {
            "metadata": {
                "name": "etcd-2",
                "selfLink": "/api/v1/componentstatuses/etcd-2",
                "creationTimestamp": null
            },
            "conditions": [
                {
                    "type": "Healthy",
                    "status": "True",
                    "message": "{\"health\": \"true\"}"
                }
            ]
        },
        {
            "metadata": {
                "name": "etcd-0",
                "selfLink": "/api/v1/componentstatuses/etcd-0",
                "creationTimestamp": null
            },
            "conditions": [
                {
                    "type": "Healthy",
                    "status": "True",
                    "message": "{\"health\": \"true\"}"
                }
            ]
        },
        {
            "metadata": {
                "name": "etcd-1",
                "selfLink": "/api/v1/componentstatuses/etcd-1",
                "creationTimestamp": null
            },
            "conditions": [
                {
                    "type": "Healthy",
                    "status": "True",
                    "message": "{\"health\": \"true\"}"
                }
            ]
        },
        {
            "metadata": {
                "name": "etcd-3",
                "selfLink": "/api/v1/componentstatuses/etcd-3",
                "creationTimestamp": null
            },
            "conditions": [
                {
                    "type": "Healthy",
                    "status": "True",
                    "message": "{\"health\": \"true\"}"
                }
            ]
        }
    ]
}
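A response like the one above can be reduced to a list of unhealthy components by looking for a `Healthy` condition whose `status` is `"True"`. The sketch below assumes the response has already been decoded into a Python dict; the function name `unhealthy_components` and the sample data (including the failing `etcd-0` entry) are hypothetical.

```python
from typing import Any, Dict, List


def unhealthy_components(component_status_list: Dict[str, Any]) -> List[str]:
    """Return names of components lacking a Healthy condition with status "True"."""
    unhealthy = []
    for item in component_status_list.get("items", []):
        name = item.get("metadata", {}).get("name", "<unknown>")
        conditions = item.get("conditions", [])
        healthy = any(
            c.get("type") == "Healthy" and c.get("status") == "True"
            for c in conditions
        )
        if not healthy:
            unhealthy.append(name)
    return unhealthy


# Hypothetical sample in the same shape as the ComponentStatusList above.
sample = {
    "kind": "ComponentStatusList",
    "apiVersion": "v1",
    "items": [
        {
            "metadata": {"name": "scheduler"},
            "conditions": [{"type": "Healthy", "status": "True", "message": "ok"}],
        },
        {
            "metadata": {"name": "etcd-0"},
            "conditions": [{"type": "Healthy", "status": "False", "message": "timeout"}],
        },
    ],
}

print(unhealthy_components(sample))  # prints ['etcd-0']
```

A component with no conditions at all is also reported as unhealthy, which is a deliberate choice here: a missing status is treated as a failure rather than silently ignored.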
supereagle commented 6 years ago

cs (ComponentStatus) has one limitation: all components need to run on the same host as the apiserver. Refer to "componentstatus fails when components are not running on the same host as apiserver".

It is expected to be deprecated in the future, but that is still at the proposal stage: "Deprecate ComponentStatus".

supereagle commented 5 years ago

There is a bug (kubernetes/kubernetes#72682) affecting the kube-proxy health check, so please check which K8s version you are running.

supereagle commented 5 years ago

There are two open source projects with different solutions for K8s cluster health monitoring:

We use both of these methods in practice. I have written a blog post, "Kubernetes Cluster Health Monitoring", to summarize and compare the two solutions.