node-problem-detector

This is a place for various problem detectors running on the Kubernetes nodes. Licensed under Apache License 2.0.

node-problem-detector aims to make various node problems visible to the upstream layers in the cluster management stack. It is a daemon that runs on each node, detects node problems and reports them to the apiserver. node-problem-detector can either run as a DaemonSet or run standalone. It currently runs as a Kubernetes Addon enabled by default in GKE clusters, and is also enabled by default in AKS as part of the AKS Linux Extension.

Background

There are tons of node problems that could possibly affect the pods running on the node, such as infrastructure daemon issues (e.g. an ntp service down), hardware issues (bad CPU, memory or disk), kernel issues (kernel deadlock, corrupted file system), and container runtime issues (an unresponsive runtime daemon).

Currently, these problems are invisible to the upstream layers in the cluster management stack, so Kubernetes will continue scheduling pods to the bad nodes.

To solve this problem, we introduced this new daemon node-problem-detector to collect node problems from various daemons and make them visible to the upstream layers. Once upstream layers have visibility to those problems, we can discuss the remedy system.

Problem API

node-problem-detector uses Event and NodeCondition to report problems to apiserver.
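As a quick way to see both channels from a workstation (the node name below is a placeholder):

```shell
# Permanent problems surface as NodeConditions; temporary problems
# surface as Events. "my-node" is a placeholder node name.
kubectl describe node my-node          # shows Conditions and recent Events
kubectl get events --all-namespaces    # lists Events cluster-wide
```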

Problem Daemon

A problem daemon is a sub-daemon of node-problem-detector. It monitors specific kinds of node problems and reports them to node-problem-detector.

A problem daemon could be a tiny daemon dedicated to monitoring one specific kind of node problem, or an integration with an existing node health monitoring tool.

Currently, a problem daemon is running as a goroutine in the node-problem-detector binary. In the future, we'll separate node-problem-detector and problem daemons into different containers, and compose them with pod specification.

Each category of problem daemon can be disabled at compilation time by setting corresponding build tags. If they are disabled at compilation time, then all their build dependencies, global variables and background goroutines will be trimmed out of the compiled executable.

List of supported problem daemons types:

| Problem Daemon Types | NodeCondition | Description | Configs | Disabling Build Tag |
|---|---|---|---|---|
| SystemLogMonitor | KernelDeadlock, ReadonlyFilesystem, FrequentKubeletRestart, FrequentDockerRestart, FrequentContainerdRestart | A system log monitor monitors system log and reports problems and metrics according to predefined rules. | filelog, kmsg, kernel, abrt, systemd | disable_system_log_monitor |
| SystemStatsMonitor | None (could be added in the future) | A system stats monitor for node-problem-detector to collect various health-related system stats as metrics. See the proposal here. | system-stats-monitor | disable_system_stats_monitor |
| CustomPluginMonitor | On-demand (according to user configuration); existing example: NTPProblem | A custom plugin monitor for node-problem-detector to invoke and check various node problems with user-defined check scripts. See the proposal here. | example | disable_custom_plugin_monitor |
| HealthChecker | KubeletUnhealthy, ContainerRuntimeUnhealthy | A health checker for node-problem-detector to check kubelet and container runtime health. | kubelet, docker, containerd | (none) |
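For illustration, a CustomPluginMonitor config follows roughly this shape (a sketch modeled on the upstream NTP example; field values and the script path are illustrative):

```json
{
  "plugin": "custom",
  "pluginConfig": {
    "invoke_interval": "30s",
    "timeout": "5s",
    "max_output_length": 80,
    "concurrency": 3
  },
  "source": "ntp-custom-plugin-monitor",
  "conditions": [
    {
      "type": "NTPProblem",
      "reason": "NTPIsUp",
      "message": "ntp service is up"
    }
  ],
  "rules": [
    {
      "type": "permanent",
      "condition": "NTPProblem",
      "reason": "NTPIsDown",
      "path": "./config/plugin/check_ntp.sh",
      "timeout": "3s"
    }
  ]
}
```

Each rule invokes the check script; the script's exit code and output decide whether the corresponding Event or NodeCondition is reported.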

Exporter

An exporter is a component of node-problem-detector. It reports node problems and/or metrics to certain backends. Some of them can be disabled at compile-time using a build tag. List of supported exporters:

| Exporter | Description | Disabling Build Tag |
|---|---|---|
| Kubernetes exporter | Reports node problems to the Kubernetes API server: temporary problems get reported as Events, and permanent problems get reported as Node Conditions. | (none) |
| Prometheus exporter | Reports node problems and metrics locally as Prometheus metrics. | (none) |
| Stackdriver exporter | Reports node problems and metrics to the Stackdriver Monitoring API. | disable_stackdriver_exporter |
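For example, with the Prometheus exporter enabled on its default port (20257, the port used elsewhere in this README), the metrics can be inspected locally:

```shell
# Scrape the local Prometheus endpoint exposed by node-problem-detector.
curl -s http://127.0.0.1:20257/metrics | head -n 20
```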

Usage

Flags

For System Log Monitor

For System Stats Monitor

For Custom Plugin Monitor

For Health Checkers

Health checkers are configured as custom plugins, using the config/health-checker-*.json config files.
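For example, to enable the kubelet health checker, point the custom plugin monitor flag at the corresponding config file (a sketch; verify the exact file name in your checkout):

```shell
# Run node-problem-detector with the kubelet health checker,
# which is wired in as a custom plugin config.
./bin/node-problem-detector --logtostderr \
  --config.custom-plugin-monitor=config/health-checker-kubelet.json
```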

For Kubernetes exporter

For Prometheus exporter

For Stackdriver exporter

Deprecated Flags

Build Image

If you do not need certain categories of problem daemons, you can disable them at compilation time. This is the best way to keep the node-problem-detector binary compact, without unnecessary code (e.g. global variables, goroutines, etc.). You can do so by setting the BUILD_TAGS environment variable before running make. For example:

BUILD_TAGS="disable_custom_plugin_monitor disable_system_stats_monitor" make

The above command will compile the node-problem-detector without Custom Plugin Monitor and System Stats Monitor. Check out the Problem Daemon section to see how to disable each problem daemon during compilation time.

Push Image

make push uploads the Docker image to a registry. By default, the image is uploaded to staging-k8s.gcr.io. It's easy to modify the Makefile to push the image to another registry.
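For example, assuming the Makefile exposes the target registry as a REGISTRY variable (the variable name is an assumption; check your Makefile), the destination can be overridden on the command line:

```shell
# Build and push the image to a custom registry.
# REGISTRY is assumed to be the Makefile variable that controls
# the target registry; adjust if your Makefile differs.
REGISTRY=gcr.io/my-project make push
```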

Installation

The easiest way to install node-problem-detector into your cluster is to use the Helm chart:

helm repo add deliveryhero https://charts.deliveryhero.io/
helm install --generate-name deliveryhero/node-problem-detector

Alternatively, to install node-problem-detector manually:

  1. Edit node-problem-detector.yaml to fit your environment. Set log volume to your system log directory (used by SystemLogMonitor). You can use a ConfigMap to overwrite the config directory inside the pod.

  2. Edit node-problem-detector-config.yaml to configure node-problem-detector.

  3. Edit rbac.yaml to fit your environment.

  4. Create the ServiceAccount and ClusterRoleBinding with kubectl create -f rbac.yaml.

  5. Create the ConfigMap with kubectl create -f node-problem-detector-config.yaml.

  6. Create the DaemonSet with kubectl create -f node-problem-detector.yaml.
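Steps 4-6 above, as a single sketch with a quick verification (the namespace and label are assumptions; kube-system and app=node-problem-detector are common choices, but they depend on your manifests):

```shell
kubectl create -f rbac.yaml
kubectl create -f node-problem-detector-config.yaml
kubectl create -f node-problem-detector.yaml

# Verify the DaemonSet has a pod on each node.
kubectl get ds,pods -n kube-system -l app=node-problem-detector
```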

Start Standalone

To run node-problem-detector standalone, you should set inClusterConfig to false and tell node-problem-detector how to reach the apiserver with the apiserver-override flag.

To run node-problem-detector standalone with an insecure apiserver connection:

node-problem-detector --apiserver-override=http://APISERVER_IP:APISERVER_INSECURE_PORT?inClusterConfig=false

For more scenarios, see here

Windows

Node Problem Detector has preliminary support for Windows. Most of the functionality has not been tested, but the filelog plugin works.

Follow Issue #461 for development status of Windows support.

Development

To develop NPD on Windows you'll need to set up your Windows machine for Go development and install the necessary build tools.

# Run these commands in the node-problem-detector directory.

# Build in MINGW64 Window
make clean ENABLE_JOURNALD=0 build-binaries

# Test in MINGW64 Window
make test

# Run with containerd log monitoring enabled in Command Prompt. (Assumes containerd is installed.)
%CD%\output\windows_amd64\bin\node-problem-detector.exe --logtostderr --enable-k8s-exporter=false --config.system-log-monitor=%CD%\config\windows-containerd-monitor-filelog.json --config.system-stats-monitor=config\windows-system-stats-monitor.json

# Configure NPD to run as a Windows Service
sc.exe create NodeProblemDetector binpath= "%CD%\node-problem-detector.exe [FLAGS]" start= demand
sc.exe failure NodeProblemDetector reset= 0 actions= restart/10000
sc.exe start NodeProblemDetector

Try It Out

You can try node-problem-detector in a running cluster by injecting messages into the logs that node-problem-detector is watching. For example, let's assume node-problem-detector is using KernelMonitor. On your workstation, run kubectl get events -w. On the node, run sudo sh -c "echo 'kernel: BUG: unable to handle kernel NULL pointer dereference at TESTING' >> /dev/kmsg". You should then see the KernelOops event.
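The two commands from the example above, collected for convenience:

```shell
# On your workstation: watch for events.
kubectl get events -w

# On the node: inject a fake kernel oops message into kmsg.
sudo sh -c "echo 'kernel: BUG: unable to handle kernel NULL pointer dereference at TESTING' >> /dev/kmsg"
```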

When adding new rules or developing node-problem-detector, it is probably easier to test it on your local workstation in standalone mode. For the API server, an easy way is to use kubectl proxy to make a running cluster's API server available locally. You will see some errors because your local workstation is not recognized by the API server, but you should still be able to test your new rules.

For example, to test KernelMonitor rules:

  1. make (build node-problem-detector locally)
  2. kubectl proxy --port=8080 (make a running cluster's API server available locally)
  3. Update KernelMonitor's logPath to your local kernel log directory. For example, on some Linux systems, it is /run/log/journal instead of /var/log/journal.
  4. ./bin/node-problem-detector --logtostderr --apiserver-override=http://127.0.0.1:8080?inClusterConfig=false --config.system-log-monitor=config/kernel-monitor.json --config.system-stats-monitor=config/system-stats-monitor.json --port=20256 --prometheus-port=20257 (or point to any API server address:port and Prometheus port)
  5. sudo sh -c "echo 'kernel: BUG: unable to handle kernel NULL pointer dereference at TESTING' >> /dev/kmsg"
  6. You can see KernelOops event in the node-problem-detector log.
  7. sudo sh -c "echo 'kernel: INFO: task docker:20744 blocked for more than 120 seconds.' >> /dev/kmsg"
  8. You can see DockerHung event and condition in the node-problem-detector log.
  9. You can see DockerHung condition at http://127.0.0.1:20256/conditions.
  10. You can see disk-related system metrics in Prometheus format at http://127.0.0.1:20257/metrics.

Dependency Management

node-problem-detector uses go modules to manage dependencies; building it therefore requires Go 1.11+. It still uses vendoring. See the Kubernetes go modules KEP for the design decisions. To add a new dependency, update go.mod and run go mod vendor.
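For example, adding a dependency might look like this (the module path and version are placeholders):

```shell
go get github.com/example/somelib@v1.2.3  # updates go.mod/go.sum
go mod tidy                               # prune unused requirements
go mod vendor                             # refresh the vendor/ directory
```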

Remedy Systems

A remedy system is a process or processes designed to attempt to remedy problems detected by node-problem-detector. Remedy systems observe events and/or node conditions emitted by node-problem-detector and take action to return the Kubernetes cluster to a healthy state. Several such remedy systems exist as separate open-source projects.

Testing

NPD is tested via unit tests, NPD e2e tests, Kubernetes e2e tests and Kubernetes nodes e2e tests. Prow handles the pre-submit tests and CI tests.

CI test results can be found below:

  1. Unit tests
  2. NPD e2e tests
  3. Kubernetes e2e tests
  4. Kubernetes nodes e2e tests

Running tests

Unit tests are run via make test.

See NPD e2e test documentation for how to set up and run NPD e2e tests.

Problem Maker

Problem maker is a program used in NPD e2e tests to generate/simulate node problems. It is ONLY intended to be used by NPD e2e tests. Please do NOT run it on your workstation, as it could cause real node problems.

Compatibility

Node problem detector's architecture has been fairly stable. Recent versions (v0.8.13+) should work with any supported Kubernetes version.

Docs

Links