kubernetes / node-problem-detector

This is a place for various problem detectors running on the Kubernetes nodes.
Apache License 2.0

Some included config files do not work #929

Closed fuero closed 1 month ago

fuero commented 1 month ago

Several of the included config files can't be loaded:

I0718 09:43:40.456284       1 log_monitor.go:78] Finish parsing log monitor config file /config/kernel-monitor.json: {WatcherConfig:{Plugin:kmsg PluginConfig:map[] LogPath:/dev/kmsg Lookback:5m Delay:} BufferSize:10 Source:kernel-monitor DefaultConditions:[{Type:KernelDeadlock Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:KernelHasNoDeadlock Message:kernel has no deadlock} {Type:ReadonlyFilesystem Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:FilesystemIsNotReadOnly Message:Filesystem is not read-only}] Rules:[{Type:temporary Condition: Reason:OOMKilling Pattern:Killed process \d+ (.+) total-vm:\d+kB, anon-rss:\d+kB, file-rss:\d+kB.*} {Type:temporary Condition: Reason:TaskHung Pattern:task [\S ]+:\w+ blocked for more than \w+ seconds\.} {Type:temporary Condition: Reason:UnregisterNetDevice Pattern:unregister_netdevice: waiting for \w+ to become free. Usage count = \d+} {Type:temporary Condition: Reason:KernelOops Pattern:BUG: unable to handle kernel NULL pointer dereference at .*} {Type:temporary Condition: Reason:KernelOops Pattern:divide error: 0000 \[#\d+\] SMP} {Type:temporary Condition: Reason:Ext4Error Pattern:EXT4-fs error .*} {Type:temporary Condition: Reason:Ext4Warning Pattern:EXT4-fs warning .*} {Type:temporary Condition: Reason:IOError Pattern:Buffer I/O error .*} {Type:temporary Condition: Reason:MemoryReadError Pattern:CE memory read error .*} {Type:permanent Condition:KernelDeadlock Reason:DockerHung Pattern:task docker:\w+ blocked for more than \w+ seconds\.} {Type:permanent Condition:ReadonlyFilesystem Reason:FilesystemIsReadOnly Pattern:Remounting filesystem read-only}] EnableMetricsReporting:0xc000712e1c}
I0718 09:43:40.456367       1 log_watchers.go:40] Use log watcher of plugin "kmsg"
F0718 09:43:40.456460       1 log_monitor.go:70] Failed to unmarshal configuration file "/config/health-checker-kubelet.json": json: cannot unmarshal number into Go struct field MonitorConfig.pluginConfig of type string

fuero commented 1 month ago

Never mind. I loaded them all via log_monitors in the Helm chart, which was incorrect.
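
For anyone landing here with the same fatal error: the log above prints PluginConfig as map[] and the error names a field "of type string", which suggests the log monitor decodes pluginConfig into a map of strings. A file whose pluginConfig holds a numeric value would then fail in exactly this way when it is loaded as a log monitor config. Below is a minimal Go sketch of that failure mode. The logMonitorConfig type, the parseLogMonitorConfig helper, and the JSON keys are hypothetical stand-ins for illustration, not the project's actual types or the real contents of health-checker-kubelet.json.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logMonitorConfig is a hypothetical, simplified stand-in for the log
// monitor's config type. Only the field named in the error is modeled,
// and it is assumed to be a map of string to string.
type logMonitorConfig struct {
	Plugin       string            `json:"plugin"`
	PluginConfig map[string]string `json:"pluginConfig"`
}

// parseLogMonitorConfig decodes raw JSON into the simplified config type.
func parseLogMonitorConfig(data []byte) (*logMonitorConfig, error) {
	var cfg logMonitorConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	// String-valued pluginConfig entries decode without error.
	// The key "exampleKey" is made up for this sketch.
	good := []byte(`{"plugin": "kmsg", "pluginConfig": {"exampleKey": "value"}}`)
	if _, err := parseLogMonitorConfig(good); err != nil {
		fmt.Println("unexpected error:", err)
	} else {
		fmt.Println("string-valued pluginConfig: parsed")
	}

	// A numeric value under pluginConfig cannot be stored in a
	// map[string]string, so json.Unmarshal reports a type error.
	// The key "exampleLimit" is made up for this sketch.
	bad := []byte(`{"plugin": "custom", "pluginConfig": {"exampleLimit": 3}}`)
	if _, err := parseLogMonitorConfig(bad); err != nil {
		fmt.Println("numeric pluginConfig value:", err)
	}
}
```

The second call returns an error of the same kind as the one in the log. The exact wording depends on the Go version and on the struct name, so the text will not match the log line character for character.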