Closed pacoxu closed 3 months ago
Failed to validate custom plugin config
{
Plugin:custom
PluginGlobalConfig:{
InvokeIntervalString:0xc0004711c0
TimeoutString:0xc0004711d0
InvokeInterval:5m0s Timeout:1m0s MaxOutputLength:0xc00056f4c0 Concurrency:0xc00056f4d0 EnableMessageChangeBasedConditionUpdate:0x2d0a80e SkipInitialStatus:0x2d0a80f
}
Source:kernel-monitor
DefaultConditions:[
{
Type:FrequentUnregisterNetDevice Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:NoFrequentUnregisterNetDevice Message:node is functioning properly
}
]
Rules:[0xc0005ece00]
EnableMetricsReporting:0xc00056f4d8
}:
rule path "/home/kubernetes/bin/log-counter" does not exist.
Rule: &{
Type:permanent
Condition:FrequentUnregisterNetDevice
Reason:UnregisterNetDevice
Path:/home/kubernetes/bin/log-counter
Args:[
--journald-source=kernel
--log-path=/var/log/journal
--lookback=20m
--count=3
--pattern=unregister_netdevice: waiting for \w+ to become free. Usage count = \d+
]
TimeoutString:0xc0004711f0 Timeout:1m0s}
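The validation failure above comes down to the rule's Path not existing inside the image. A minimal pre-flight sketch of that kind of check is below. The function name check_rule_path is hypothetical and this is not node-problem-detector's own code; it only mirrors the error wording from the log.

```shell
# Hypothetical pre-flight check; not part of node-problem-detector.
# Prints a message and returns 1 when the rule path is missing or not executable.
check_rule_path() {
  rule_path="$1"
  if [ ! -e "$rule_path" ]; then
    echo "rule path \"$rule_path\" does not exist"
    return 1
  fi
  if [ ! -x "$rule_path" ]; then
    echo "rule path \"$rule_path\" is not executable"
    return 1
  fi
  return 0
}

# Usage (inside the container or on the node):
#   check_rule_path /home/kubernetes/bin/log-counter
```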
v0.8.15 lost the log-counter binary after https://github.com/kubernetes/node-problem-detector/pull/801. @hakman @vteratipally
➜ ~ docker run -it --rm --entrypoint=ls registry.k8s.io/node-problem-detector/node-problem-detector:v0.8.15 /home/kubernetes/bin/
health-checker
➜ ~ docker run -it --rm --entrypoint=ls registry.k8s.io/node-problem-detector/node-problem-detector:v0.8.13 /home/kubernetes/bin/
health-checker log-counter
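The comparison between the two tags can be scripted. The sketch below assumes the two listings have already been captured as whitespace-separated names; missing_binaries is a hypothetical helper, not an existing tool.

```shell
# Hypothetical helper: print every name in the first listing
# that is absent from the second listing.
missing_binaries() {
  old_list="$1"
  new_list="$2"
  for name in $old_list; do
    found=0
    for candidate in $new_list; do
      if [ "$name" = "$candidate" ]; then
        found=1
        break
      fi
    done
    if [ "$found" -eq 0 ]; then
      echo "$name"
    fi
  done
}

# Usage (drop -it so the captured output has no terminal control characters):
#   old=$(docker run --rm --entrypoint=ls registry.k8s.io/node-problem-detector/node-problem-detector:v0.8.13 /home/kubernetes/bin/)
#   new=$(docker run --rm --entrypoint=ls registry.k8s.io/node-problem-detector/node-problem-detector:v0.8.15 /home/kubernetes/bin/)
#   missing_binaries "$old" "$new"
```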
A local build shows that log-counter is not built because journald support is not enabled:
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build \
-o bin/node-problem-detector \
-ldflags '-X k8s.io/node-problem-detector/pkg/version.version=v0.8.15-20-gd1166d34' \
-tags "" \
./cmd/nodeproblemdetector
echo "Warning: log-counter requires journald, skipping."
Warning: log-counter requires journald, skipping.
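The output above suggests the log-counter build step is gated on journald support. A dry-run sketch of that gating follows. The variable name ENABLE_JOURNALD, the journald build tag, and the package path are assumptions for illustration and may not match the real Makefile; the function only prints what it would do.

```shell
# Dry-run sketch of a journald-gated build step.
# ENABLE_JOURNALD, the "journald" tag, and ./cmd/logcounter are assumptions.
build_log_counter() {
  if [ "${ENABLE_JOURNALD:-0}" = "1" ]; then
    echo "go build -o bin/log-counter -tags journald ./cmd/logcounter"
  else
    echo "Warning: log-counter requires journald, skipping."
  fi
}

# Usage:
#   ENABLE_JOURNALD=1 build_log_counter
```

If the release pipeline runs with journald disabled, a step like this would silently leave the binary out of the image, which matches the v0.8.15 listing.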
My guess is that it happens because of the way CloudBuild runs: https://github.com/kubernetes/node-problem-detector/blob/d1166d3495cb5bf8cc340dc7ee6a3aff3f1452c1/cloudbuild.yaml#L21
This seems to be intended for https://github.com/kubernetes/test-infra/issues/23202#issuecomment-1060219883?
/cc @SergeyKanzhelev
@hakman do you have some proposals to fix this?
@pacoxu @SergeyKanzhelev Let's give https://github.com/kubernetes/node-problem-detector/pull/867 a try.
@pacoxu could you give gcr.io/k8s-staging-npd/node-problem-detector:master a try? If all ok, we could do a release.
It looks like the issue was resolved in https://github.com/kubernetes/kubernetes/pull/123114.
/close
@wangzhen127: Closing this issue.
with v0.8.15
https://github.com/kubernetes/kubernetes/pull/123114#issuecomment-1963319709