VictoriaMetrics / operator

Kubernetes operator for Victoria Metrics
Apache License 2.0

Logging spam after upgrading to helm 0.29.0 and operator version 0.42.0 #892

Closed bh-tt closed 4 months ago

bh-tt commented 4 months ago

Last night we upgraded the operator from 0.41.2 to 0.42.0, and today we woke up to approximately 70 million log messages from the VM operator. Something appears to be going wrong with a watch on Prometheus custom resources, as the log is filled with the following two messages:

{"level":"error","ts":"2024-03-05T08:26:50Z","msg":"k8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:231: expected type *v1.ServiceMonitor, but watch event object had type <nil>","stacktrace":"k8s.io/klog/v2.(*loggingT).output\n\tk8s.io/klog/v2@v2.110.1/klog.go:895\nk8s.io/klog/v2.(*loggingT).printWithInfos\n\tk8s.io/klog/v2@v2.110.1/klog.go:723\nk8s.io/klog/v2.(*loggingT).printDepth\n\tk8s.io/klog/v2@v2.110.1/klog.go:705\nk8s.io/klog/v2.ErrorDepth\n\tk8s.io/klog/v2@v2.110.1/klog.go:1574\nk8s.io/apimachinery/pkg/util/runtime.logError\n\tk8s.io/apimachinery@v0.29.0/pkg/util/runtime/runtime.go:115\nk8s.io/apimachinery/pkg/util/runtime.HandleError\n\tk8s.io/apimachinery@v0.29.0/pkg/util/runtime/runtime.go:109\nk8s.io/client-go/tools/cache.watchHandler\n\tk8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:726\nk8s.io/client-go/tools/cache.(*Reflector).watch\n\tk8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:431\nk8s.io/client-go/tools/cache.(*Reflector).ListAndWatch\n\tk8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:356\nk8s.io/client-go/tools/cache.(*Reflector).Run.func1\n\tk8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:289\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\tk8s.io/apimachinery@v0.29.0/pkg/util/wait/backoff.go:226\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\tk8s.io/apimachinery@v0.29.0/pkg/util/wait/backoff.go:227\nk8s.io/client-go/tools/cache.(*Reflector).Run\n\tk8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:288\nk8s.io/client-go/tools/cache.(*controller).Run.(*Group).StartWithChannel.func2\n\tk8s.io/apimachinery@v0.29.0/pkg/util/wait/wait.go:55\nk8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1\n\tk8s.io/apimachinery@v0.29.0/pkg/util/wait/wait.go:72"}

and occasionally we also have:

{"level":"error","ts":"2024-03-05T08:26:50Z","msg":"k8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:231: expected type *v1.Probe, but watch event object had type <nil>","stacktrace":"k8s.io/klog/v2.(*loggingT).output\n\tk8s.io/klog/v2@v2.110.1/klog.go:895\nk8s.io/klog/v2.(*loggingT).printWithInfos\n\tk8s.io/klog/v2@v2.110.1/klog.go:723\nk8s.io/klog/v2.(*loggingT).printDepth\n\tk8s.io/klog/v2@v2.110.1/klog.go:705\nk8s.io/klog/v2.ErrorDepth\n\tk8s.io/klog/v2@v2.110.1/klog.go:1574\nk8s.io/apimachinery/pkg/util/runtime.logError\n\tk8s.io/apimachinery@v0.29.0/pkg/util/runtime/runtime.go:115\nk8s.io/apimachinery/pkg/util/runtime.HandleError\n\tk8s.io/apimachinery@v0.29.0/pkg/util/runtime/runtime.go:109\nk8s.io/client-go/tools/cache.watchHandler\n\tk8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:726\nk8s.io/client-go/tools/cache.(*Reflector).watch\n\tk8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:431\nk8s.io/client-go/tools/cache.(*Reflector).ListAndWatch\n\tk8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:356\nk8s.io/client-go/tools/cache.(*Reflector).Run.func1\n\tk8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:289\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\tk8s.io/apimachinery@v0.29.0/pkg/util/wait/backoff.go:226\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\tk8s.io/apimachinery@v0.29.0/pkg/util/wait/backoff.go:227\nk8s.io/client-go/tools/cache.(*Reflector).Run\n\tk8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:288\nk8s.io/client-go/tools/cache.(*controller).Run.(*Group).StartWithChannel.func2\n\tk8s.io/apimachinery@v0.29.0/pkg/util/wait/wait.go:55\nk8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1\n\tk8s.io/apimachinery@v0.29.0/pkg/util/wait/wait.go:72"}

For now we have rolled back to 0.41.2.
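
To see which watched resource types are affected and how often, the log can be tallied by the type named in each message. This is only a rough sketch: it assumes the operator log is available as one JSON object per line with a `msg` field like the entries quoted above. The function name `count_nil_watch_errors` and the regular expression are illustrative and not part of the operator.

```python
import json
import re
from collections import Counter

# Matches the reflector message quoted in this report and captures the
# expected type (for example "*v1.ServiceMonitor").
NIL_WATCH_RE = re.compile(
    r"expected type ([^,\s]+), but watch event object had type <nil>"
)


def count_nil_watch_errors(lines):
    """Count nil-object watch errors per expected type in JSON log lines.

    Hypothetical helper: `lines` is any iterable of strings, such as an
    open log file. Lines that are not valid JSON objects are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # not JSON, or a line truncated by the log shipper
        if not isinstance(entry, dict):
            continue
        match = NIL_WATCH_RE.search(str(entry.get("msg", "")))
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file to `count_nil_watch_errors` returns a `Counter` keyed by type, which makes it easy to check whether types other than `*v1.ServiceMonitor` and `*v1.Probe` show up.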

f41gh7 commented 4 months ago

Thanks for reporting, we're going to fix it soon.

f41gh7 commented 4 months ago

This should be fixed in the v0.42.2 release.

bh-tt commented 4 months ago

It is, tested it this morning.