kubewharf / kelemetry

Global control plane tracing for Kubernetes
Apache License 2.0

kelemetry-consumer and kelemetry-kelemetry-frontend CrashLoopBackOff #126

Closed calvinxu closed 1 year ago

calvinxu commented 1 year ago

Steps to reproduce

  1. Check out the 0.2.1 tag from the kelemetry repo
  2. helm install kelemetry oci://ghcr.io/kubewharf/kelemetry-chart --values values.yaml

Expected behavior

All Kelemetry components come up and run as expected.

Actual behavior

kelemetry-collector-5ccb7986df-58zdz   1/1   Running            2 (17m ago)     17m
kelemetry-collector-5ccb7986df-5cj8k   1/1   Running            2 (17m ago)     17m
kelemetry-collector-5ccb7986df-vxchw   1/1   Running            2 (17m ago)     17m
kelemetry-consumer-bdb696b46-dpp6c     0/1   CrashLoopBackOff   7 (2m29s ago)   17m
kelemetry-consumer-bdb696b46-f6bht     0/1   CrashLoopBackOff   7 (2m19s ago)   17m
kelemetry-consumer-bdb696b46-xz9xm     0/1   CrashLoopBackOff   7 (2m54s ago)   17m
kelemetry-etcd-0                       1/1   Running            0               17m
kelemetry-etcd-1                       1/1   Running            0               17m
kelemetry-etcd-2                       1/1   Running            0               17m
kelemetry-frontend-5cb6746769-4tc4v    0/2   CrashLoopBackOff   16 (47s ago)    17m
kelemetry-frontend-5cb6746769-kl6xx    0/2   CrashLoopBackOff   17 (38s ago)    17m
kelemetry-frontend-5cb6746769-qhjwd    0/2   CrashLoopBackOff   16 (50s ago)    17m
kelemetry-informers-866df8f86d-2nh7x   1/1   Running            0               17m
kelemetry-informers-866df8f86d-4vlvk   1/1   Running            0               17m
kelemetry-informers-866df8f86d-gwml2   1/1   Running            0               17m
kelemetry-storage-0                    1/1   Running            0               17m

  1. kubectl logs kelemetry-frontend-5cb6746769-4tc4v -c storage-plugin

    ...

time="2023-07-19T14:47:48Z" level=info msg=Initializing mod=jaeger-backend
time="2023-07-19T14:47:48Z" level=info msg=Initializing mod=tf-step/replace-name-visitor
time="2023-07-19T14:47:48Z" level=info msg=Initializing mod=tf-step/prune-tags-visitor
time="2023-07-19T14:47:48Z" level=info msg=Initializing mod=tf-step/compact-duration-visitor
time="2023-07-19T14:47:48Z" level=info msg=Initializing mod=tf-step/extract-nesting-visitor
time="2023-07-19T14:47:48Z" level=info msg=Initializing mod=tf-step/service-operation-replace-visitor
time="2023-07-19T14:47:48Z" level=info msg=Initializing mod=tf-step/cluster-name-visitor
time="2023-07-19T14:47:48Z" level=info msg=Initializing mod=tf-step/group-by-trace-source-visitor
time="2023-07-19T14:47:48Z" level=info msg=Initializing mod=tf-step/object-tags-visitor
time="2023-07-19T14:47:48Z" level=info msg=Initializing mod=tfconfig.RegisteredStep-list
time="2023-07-19T14:47:48Z" level=info msg=Initializing mod=jaeger-transform-config/file
time="2023-07-19T14:47:53Z" level=error msg="error initializing \"jaeger-transform-config/file\": parse tfconfig modifier error: invalid modifier args: parse extension provider config error: cannot initialize extension storage: grpc-plugin builder failed to create a store: error connecting to remote storage: context deadline exceeded"

  2. kubectl logs kelemetry-consumer-bdb696b46-f6bht

time="2023-07-19T14:46:18Z" level=info msg=Starting mod=pprof
time="2023-07-19T14:46:18Z" level=info msg="Startup complete"
panic: metric "diff_decorator_retry_count" was not initialized
panic: metric "diff_decorator" was not initialized

goroutine 84 [running]:
github.com/kubewharf/kelemetry/pkg/metrics.(Metric[...]).With(0x4069d9?, 0x4ff15a?)
    /src/pkg/metrics/interface.go:158 +0xf4
github.com/kubewharf/kelemetry/pkg/metrics.(Metric[...]).DeferCount(0xc00071c2f8?, {0x51e68e?, 0x2bde0d0?, 0x3fd9a20?}, 0x2bc0320?)
    /src/pkg/metrics/interface.go:169 +0x49
panic({0x226f460, 0xc005d20970})
    /usr/local/go/src/runtime/panic.go:884 +0x213
github.com/kubewharf/kelemetry/pkg/metrics.(Metric[...]).With(0xc005d28390?, 0x3fd9a20?)
    /src/pkg/metrics/interface.go:158 +0xf4
github.com/kubewharf/kelemetry/pkg/diff/decorator.(decorator).tryDecorate(0xc0002e20e0, {0x2bde0d0, 0xc000338730}, {0x2bfccf0, 0xc005d193b0}, 0xc005d10b00, 0xc005cba840)
    /src/pkg/diff/decorator/decorator.go:282 +0x1325
github.com/kubewharf/kelemetry/pkg/diff/decorator.(decorator).Decorate(0xc0002e20e0, {0x2bde0d0, 0xc000338730}, 0xc005d10b00, 0xc005cba840)
    /src/pkg/diff/decorator/decorator.go:144 +0x465
github.com/kubewharf/kelemetry/pkg/audit/consumer.(receiver).handleItem(0xc0001ac0e0, {0x2bde0d0, 0xc000338730}, {0x2bfccf0?, 0xc005d18fc0?}, 0xc005d10b00, 0xc005c71da0)
    /src/pkg/audit/consumer/consumer.go:278 +0x12d3
github.com/kubewharf/kelemetry/pkg/audit/consumer.(receiver).handleMessage(0xc0001ac0e0, {0x2bde0d0, 0xc000338730}, {0x2bfccf0?, 0xc005d18ee0?}, {0xc005d9fbc0, 0x33, 0x0?}, {0xc005dbac80, 0xc58, ...}, ...)
    /src/pkg/audit/consumer/consumer.go:175 +0x4de
github.com/kubewharf/kelemetry/pkg/audit/consumer.(receiver).Init.func1({0x2bde0d0?, 0xc000338730?}, {0x2bfccf0?, 0xc005d18ee0?}, {0xc005d9fbc0?, 0xd40010?, 0xc0005c2f00?}, {0xc005dbac80, 0xc58, 0xc80})
    /src/pkg/audit/consumer/consumer.go:130 +0x9d
github.com/kubewharf/kelemetry/pkg/audit/mq/local.(localConsumer).start.func1()
    /src/pkg/audit/mq/local/local.go:203 +0x14c
created by github.com/kubewharf/kelemetry/pkg/audit/mq/local.(localConsumer).start
    /src/pkg/audit/mq/local/local.go:193 +0x95

Kelemetry version

0.2.1

Environment

k8s: 1.23.17
Jaeger: 1.4.2

SOF3 commented 1 year ago

The frontend crash is caused by the apiserver extension trace configured in the new tfconfig.toml. I removed it in v0.2.2 (015198e), since most users might not find it useful anyway.

SOF3 commented 1 year ago

Following up on the consumer panic in #127.