knative / eventing

Event-driven application platform for Kubernetes
https://knative.dev/docs/eventing
Apache License 2.0
1.41k stars 588 forks source link

[BUG] Broker pods die if tracing configmap is malformed on start #1417

Closed grantr closed 3 years ago

grantr commented 5 years ago

Describe the bug If the tracing configmap is updated to include the key enabled: "true" but no zipkin-endpoint key, the Broker pods log an error but keep operating. They will fail to start upon the next restart.

Expected behavior The pod should use the default value, or disable tracing and continue. Tracing is an observability feature and its misconfiguration shouldn't cause the Broker to stop working. The fact that the bug is triggered by a restart means the failure will be difficult to associate with the root cause.

To Reproduce

  1. Create a config-tracing configmap in the Broker's namespace with the key enabled: "true".
  2. The Broker pods log an error but proceed.
  3. Restart one or both of the Broker pods.
  4. The Broker pods fail to start.

    Knative release version Eventing HEAD, Serving 0.6

    Additional context

akashrv commented 5 years ago

/priority important-soon

akashrv commented 5 years ago

/milestone v0.8.0

daisy-ycguo commented 5 years ago

in-memory-channel-dispatcher is using the same logic to manage tracing and it has the same problem. Without zipkin-endpoint configured, in-memory-channel-dispatcher pod will get error when starting up.

$ kubectl get pods -n knative-eventing
NAME                                            READY     STATUS    RESTARTS   AGE
eventing-controller-bd99fd4c6-h7hbd             1/1       Running   0          6h38m
eventing-webhook-589b7cbf6-m5f9q                1/1       Running   0          6h38m
in-memory-channel-controller-67fb687b6d-5q9b2   1/1       Running   0          30s
in-memory-channel-dispatcher-755fc5dd9c-h546d   0/1       Error     2          26s
sources-controller-ffc8988b-kb4wh               1/1       Running   0          6h38m

So the logic to manage tracing with Zipkin shall be reviewed and updated.

daisy-ycguo commented 5 years ago

Several pods with dependency on Zipkin are affected by this issue, not only broker and in-memory-channel-dispatcher. They are:

The root cause is vendor/knative.dev/pkg/tracing/config/trace.go: NewTracingConfigFromMap returning nil when tracing configmap is malformed. The activator pod in serving is affected by this issue too.

So we'd better create an issue in serving and fix it there.

akashrv commented 5 years ago

/milestone v0.9.0

Harwayne commented 5 years ago

/project Observability To do

grantr commented 5 years ago

/milestone v0.10.0

github-actions[bot] commented 3 years ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.