Closed: Subhankar-Adak closed this issue 1 month ago.
Hi @Subhankar-Adak, could you provide the status of the pods in the knative-serving namespace? From your logs it seems the activator cannot connect to the autoscaler pod and fails. Could you also provide the logs of the autoscaler pod?
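For reference, the requested information can be gathered with standard kubectl commands. This is a sketch assuming a stock knative-serving install; the label selector and deployment name are the defaults, so adjust them if your cluster differs:

```shell
# Status of all pods in the knative-serving namespace
kubectl get pods -n knative-serving -o wide

# Logs of the autoscaler (default knative-serving deployment name)
kubectl logs -n knative-serving deploy/autoscaler --tail=200

# Full description of the failing activator pod
kubectl describe pod -n knative-serving -l app=activator
```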
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with `/reopen`. Mark the issue as fresh by adding the comment `/remove-lifecycle stale`.
What version of Knative?
v1.11.0
As part of the KServe deployment, we deploy Istio, cert-manager, and Knative as dependencies. Intermittently, the Knative deployment step fails: the Knative activator pod does not run properly and regularly goes into CrashLoopBackOff. The other pods in the knative-serving namespace are running properly.
Versions of Dependencies:
Environment Details:
Activator pod description:

Knative activator:
```
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-8nkbz:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:       Burstable
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ---                    ----               -------
  Normal   Scheduled  12m                    default-scheduler  Successfully assigned knative-serving/activator-59dff6d45c-wqt8w to v16regressionnode00002
  Normal   Pulled     12m                    kubelet            Container image "gcr.io/knative-releases/knative.dev/serving/cmd/activator@sha256:6b98eed95dd6dcc3d957e673aea3d271b768225442504316d713c08524f44ebe" already present on machine
  Normal   Created    12m                    kubelet            Created container activator
  Normal   Started    12m                    kubelet            Started container activator
  Warning  Unhealthy  11m (x5 over 12m)      kubelet            Liveness probe failed: HTTP probe failed with statuscode: 500
  Warning  Unhealthy  2m20s (x137 over 12m)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 500
```
Activator pod logs:
```
[root@v16regressionnode00003 ~]# kubectl logs activator-7bcc758ddd-wk7cd -n knative-serving
2024/04/25 11:22:05 Registering 2 clients
2024/04/25 11:22:05 Registering 3 informer factories
2024/04/25 11:22:05 Registering 4 informers
{"severity":"INFO","timestamp":"2024-04-25T11:22:05.716400581Z","logger":"activator","caller":"activator/main.go:140","message":"Starting the knative activator","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"INFO","timestamp":"2024-04-25T11:22:05.718542578Z","logger":"activator","caller":"activator/main.go:200","message":"Connecting to Autoscaler at ws://autoscaler.knative-serving.svc.cluster.local:8080","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"INFO","timestamp":"2024-04-25T11:22:05.718768882Z","logger":"activator","caller":"websocket/connection.go:161","message":"Connecting to ws://autoscaler.knative-serving.svc.cluster.local:8080","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"INFO","timestamp":"2024-04-25T11:22:05.719123778Z","logger":"activator","caller":"profiling/server.go:65","message":"Profiling enabled: false","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"INFO","timestamp":"2024-04-25T11:22:05.7237912Z","logger":"activator","caller":"activator/request_log.go:45","message":"Updated the request log template.","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd","template":""}
{"severity":"WARNING","timestamp":"2024-04-25T11:22:06.685484891Z","logger":"activator","caller":"handler/healthz_handler.go:36","message":"Healthcheck failed: connection has not yet been established","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"WARNING","timestamp":"2024-04-25T11:22:07.686801714Z","logger":"activator","caller":"handler/healthz_handler.go:36","message":"Healthcheck failed: connection has not yet been established","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd"}
{"severity":"ERROR","timestamp":"2024-04-25T11:22:08.719008181Z","logger":"activator","caller":"websocket/connection.go:144","message":"Websocket connection could not be established","commit":"f1617ef","knative.dev/controller":"activator","knative.dev/pod":"activator-7bcc758ddd-wk7cd","error":"dial tcp: lookup autoscaler.knative-serving.svc.cluster.local: i/o timeout","stacktrace":"knative.dev/pkg/websocket.NewDurableConnection.func1\n\tknative.dev/pkg@v0.0.0-20230718152110-aef227e72ead/websocket/connection.go:144\nknative.dev/pkg/websocket.(ManagedConnection).connect.func1\n\tknative.dev/pkg@v0.0.0-20230718152110-aef227e72ead/websocket/connection.go:225\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\tk8s.io/apimachinery@v0.26.5/pkg/util/wait/wait.go:222\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\tk8s.io/apimachinery@v0.26.5/pkg/util/wait/wait.go:235\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection\n\tk8s.io/apimachinery@v0.26.5/pkg/util/wait/wait.go:228\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\tk8s.io/apimachinery@v0.26.5/pkg/util/wait/wait.go:423\nknative.dev/pkg/websocket.(ManagedConnection).connect\n\tknative.dev/pkg@v0.0.0-20230718152110-aef227e72ead/websocket/connection.go:222\nknative.dev/pkg/websocket.NewDurableConnection.func2\n\tknative.dev/pkg@v0.0.0-20230718152110-aef227e72ead/websocket/connection.go:162"}
```
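Note that the error `dial tcp: lookup autoscaler.knative-serving.svc.cluster.local: i/o timeout` is a DNS resolution timeout, not a refused connection, so a minimal first check is whether cluster DNS answers for that service at all. A hedged sketch (the probe pod name and busybox image tag are arbitrary choices, not from this issue; the `k8s-app=kube-dns` label is the common CoreDNS default):

```shell
# Try to resolve the autoscaler service from inside the cluster
kubectl run dns-probe --rm -i --restart=Never --image=busybox:1.36 -n knative-serving -- \
  nslookup autoscaler.knative-serving.svc.cluster.local

# Check that the cluster DNS pods are healthy
kubectl get pods -n kube-system -l k8s-app=kube-dns

# Confirm the autoscaler Service and its endpoints exist
kubectl get svc,endpoints -n knative-serving autoscaler
```

If the lookup times out here too, the problem is cluster DNS (or a network policy blocking it) rather than Knative itself.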
Actual Behavior
Steps to Reproduce the Problem