Closed vaikas closed 3 years ago
Looking...
Here's the step doing the wait:
- name: Wait for things to be up
  run: |
    kubectl wait pod --for=condition=Ready -n ${SYSTEM_NAMESPACE} -l '!job-name'
https://github.com/knative/eventing/runs/1406746500?check_suite_focus=true
I1116 14:19:57.809327 27973 round_trippers.go:423] curl -k -v -XGET -H "Accept: application/json" -H "User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a" 'https://127.0.0.1:36147/api/v1/namespaces/knative-eventing/pods?fieldSelector=metadata.name%3Deventing-webhook-6bd5798587-4zv5s&resourceVersion=2808&watch=true'
I1116 14:19:57.809985 27973 round_trippers.go:443] GET https://127.0.0.1:36147/api/v1/namespaces/knative-eventing/pods?fieldSelector=metadata.name%3Deventing-webhook-6bd5798587-4zv5s&resourceVersion=2808&watch=true 200 OK in 0 milliseconds
I1116 14:19:57.810003 27973 round_trippers.go:449] Response Headers:
I1116 14:19:57.810008 27973 round_trippers.go:452] Cache-Control: no-cache, private
I1116 14:19:57.810011 27973 round_trippers.go:452] Content-Type: application/json
I1116 14:19:57.810014 27973 round_trippers.go:452] Date: Mon, 16 Nov 2020 14:19:57 GMT
I1116 14:20:27.810546 27973 round_trippers.go:423] curl -k -v -XGET -H "Accept: application/json" -H "User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a" 'https://127.0.0.1:36147/api/v1/namespaces/knative-eventing/pods?fieldSelector=metadata.name%3Deventing-webhook-6bd5798587-k6mr8'
I1116 14:20:27.813242 27973 round_trippers.go:443] GET https://127.0.0.1:36147/api/v1/namespaces/knative-eventing/pods?fieldSelector=metadata.name%3Deventing-webhook-6bd5798587-k6mr8 200 OK in 2 milliseconds
I1116 14:20:27.813259 27973 round_trippers.go:449] Response Headers:
pod/eventing-webhook-6bd5798587-k6mr8 condition met
I1116 14:20:27.813263 27973 round_trippers.go:452] Cache-Control: no-cache, private
I1116 14:20:27.813267 27973 round_trippers.go:452] Content-Type: application/json
I1116 14:20:27.813270 27973 round_trippers.go:452] Date: Mon, 16 Nov 2020 14:20:27 GMT
I1116 14:20:27.813811 27973 request.go:1097] Response Body: {"kind":"PodList","apiVersion":"v1","metadata":{"selfLink":"/api/v1/namespaces/knative-eventing/pods","resourceVersion":"3232"},"items":[{"metadata":{"name":"eventing-webhook-6bd5798587-k6mr8","generateName":"eventing-webhook-6bd5798587-","namespace":"knative-eventing","selfLink":"/api/v1/namespaces/knative-eventing/pods/eventing-webhook-6bd5798587-k6mr8","uid":"015fa6f7-88e9-45e5-95b2-91b2da58ee13","resourceVersion":"2579","creationTimestamp":"2020-11-16T14:19:36Z","labels":{"app":"eventing-webhook","pod-template-hash":"6bd5798587","role":"eventing-webhook"},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"eventing-webhook-6bd5798587","uid":"59b46e0b-631b-4f3e-8dba-584e35bc6dea","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2020-11-16T14:19:36Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{},"f:role":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"59b46e0b-631b-4f3e-8dba-584e35bc6dea\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:affinity":{".":{},"f:podAntiAffinity":{".":{},"f:preferredDuringSchedulingIgnoredDuringExecution":{}}},"f:containers":{"k:{\"name\":\"eventing-webhook\"}":{".":{},"f:env":{".":{},"k:{\"name\":\"CONFIG_LOGGING_NAME\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"METRICS_DOMAIN\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"POD_NAME\"}":{".":{},"f:name":{},"f:valueFrom":{".":{},"f:fieldRef":{".":{},"f:apiVersion":{},"f:fieldPath":{}}}},"k:{\"name\":\"SINK_BINDING_SELECTION_MODE\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"SYSTEM_NAMESPACE\"}":{".":{},"f:name":{},"f:valueFrom":{".":{},"f:fieldRef":{".":{},"f:apiVersion":{},"f:fieldPath":{}}}},"k:{\"name\":\"WEBHOOK_NAME\"}":{".":{},"f:name":{}
,"f:value":{}},"k:{\"name\":\"WEBHOOK_PORT\"}":{".":{},"f:name":{},"f:value":{}}},"f:image":{},"f:imagePullPolicy":{},"f:livenessProbe":{".":{},"f:failureThreshold":{},"f:httpGet":{".":{},"f:httpHeaders":{},"f:path":{},"f:port":{},"f:scheme":{}},"f:initialDelaySeconds":{},"f:periodSeconds":{},"f:successThreshold":{},"f:timeoutSeconds":{}},"f:name":{},"f:ports":{".":{},"k:{\"containerPort\":8008,\"protocol\":\"TCP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":8443,\"protocol\":\"TCP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":9090,\"protocol\":\"TCP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}}},"f:readinessProbe":{".":{},"f:failureThreshold":{},"f:httpGet":{".":{},"f:httpHeaders":{},"f:path":{},"f:port":{},"f:scheme":{}},"f:periodSeconds":{},"f:successThreshold":{},"f:timeoutSeconds":{}},"f:resources":{".":{},"f:limits":{".":{},"f:cpu":{},"f:memory":{}},"f:requests":{".":{},"f:cpu":{},"f:memory":{}}},"f:securityContext":{".":{},"f:allowPrivilegeEscalation":{}},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:serviceAccount":{},"f:serviceAccountName":{},"f:terminationGracePeriodSeconds":{}}}},{"manager":"kubelet","operation":"Update","apiVersion":"v1","time":"2020-11-16T14:19:42Z","fieldsType":"FieldsV1","fieldsV1":{"f:status":{"f:conditions":{"k:{\"type\":\"ContainersReady\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Initialized\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Ready\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}}},"f:containerStatuses":{},"f:hostIP":{},"f:phase":{},"f:podIP":{},"f:podIPs":{".":{},"k:{\"ip\":\"10.244.1.19\"}":{".":{},"f:ip":{}}},"f:startTime":{}}}}]},"spec":{"volumes":[{"nam
e":"eventing-webhook-token-bqv7v","secret":{"secretName":"eventing-webhook-token-bqv7v","defaultMode":420}}],"containers":[{"name":"eventing-webhook","image":"kind.local/knative.dev/eventing/cmd/webhook:1af4fd82f9a9ff68e3f5768dda777cabfe0e349429cf8289bdc3f32b533b60a4","ports":[{"name":"https-webhook","containerPort":8443,"protocol":"TCP"},{"name":"metrics","containerPort":9090,"protocol":"TCP"},{"name":"profiling","containerPort":8008,"protocol":"TCP"}],"env":[{"name":"SYSTEM_NAMESPACE","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}},{"name":"CONFIG_LOGGING_NAME","value":"config-logging"},{"name":"METRICS_DOMAIN","value":"knative.dev/eventing"},{"name":"WEBHOOK_NAME","value":"eventing-webhook"},{"name":"WEBHOOK_PORT","value":"8443"},{"name":"SINK_BINDING_SELECTION_MODE","value":"exclusion"},{"name":"POD_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.name"}}}],"resources":{"limits":{"cpu":"200m","memory":"200Mi"},"requests":{"cpu":"20m","memory":"20Mi"}},"volumeMounts":[{"name":"eventing-webhook-token-bqv7v","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"livenessProbe":{"httpGet":{"path":"/","port":8443,"scheme":"HTTPS","httpHeaders":[{"name":"k-kubelet-probe","value":"webhook"}]},"initialDelaySeconds":20,"timeoutSeconds":1,"periodSeconds":1,"successThreshold":1,"failureThreshold":3},"readinessProbe":{"httpGet":{"path":"/","port":8443,"scheme":"HTTPS","httpHeaders":[{"name":"k-kubelet-probe","value":"webhook"}]},"timeoutSeconds":1,"periodSeconds":1,"successThreshold":1,"failureThreshold":3},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"FallbackToLogsOnError","imagePullPolicy":"IfNotPresent","securityContext":{"allowPrivilegeEscalation":false}}],"restartPolicy":"Always","terminationGracePeriodSeconds":300,"dnsPolicy":"ClusterFirst","serviceAccountName":"eventing-webhook","serviceAccount":"eventing-webhook","nodeName":"kind-worker","securityContext":{},"
affinity":{"podAntiAffinity":{"preferredDuringSchedulingIgnoredDuringExecution":[{"weight":100,"podAffinityTerm":{"labelSelector":{"matchLabels":{"app":"eventing-webhook"}},"topologyKey":"kubernetes.io/hostname"}}]}},"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernetes.io/not-ready","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":300}],"priority":0,"enableServiceLinks":true},"status":{"phase":"Running","conditions":[{"type":"Initialized","status":"True","lastProbeTime":null,"lastTransitionTime":"2020-11-16T14:19:37Z"},{"type":"Ready","status":"True","lastProbeTime":null,"lastTransitionTime":"2020-11-16T14:19:42Z"},{"type":"ContainersReady","status":"True","lastProbeTime":null,"lastTransitionTime":"2020-11-16T14:19:42Z"},{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2020-11-16T14:19:36Z"}],"hostIP":"172.18.0.3","podIP":"10.244.1.19","podIPs":[{"ip":"10.244.1.19"}],"startTime":"2020-11-16T14:19:37Z","containerStatuses":[{"name":"eventing-webhook","state":{"running":{"startedAt":"2020-11-16T14:19:41Z"}},"lastState":{},"ready":true,"restartCount":0,"image":"kind.local/knative.dev/eventing/cmd/webhook:1af4fd82f9a9ff68e3f5768dda777cabfe0e349429cf8289bdc3f32b533b60a4","imageID":"sha256:a5bffea29ff5b9b24ad286ce1981725ff772e6981a6e246583282cd96e094715","containerID":"containerd://3111ec724700c26c3191da72d020c217706ad6bea200668f0caac75882561733","started":true}],"qosClass":"Burstable"}}]
Yet the test failed with:
F1116 14:20:27.856761 27973 helpers.go:115] error: timed out waiting for the condition on pods/eventing-webhook-6bd5798587-4zv5s
goroutine 1 [running]:
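The log above is consistent with `kubectl wait` resolving the label selector once and then watching each matched pod by name: the watch on `eventing-webhook-6bd5798587-4zv5s` starts at 14:19:57 and the fatal timeout lands at 14:20:27, which matches the default 30s `--timeout`. If that pod is replaced in the meantime, the wait on its name can never succeed. A minimal sketch of one possible mitigation is to re-run the whole command so that each attempt lists the pods again. The helper name `retry_wait` and the `RETRY_SLEEP_SECONDS` variable are hypothetical and not part of the Knative workflow.

```shell
# Hypothetical helper (not from the Knative repo): run a command up to N
# times and succeed as soon as one attempt succeeds.
retry_wait() {
  _attempts="$1"
  shift
  _i=1
  while [ "$_i" -le "$_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt ${_i}/${_attempts} failed: $*" >&2
    # Sleep only when another attempt remains.
    # RETRY_SLEEP_SECONDS is an assumed knob; it defaults to 2 seconds.
    if [ "$_i" -lt "$_attempts" ]; then
      sleep "${RETRY_SLEEP_SECONDS:-2}"
    fi
    _i=$((_i + 1))
  done
  return 1
}

# Usage, assuming kubectl and SYSTEM_NAMESPACE are set up as in the step above:
# retry_wait 5 kubectl wait pod --for=condition=Ready \
#   -n "${SYSTEM_NAMESPACE}" -l '!job-name' --timeout=60s
```

Each retry re-evaluates the `-l '!job-name'` selector, so a replacement pod is picked up on the next attempt. A pod that is still terminating may also be matched again, so this narrows the race rather than removing it.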
Is this the same problem as https://github.com/knative/eventing/issues/3244?
In knative-gcp we see the webhook crash, but maybe Knative Eventing restarts it automatically?
@zhongduo I don't think so because the webhook becomes ready.
But as you said, it is already a different pod. So it may well be that we have some logic that detects the crash or unreadiness and restarts the pod, which accidentally solves the problem.
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.
This should've been fixed by: https://github.com/knative/eventing/pull/4741
Let's reopen if it comes back.
Describe the bug The eventing webhook sometimes does not become ready. It looks like the specific pod that the wait loop is waiting for gets replaced (maybe because of chaos duck?) by another pod that does become ready.
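If the cause is indeed a pod being replaced while the wait is pinned to its name, one option is to wait on the Deployments rather than on individual pods, since a Deployment's `Available` condition does not depend on which pod ends up serving. The following is only a sketch under that assumption. The function name `wait_for_deployments` and the 120s default are illustrative, and this is not necessarily how the problem was eventually fixed.

```shell
# Hypothetical helper: wait for every Deployment in a namespace to report
# the Available condition, instead of waiting on individual pod names.
wait_for_deployments() {
  _ns="$1"
  _timeout="${2:-120s}"  # assumed default, tune for the CI environment
  kubectl wait deployment --all --for=condition=Available \
    -n "$_ns" --timeout="$_timeout"
}

# Usage, assuming SYSTEM_NAMESPACE is exported by the workflow:
# wait_for_deployments "${SYSTEM_NAMESPACE}" 5m
```

Waiting at the Deployment level trades precision for robustness: it says nothing about pods not owned by a Deployment, so a pod-level wait may still be needed for those.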
From one example here: https://github.com/knative/eventing/pull/4492/checks?check_run_id=1381350649
Then when the artifacts are dumped, note a different webhook pod comes up:
Expected behavior Tests should not fail because of test setup failures.
To Reproduce Look at some of these failing tests here: https://github.com/knative/eventing/actions?query=workflow%3A%22KinD+e2e+tests%22
Knative release version head