Closed oomichi closed 6 years ago
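This test has existed as a Conformance test since March 2017. It passed in the previous run against v1.10, so it appears to have started failing with the current environment.
Could leftover namespaces from earlier e2e test runs be the cause? Try deleting the namespace named in the error output.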
$ kubectl get namespaces
NAME STATUS AGE
default Active 21d
e2e-tests-horizontal-pod-autoscaling-cndn6 Active 20h
e2e-tests-horizontal-pod-autoscaling-fcq2t Active 3h
e2e-tests-horizontal-pod-autoscaling-fhrzw Active 21h
e2e-tests-horizontal-pod-autoscaling-k9d6r Active 4h
e2e-tests-horizontal-pod-autoscaling-qbghv Active 3d
e2e-tests-horizontal-pod-autoscaling-rhnzt Active 3d
kube-public Active 21d
kube-system Active 21d
$ kubectl delete namespace e2e-tests-horizontal-pod-autoscaling-cndn6
...
~ Failure in Spec Setup (BeforeEach) [66.439 seconds]
[sig-scheduling] SchedulerPredicates [Serial]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/framework.go:22
validates resource limits of pods that are allowed to run [Conformance] [BeforeEach]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:684
Expected error:
<*errors.errorString | 0xc420a6ca00>: {
s: "Namespace e2e-tests-horizontal-pod-autoscaling-fcq2t is active",
}
Namespace e2e-tests-horizontal-pod-autoscaling-fcq2t is active
not to have occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/predicates.go:89
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSAug 7 21:04:16.153: INFO: Running AfterSuite actions on all node
Aug 7 21:04:16.153: INFO: Running AfterSuite actions on node 1
Summarizing 1 Failure:
[Fail] [sig-scheduling] SchedulerPredicates [Serial] [BeforeEach] validates resource limits of pods that are allowed to run [Conformance]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/predicates.go:89
Ran 1 of 999 Specs in 66.525 seconds
FAIL! -- 0 Passed | 1 Failed | 0 Pending | 998 Skipped --- FAIL: TestE2E (66.56s)
FAIL
Ginkgo ran 1 suite in 1m6.76832776s
Test Suite Failed
!!! Error in ./hack/ginkgo-e2e.sh:143
Error in ./hack/ginkgo-e2e.sh:143. '"${ginkgo}" "${ginkgo_args[@]:+${ginkgo_args[@]}}" "${e2e_test}" -- "${auth_config[@]:+${auth_config[@]}}" --ginkgo.flakeAttempts="${FLAKE_ATTEMPTS}" --host="${KUBE_MASTER_URL}" --provider="${KUBERNETES_PROVIDER}" --gce-project="${PROJECT:-}" --gce-zone="${ZONE:-}" --gce-region="${REGION:-}" --gce-multizone="${MULTIZONE:-false}" --gke-cluster="${CLUSTER_NAME:-}" --kube-master="${KUBE_MASTER:-}" --cluster-tag="${CLUSTER_ID:-}" --cloud-config-file="${CLOUD_CONFIG:-}" --repo-root="${KUBE_ROOT}" --node-instance-group="${NODE_INSTANCE_GROUP:-}" --prefix="${KUBE_GCE_INSTANCE_PREFIX:-e2e}" --network="${KUBE_GCE_NETWORK:-${KUBE_GKE_NETWORK:-e2e}}" --node-tag="${NODE_TAG:-}" --master-tag="${MASTER_TAG:-}" --cluster-monitoring-mode="${KUBE_ENABLE_CLUSTER_MONITORING:-standalone}" --prometheus-monitoring="${KUBE_ENABLE_PROMETHEUS_MONITORING:-false}" ${KUBE_CONTAINER_RUNTIME:+"--container-runtime=${KUBE_CONTAINER_RUNTIME}"} ${MASTER_OS_DISTRIBUTION:+"--master-os-distro=${MASTER_OS_DISTRIBUTION}"} ${NODE_OS_DISTRIBUTION:+"--node-os-distro=${NODE_OS_DISTRIBUTION}"} ${NUM_NODES:+"--num-nodes=${NUM_NODES}"} ${E2E_REPORT_DIR:+"--report-dir=${E2E_REPORT_DIR}"} ${E2E_REPORT_PREFIX:+"--report-prefix=${E2E_REPORT_PREFIX}"} "${@:-}"' exited with status 1
Call stack:
1: ./hack/ginkgo-e2e.sh:143 main(...)
Exiting with status 1
2018/08/07 21:04:16 process.go:155: Step './hack/ginkgo-e2e.sh --ginkgo.focus=validates\sresource\slimits\sof\spods\sthat\sare\sallowed\sto\srun' finished in 1m6.974759394s
2018/08/07 21:04:16 main.go:309: Something went wrong: encountered 1 errors: [error during ./hack/ginkgo-e2e.sh --ginkgo.focus=validates\sresource\slimits\sof\spods\sthat\sare\sallowed\sto\srun: exit status 1]
2018/08/07 21:04:16 e2e.go:81: err: exit status 1
exit status 1
The error was caused by the other leftover namespaces. Delete all of the leftover namespaces and rerun the test. → Now a timeout error occurs.
$ kubectl delete namespace e2e-tests-horizontal-pod-autoscaling-k9d6r e2e-tests-horizontal-pod-autoscaling-qbghv e2e-tests-horizontal-pod-autoscaling-rhnzt
...
$ kubectl get namespaces
NAME STATUS AGE
default Active 21d
kube-public Active 21d
kube-system Active 21d
$
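The manual cleanup above could also be scripted. The following is a minimal sketch, assuming every leftover namespace shares the "e2e-tests-" prefix seen in the listing; the helper name leftoverE2ENamespaces is hypothetical and the program only prints the kubectl commands rather than running them.

```go
package main

import (
	"fmt"
	"strings"
)

// leftoverE2ENamespaces returns the names that start with prefix.
// Hypothetical helper: it assumes leftover e2e namespaces can be
// recognized by their name prefix alone.
func leftoverE2ENamespaces(names []string, prefix string) []string {
	var out []string
	for _, n := range names {
		if strings.HasPrefix(n, prefix) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// Names copied from the first "kubectl get namespaces" listing.
	names := []string{
		"default",
		"e2e-tests-horizontal-pod-autoscaling-cndn6",
		"e2e-tests-horizontal-pod-autoscaling-fcq2t",
		"e2e-tests-horizontal-pod-autoscaling-fhrzw",
		"e2e-tests-horizontal-pod-autoscaling-k9d6r",
		"e2e-tests-horizontal-pod-autoscaling-qbghv",
		"e2e-tests-horizontal-pod-autoscaling-rhnzt",
		"kube-public",
		"kube-system",
	}
	// Print one delete command per leftover namespace for review.
	for _, n := range leftoverE2ENamespaces(names, "e2e-tests-") {
		fmt.Println("kubectl delete namespace", n)
	}
}
```

Printing the commands instead of executing them keeps the system namespaces (default, kube-public, kube-system) safe from an overly broad match.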
$ go run hack/e2e.go -- --provider=skeleton --test --test_args="--ginkgo.focus=validates\sresource\slimits\sof\spods\sthat\sare\sallowed\sto\srun" --check-version-skew=false
...
Latency metrics for node k8s-node01
STEP: Dumping a list of prepulled images on each node...
Aug 7 21:09:23.359: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready
STEP: Destroying namespace "e2e-tests-sched-pred-5snqq" for this suite.
Aug 7 21:09:41.390: INFO: Waiting up to 30s for server preferred namespaced resources to be successfully discovered
Aug 7 21:09:41.459: INFO: namespace: e2e-tests-sched-pred-5snqq, resource: bindings, ignored listing per whitelist
Aug 7 21:09:41.510: INFO: namespace e2e-tests-sched-pred-5snqq deletion completed in 18.146813186s
[AfterEach] [sig-scheduling] SchedulerPredicates [Serial]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/predicates.go:71
~ Failure [202.622 seconds]
[sig-scheduling] SchedulerPredicates [Serial]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/framework.go:22
validates resource limits of pods that are allowed to run [Conformance] [It]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:684
Expected error:
<*errors.errorString | 0xc420085550>: {
s: "timed out waiting for the condition",
}
timed out waiting for the condition
not to have occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/predicates.go:730
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSAug 7 21:09:41.512: INFO: Running AfterSuite actions on all node
Aug 7 21:09:41.512: INFO: Running AfterSuite actions on node 1
Summarizing 1 Failure:
[Fail] [sig-scheduling] SchedulerPredicates [Serial] [It] validates resource limits of pods that are allowed to run [Conformance]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/predicates.go:730
Ran 1 of 999 Specs in 202.681 seconds
FAIL! -- 0 Passed | 1 Failed | 0 Pending | 998 Skipped --- FAIL: TestE2E (202.71s)
FAIL
Ginkgo ran 1 suite in 3m22.92137333s
Test Suite Failed
!!! Error in ./hack/ginkgo-e2e.sh:143
Error in ./hack/ginkgo-e2e.sh:143. '"${ginkgo}" "${ginkgo_args[@]:+${ginkgo_args[@]}}" "${e2e_test}" -- "${auth_config[@]:+${auth_config[@]}}" --ginkgo.flakeAttempts="${FLAKE_ATTEMPTS}" --host="${KUBE_MASTER_URL}" --provider="${KUBERNETES_PROVIDER}" --gce-project="${PROJECT:-}" --gce-zone="${ZONE:-}" --gce-region="${REGION:-}" --gce-multizone="${MULTIZONE:-false}" --gke-cluster="${CLUSTER_NAME:-}" --kube-master="${KUBE_MASTER:-}" --cluster-tag="${CLUSTER_ID:-}" --cloud-config-file="${CLOUD_CONFIG:-}" --repo-root="${KUBE_ROOT}" --node-instance-group="${NODE_INSTANCE_GROUP:-}" --prefix="${KUBE_GCE_INSTANCE_PREFIX:-e2e}" --network="${KUBE_GCE_NETWORK:-${KUBE_GKE_NETWORK:-e2e}}" --node-tag="${NODE_TAG:-}" --master-tag="${MASTER_TAG:-}" --cluster-monitoring-mode="${KUBE_ENABLE_CLUSTER_MONITORING:-standalone}" --prometheus-monitoring="${KUBE_ENABLE_PROMETHEUS_MONITORING:-false}" ${KUBE_CONTAINER_RUNTIME:+"--container-runtime=${KUBE_CONTAINER_RUNTIME}"} ${MASTER_OS_DISTRIBUTION:+"--master-os-distro=${MASTER_OS_DISTRIBUTION}"} ${NODE_OS_DISTRIBUTION:+"--node-os-distro=${NODE_OS_DISTRIBUTION}"} ${NUM_NODES:+"--num-nodes=${NUM_NODES}"} ${E2E_REPORT_DIR:+"--report-dir=${E2E_REPORT_DIR}"} ${E2E_REPORT_PREFIX:+"--report-prefix=${E2E_REPORT_PREFIX}"} "${@:-}"' exited with status 1
Call stack:
1: ./hack/ginkgo-e2e.sh:143 main(...)
Exiting with status 1
2018/08/07 21:09:41 process.go:155: Step './hack/ginkgo-e2e.sh --ginkgo.focus=validates\sresource\slimits\sof\spods\sthat\sare\sallowed\sto\srun' finished in 3m23.132616477s
2018/08/07 21:09:41 main.go:309: Something went wrong: encountered 1 errors: [error during ./hack/ginkgo-e2e.sh --ginkgo.focus=validates\sresource\slimits\sof\spods\sthat\sare\sallowed\sto\srun: exit status 1]
2018/08/07 21:09:41 e2e.go:81: err: exit status 1
exit status 1
The failure shows up at AfterEach, so did the main part of the test run correctly and only the cleanup fail?
test/e2e/scheduling/predicates.go:730
722 // WaitForSchedulerAfterAction performs the provided action and then waits for
723 // scheduler to act on the given pod.
724 func WaitForSchedulerAfterAction(f *framework.Framework, action common.Action, ns, podName string, expectSuccess bool) {
725     predicate := scheduleFailureEvent(podName)
726     if expectSuccess {
727         predicate = scheduleSuccessEvent(ns, podName, "" /* any node */)
728     }
729     success, err := common.ObserveEventAfterAction(f, predicate, action)
730     Expect(err).NotTo(HaveOccurred())
731     Expect(success).To(Equal(true))
732 }
ObserveEventAfterAction is called only from the one place shown above.
test/e2e/common/events.go
99 func ObserveEventAfterAction(f *framework.Framework, eventPredicate func(*v1.Event) bool, action Action) (bool, error) {
...
144     // Poll whether the informer has found a matching event with a timeout.
145     // Wait up 2 minutes polling every second.
146     timeout := 2 * time.Minute
147     interval := 1 * time.Second
148     err = wait.Poll(interval, timeout, func() (bool, error) {
149         return observedMatchingEvent, nil
150     })
151     return err == nil, err    <- this is about the only error handling that looks related to the timeout
152 }
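The failure no longer occurs in a cleanly deployed environment.
Part of the Conformance test failure investigation: https://github.com/oomichi/try-kubernetes/issues/36
Summary
The purpose of the test is to verify that resource limits on Pods work correctly.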
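As a rough illustration of the kind of condition the purpose statement refers to (not the e2e test's actual code), a pod is allowed to run on a node only if its CPU request still fits into what the node can allocate. The function name fits and all numbers below are hypothetical, and a real scheduler also considers memory and other predicates.

```go
package main

import "fmt"

// fits reports whether a pod requesting podMilliCPU can be placed on a
// node with allocatableMilliCPU, given the requests of the pods already
// running there. Hypothetical helper covering CPU only.
func fits(allocatableMilliCPU int64, existing []int64, podMilliCPU int64) bool {
	var used int64
	for _, r := range existing {
		used += r
	}
	return used+podMilliCPU <= allocatableMilliCPU
}

func main() {
	existing := []int64{500, 700} // 1200m already requested

	fmt.Println(fits(2000, existing, 800)) // true: 1200m + 800m == 2000m
	fmt.Println(fits(2000, existing, 801)) // false: exceeds allocatable by 1m
}
```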
Test log