oomichi / try-kubernetes

[It] validates resource limits of pods that are allowed to run [Conformance] failed #88

Closed: oomichi closed this issue 3 years ago

oomichi commented 5 years ago

After running the conformance tests from the latest master branch against a Kubernetes v1.15.1 cluster, the test failed as follows:

I0725 23:54:16.772743    2143 round_trippers.go:416] GET https://192.168.1.118:6443/apis/batch/v1/namespaces/sched-pred-379/jobs
I0725 23:54:16.772754    2143 round_trippers.go:423] Request Headers:
I0725 23:54:16.772758    2143 round_trippers.go:426]     Accept: application/json
I0725 23:54:16.772771    2143 round_trippers.go:426]     User-Agent: e2e.test/v1.16.0 (linux/amd64) kubernetes/69a34f6 -- [sig-scheduling] SchedulerPredicates [Serial] validates resource limits of pods that are allowed to run  [Conformance]
Jul 25 23:54:16.774: INFO: namespace sched-pred-379 deletion completed in 30.079073074s
[AfterEach] [sig-scheduling] SchedulerPredicates [Serial]
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/predicates.go:73

~ Failure [156.541 seconds]
[sig-scheduling] SchedulerPredicates [Serial]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/framework.go:23
  validates resource limits of pods that are allowed to run  [Conformance] [It]
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:698

  Unexpected error:
      <*errors.errorString | 0xc000289cd0>: {
          s: "timed out waiting for the condition",
      }
      timed out waiting for the condition
  occurred

  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/predicates.go:736
------------------------------
SSSS
------------------------------
[sig-scheduling] SchedulerPredicates [Serial]
  validates that NodeSelector is respected if matching  [Conformance]
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:698
oomichi commented 5 years ago

The failing test is defined in test/e2e/scheduling/predicates.go:

237         framework.ConformanceIt("validates resource limits of pods that are allowed to run ", func() {
238                 framework.WaitForStableCluster(cs, masterNodes)
239                 nodeMaxAllocatable := int64(0)
oomichi commented 5 years ago

The log contains "Creating another pod that requires unavailable amount of CPU.", which comes from this part of the test:

318                 ginkgo.By("Creating another pod that requires unavailable amount of CPU.")
319                 // Create another pod that requires 50% of the largest node CPU resources.
320                 // This pod should remain pending as at least 70% of CPU of other nodes in
321                 // the cluster are already consumed.
322                 podName := "additional-pod"
323                 conf := pausePodConfig{
324                         Name:   podName,
325                         Labels: map[string]string{"name": "additional"},
326                         Resources: &v1.ResourceRequirements{
327                                 Limits: v1.ResourceList{
328                                         v1.ResourceCPU: *resource.NewMilliQuantity(nodeMaxAllocatable*5/10, "DecimalSI"),
329                                 },
330                         },
331                 }
332                 WaitForSchedulerAfterAction(f, createPausePodAction(f, conf), ns, podName, false)
333                 verifyResult(cs, len(fillerPods), 1, ns)
334         })
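As a side note, here is a minimal sketch of how that Limits value is derived. The nodeMaxAllocatable value of 2000m is an assumption for illustration only (not taken from this cluster), and resource here is k8s.io/apimachinery/pkg/api/resource:

package main

import (
        "fmt"

        "k8s.io/apimachinery/pkg/api/resource"
)

func main() {
        // Assumed for illustration: the largest node exposes 2 CPUs = 2000m allocatable.
        nodeMaxAllocatable := int64(2000)
        // Same expression as the test: 50% of the largest node's allocatable CPU.
        limit := resource.NewMilliQuantity(nodeMaxAllocatable*5/10, resource.DecimalSI)
        fmt.Println(limit.String()) // prints "1", i.e. 1000m
}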
oomichi commented 5 years ago

Looking more closely, the test timed out partway through:

STEP: Creating another pod that requires unavailable amount of CPU.
STEP: Considering event:
Type = [Normal], Name = [filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d.15b51ab0117600c6], Reason = [Scheduled], Message = [Successfully assigned sched-pred-2053/filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d to k8s-node01]
STEP: Considering event:
Type = [Normal], Name = [filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d.15b51ab0915b8194], Reason = [Pulled], Message = [Container image "k8s.gcr.io/pause:3.1" already present on machine]
STEP: Considering event:
Type = [Normal], Name = [filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d.15b51ab0d84d55b8], Reason = [Created], Message = [Created container filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d]
STEP: Considering event:
Type = [Normal], Name = [filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d.15b51ab0ee543c4c], Reason = [Started], Message = [Started container filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d]
STEP: Considering event:
Type = [Normal], Name = [additional-pod.15b51ab17af85629], Reason = [Scheduled], Message = [Successfully assigned sched-pred-2053/additional-pod to k8s-master]
STEP: Considering event:
Type = [Normal], Name = [additional-pod.15b51ab2058fa2d4], Reason = [Pulled], Message = [Container image "k8s.gcr.io/pause:3.1" already present on machine]
STEP: Considering event:
Type = [Normal], Name = [additional-pod.15b51ab247826d46], Reason = [Created], Message = [Created container additional-pod]
STEP: Considering event:
Type = [Normal], Name = [additional-pod.15b51ab263709171], Reason = [Started], Message = [Started container additional-pod]

Jul 27 00:05:15.108: INFO: Unexpected error occurred: timed out waiting for the condition
STEP: removing the label node off the node k8s-node01
STEP: verifying the node doesn't have the label node
oomichi commented 5 years ago
318                 ginkgo.By("Creating another pod that requires unavailable amount of CPU.")
319                 // Create another pod that requires 50% of the largest node CPU resources.
320                 // This pod should remain pending as at least 70% of CPU of other nodes in
321                 // the cluster are already consumed.
322                 podName := "additional-pod"
323                 conf := pausePodConfig{
324                         Name:   podName,
325                         Labels: map[string]string{"name": "additional"},
326                         Resources: &v1.ResourceRequirements{
327                                 Limits: v1.ResourceList{
328                                         v1.ResourceCPU: *resource.NewMilliQuantity(nodeMaxAllocatable*5/10, "DecimalSI"),
329                                 },
330                         },
331                 }
332                 WaitForSchedulerAfterAction(f, createPausePodAction(f, conf), ns, podName, false)
333                 verifyResult(cs, len(fillerPods), 1, ns)
334         })
...
728 // WaitForSchedulerAfterAction performs the provided action and then waits for
729 // scheduler to act on the given pod.
730 func WaitForSchedulerAfterAction(f *framework.Framework, action common.Action, ns, podName string, expectSuccess bool) {
731         predicate := scheduleFailureEvent(podName)
732         if expectSuccess {
733                 predicate = scheduleSuccessEvent(ns, podName, "" /* any node */)
734         }
735         success, err := common.ObserveEventAfterAction(f, predicate, action)
736         framework.ExpectNoError(err)
737         framework.ExpectEqual(success, true)
738 }

Somewhere in there, "Considering event:" is being printed, and that is apparently where the timeout occurs. test/e2e/common/events.go:

 98 func ObserveEventAfterAction(f *framework.Framework, eventPredicate func(*v1.Event) bool, action Action) (bool, error) {
 99         observedMatchingEvent := false
100         informerStartedChan := make(chan struct{})
101         var informerStartedGuard sync.Once
102
103         // Create an informer to list/watch events from the test framework namespace.
104         _, controller := cache.NewInformer(
105                 &cache.ListWatch{
106                         ListFunc: func(options metav1.ListOptions) (runtime.Object, error) {
107                                 ls, err := f.ClientSet.CoreV1().Events(f.Namespace.Name).List(options)
108                                 return ls, err
109                         },
110                         WatchFunc: func(options metav1.ListOptions) (watch.Interface, error) {
111                                 // Signal parent goroutine that watching has begun.
112                                 defer informerStartedGuard.Do(func() { close(informerStartedChan) })
113                                 w, err := f.ClientSet.CoreV1().Events(f.Namespace.Name).Watch(options)
114                                 return w, err
115                         },
116                 },
117                 &v1.Event{},
118                 0,
119                 cache.ResourceEventHandlerFuncs{
120                         AddFunc: func(obj interface{}) {
121                                 e, ok := obj.(*v1.Event)
122                                 ginkgo.By(fmt.Sprintf("Considering event: \nType = [%s], Name = [%s], Reason = [%s], Message = [%s]", e.Type, e.Name, e.Reason, e.Message))
123                                 framework.ExpectEqual(ok, true)
124                                 if ok && eventPredicate(e) {
125                                         observedMatchingEvent = true
126                                 }
127                         },
128                 },
129         )
130
131         // Start the informer and block this goroutine waiting for the started signal.
132         informerStopChan := make(chan struct{})
133         defer func() { close(informerStopChan) }()
134         go controller.Run(informerStopChan)
135         <-informerStartedChan

So the call sequence is: the failing test -> WaitForSchedulerAfterAction() -> ObserveEventAfterAction(). ObserveEventAfterAction() performs the given action and then waits for the scheduler to act on the specified Pod. That appears to be where the timeout happened.

oomichi commented 5 years ago

The Pods transitioned through the following states: filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d.15b51ab0117600c6: Scheduled -> Pulled -> Created -> Started; additional-pod.15b51ab17af85629: Scheduled -> Pulled -> Created -> Started

STEP: Considering event:
Type = [Normal], Name = [filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d.15b51ab0117600c6], Reason = [Scheduled], Message = [Successfully assigned sched-pred-2053/filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d to k8s-node01]
STEP: Considering event:
Type = [Normal], Name = [filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d.15b51ab0915b8194], Reason = [Pulled], Message = [Container image "k8s.gcr.io/pause:3.1" already present on machine]
STEP: Considering event:
Type = [Normal], Name = [filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d.15b51ab0d84d55b8], Reason = [Created], Message = [Created container filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d]
STEP: Considering event:
Type = [Normal], Name = [filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d.15b51ab0ee543c4c], Reason = [Started], Message = [Started container filler-pod-87b2a116-8690-4fa5-9963-d6238fc6f99d]
STEP: Considering event:
Type = [Normal], Name = [additional-pod.15b51ab17af85629], Reason = [Scheduled], Message = [Successfully assigned sched-pred-2053/additional-pod to k8s-master]
STEP: Considering event:
Type = [Normal], Name = [additional-pod.15b51ab2058fa2d4], Reason = [Pulled], Message = [Container image "k8s.gcr.io/pause:3.1" already present on machine]
STEP: Considering event:
Type = [Normal], Name = [additional-pod.15b51ab247826d46], Reason = [Created], Message = [Created container additional-pod]
STEP: Considering event:
Type = [Normal], Name = [additional-pod.15b51ab263709171], Reason = [Started], Message = [Started container additional-pod]
oomichi commented 5 years ago
122                                 ginkgo.By(fmt.Sprintf("Considering event: \nType = [%s], Name = [%s], Reason = [%s], Message = [%s]", e.Type, e.Name, e.Reason, e.Message))
123                                 framework.ExpectEqual(ok, true)
124                                 if ok && eventPredicate(e) {
125                                         observedMatchingEvent = true
126                                 }

The problem is that observedMatchingEvent never becomes true here. ok must be true, since execution reaches line 123 just above. Therefore the real issue is that eventPredicate(e) never returns true.

oomichi commented 5 years ago

eventPredicate is supplied at the call site; in this case it is predicate := scheduleFailureEvent(podName).

 35 func scheduleFailureEvent(podName string) func(*v1.Event) bool {
 36         return func(e *v1.Event) bool {
 37                 return strings.HasPrefix(e.Name, podName) &&
 38                         e.Type == "Warning" &&
 39                         e.Reason == "FailedScheduling"
 40         }
 41 }
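For contrast, the success-path predicate chosen when expectSuccess is true would have to match the Normal/Scheduled events seen in the log instead. This is my rough illustrative sketch, not the actual helper in test/e2e/scheduling:

package sched

import (
        "fmt"
        "strings"

        v1 "k8s.io/api/core/v1"
)

// scheduleSuccessEventSketch approximates what a "scheduled successfully"
// predicate has to match, based on the events observed in the log above.
func scheduleSuccessEventSketch(ns, podName string) func(*v1.Event) bool {
        return func(e *v1.Event) bool {
                return strings.HasPrefix(e.Name, podName) &&
                        e.Type == "Normal" &&
                        e.Reason == "Scheduled" &&
                        strings.Contains(e.Message, fmt.Sprintf("assigned %s/%s", ns, podName))
        }
}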
oomichi commented 5 years ago

So the problem is that the expected event is a Warning with reason FailedScheduling, whereas the actual event is Normal with reason Scheduled, which is why the wait times out.

Was the test supposed to exercise a failure condition in the first place? Judging from the comment below, it was:

318                 ginkgo.By("Creating another pod that requires unavailable amount of CPU.")
319                 // Create another pod that requires 50% of the largest node CPU resources.
320                 // This pod should remain pending as at least 70% of CPU of other nodes in
321                 // the cluster are already consumed.
In other words: create another Pod that requests an unavailable amount of CPU, namely 50% of the largest node's CPU resources. That Pod should remain Pending, because at least 70% of the CPU on the other nodes in the cluster is already consumed.
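To make the mismatch concrete, here is a small self-contained check. The predicate is the one quoted above; the event values are taken from the log:

package main

import (
        "fmt"
        "strings"

        v1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func scheduleFailureEvent(podName string) func(*v1.Event) bool {
        return func(e *v1.Event) bool {
                return strings.HasPrefix(e.Name, podName) &&
                        e.Type == "Warning" &&
                        e.Reason == "FailedScheduling"
        }
}

func main() {
        // The event additional-pod actually produced, per the log above.
        observed := &v1.Event{
                ObjectMeta: metav1.ObjectMeta{Name: "additional-pod.15b51ab17af85629"},
                Type:       "Normal",
                Reason:     "Scheduled",
        }
        // Prints false: the predicate only matches Warning/FailedScheduling,
        // so ObserveEventAfterAction never sees a match and eventually times out.
        fmt.Println(scheduleFailureEvent("additional-pod")(observed))
}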
oomichi commented 5 years ago

How to check per-node CPU resources in Kubernetes

kubectl top node should show this, but because metrics-server is not installed it fails as follows:

$ kubectl top node
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)

In the first place, if the CPU resource situation cannot be observed, can the test even apply the intended load? Re-run the test after installing metrics-server.

Applied https://github.com/oomichi/try-kubernetes#enable-metrics-server-for-hpa to k8s v1.15.1. The metrics are now available:

$ kubectl top node
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s-master   113m         5%     1271Mi          33%
k8s-node01   116m         5%     902Mi           23%
oomichi commented 5 years ago

After installing metrics-server, I re-ran the test and the failure reproduced. CPU usage during the test run was also quite low:

$ kubectl top node
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s-master   124m         6%     1251Mi          32%
k8s-node01   117m         5%     906Mi           23%
oomichi commented 5 years ago

Next, check how the test applies the CPU load.

oomichi commented 5 years ago

The guess above is probably wrong. The expectation seems to be that Kubernetes reserves the requested resources for the initial filler Pods, so even if they are not actually using any CPU (no real load), no further Pods should fit on those nodes. So instead of per-node CPU usage, I need a way to see the per-node reserved (requested) resources.
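For reference, the per-node requested (reserved) resources, as opposed to live usage, are visible under "Allocated resources" in kubectl describe node. Output abbreviated; the numbers below are illustrative, not from this cluster:

$ kubectl describe node k8s-node01
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests     Limits
  --------  --------     ------
  cpu       1210m (60%)  1210m (60%)
  memory    390Mi (10%)  390Mi (10%)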

oomichi commented 5 years ago

Isolate whether the cause is that the master node is able to host workloads.

  1. With no taint on the master (workloads can be scheduled on it):
    
    $ kubectl describe node k8s-master | grep Taint
    Taints:             <none>
    $ go run hack/e2e.go -- --provider=skeleton --test --test_args="--ginkgo.focus=validates\sresource\slimits\sof\spods\sthat\sare\sallowed" --check-version-skew=false
    ...
    Summarizing 1 Failure:

[Fail] [sig-scheduling] SchedulerPredicates [Serial] [It] validates resource limits of pods that are allowed to run [Conformance]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/predicates.go:738

Ran 1 of 4798 Specs in 142.835 seconds
FAIL! -- 0 Passed | 1 Failed | 0 Pending | 4797 Skipped
--- FAIL: TestE2E (142.85s)

2. With a taint on the master (workloads cannot be scheduled on it):

$ kubectl taint nodes k8s-master key=value:NoSchedule
$ kubectl describe node k8s-master | grep Taint
Taints:             key=value:NoSchedule
$ go run hack/e2e.go -- --provider=skeleton --test --test_args="--ginkgo.focus=validates\sresource\slimits\sof\spods\sthat\sare\sallowed" --check-version-skew=false
...
Ran 1 of 4798 Specs in 13.629 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 4797 Skipped
PASS
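(To restore the original untainted state after this experiment, the taint added above can be removed by appending "-" to the taint spec:)

$ kubectl taint nodes k8s-master key=value:NoSchedule-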

oomichi commented 5 years ago

As expected, the cause appears to be whether the master is tainted. Filed as k/kubernetes/issues/81103; further work will continue there.

oomichi commented 5 years ago

Even after the current PR is merged, a problem remains:

~ Failure [21.715 seconds]
[sig-scheduling] SchedulerPredicates [Serial]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/scheduling/framework.go:23
  validates resource limits of pods that are allowed to run  [Conformance] [It]
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:699

  Expected
      <int>: 1
  to equal
      <int>: 2

  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/util.go:1359

The test expected 2 Pods to be Scheduled, but only 1 was, hence the error.

752 // verifyReplicasResult is wrapper of verifyResult for a group pods with same "name: labelName" label, which means they belong to same RC
753 func verifyReplicasResult(c clientset.Interface, expectedScheduled int, expectedNotScheduled int, ns string, labelName string) {
754         allPods := getPodsByLabels(c, ns, map[string]string{"name": labelName})
755         scheduledPods, notScheduledPods := e2epod.GetPodsScheduled(masterNodes, allPods)
756
757         framework.ExpectEqual(len(notScheduledPods), expectedNotScheduled, fmt.Sprintf("Not scheduled Pods: %#v", notScheduledPods))
758         framework.ExpectEqual(len(scheduledPods), expectedScheduled, fmt.Sprintf("Scheduled Pods: %#v", scheduledPods))
759 }
...
334                 WaitForSchedulerAfterAction(f, createPausePodAction(f, conf), ns, podName, false)
335                 verifyResult(cs, len(fillerPods), 1, ns)

In other words, len(fillerPods) was 2, but only one filler Pod was counted as scheduled. GetPodsScheduled looks like the problem.
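Here is a rough illustrative sketch of the suspected behavior. It is my simplification, assuming the real GetPodsScheduled skips Pods hosted on master nodes; it is not the actual framework code:

package sched

import (
        v1 "k8s.io/api/core/v1"
        "k8s.io/apimachinery/pkg/util/sets"
)

// getPodsScheduledSketch: if Pods whose NodeName is a master node are skipped,
// a filler Pod assigned to k8s-master is never counted, so only 1 of the
// expected 2 scheduled Pods shows up.
func getPodsScheduledSketch(masterNodes sets.String, pods *v1.PodList) (scheduled, notScheduled []v1.Pod) {
        for _, pod := range pods.Items {
                if masterNodes.Has(pod.Spec.NodeName) {
                        continue // Pods on the master are ignored entirely
                }
                if pod.Spec.NodeName != "" {
                        scheduled = append(scheduled, pod)
                } else {
                        notScheduled = append(notScheduled, pod)
                }
        }
        return scheduled, notScheduled
}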

oomichi commented 5 years ago

Whether a Conformance test should even be required to support an environment where the master node is not tainted is under discussion: https://kubernetes.slack.com/archives/C78F00H99/p1565312069118600