oomichi / try-kubernetes

12 stars 5 forks source link

3 / 188 Conformance tests failed on k8s v1.12.2 cluster #63

Closed oomichi closed 5 years ago

oomichi commented 5 years ago

本問題の予想再現シナリオ

  1. 旧バージョンで e2e conformance test を実行
  2. バージョンアップを実施
  3. 新バージョンで e2e conformance test を実行

実行時間: 1時間46分 失敗テスト

Update Demo [It] should create and stop a replication controller
Update Demo [It] should scale a replication controller
Update Demo [It] should do a rolling update of a replication controller

新規 v1.13.1 環境だと成功している。 実行時間は2時間7分

$ export KUBERNETES_CONFORMANCE_TEST=true
$ cd go/src/k8s.io/kubernetes
$ go run hack/e2e.go --provider=openstack --test --test_args="--ginkgo.focus=\[Conformance\] --master-os-distro=ubuntu --node-os-distro=ubuntu" --check-version-skew=false
...
Ran 200 of 2118 Specs in 7676.703 seconds
SUCCESS! -- 200 Passed | 0 Failed | 0 Pending | 1918 Skipped PASS

Ginkgo ran 1 suite in 2h7m57.132843756s
Test Suite Passed
2018/12/18 20:56:42 process.go:155: Step './hack/ginkgo-e2e.sh --ginkgo.focus=\[Conformance\] --master-os-distro=ubuntu --node-os-distro=ubuntu' finished in 2h7m57.315797585s
2018/12/18 20:56:42 e2e.go:83: Done
oomichi commented 5 years ago
$ go run hack/e2e.go --provider=skeleton --test --test_args="--ginkgo.focus=should\screate\sand\sstop\sa\sreplication\scontroller --master-os-distro=ubuntu --node-os-distro=ubuntu" --check-version-skew=false

<snip>

~ Failure [312.939 seconds]
[sig-cli] Kubectl client
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/kubectl/framework.go:22
  [k8s.io] Update Demo
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:694
    should create and stop a replication controller  [Conformance] [It]
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:699

    Nov 13 23:22:20.185: Timed out after 300 seconds waiting for name=update-demo pods to reach valid state

    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/rc_util.go:292
oomichi commented 5 years ago

テストコード → これ自体はだいぶ前に Conformance になったみたい。よって、テストではなく環境が壊れている? test/e2e/kubectl/kubectl.go

 303                 framework.ConformanceIt("should create and stop a replication controller ", func() {
 304                         defer cleanupKubectlInputs(nautilus, ns, updateDemoSelector)
 305
 306                         By("creating a replication controller")
 307                         framework.RunKubectlOrDieInput(nautilus, "create", "-f", "-", fmt.Sprintf("--namespace=%v", ns))
 308                         framework.ValidateController(c, nautilusImage, 2, "update-demo", updateDemoSelector, getUDData("nautilus.jpg", ns), ns)
 309                 })

最後のValidateController で失敗している。

231 // ValidateController is a generic mechanism for testing RC's that are running.
232 // It takes a container name, a test name, and a validator function which is plugged in by a specific test.
233 // "containername": this is grepped for.
234 // "containerImage" : this is the name of the image we expect to be launched.  Not to confuse w/ images (kitten.jpg)  which are validated.
235 // "testname":  which gets bubbled up to the logging/failure messages if errors happen.
236 // "validator" function: This function is given a podID and a client, and it can do some specific validations that way.
237 func ValidateController(c clientset.Interface, containerImage string, replicas int, containername string, testname string, validator validatorFn, ns stri    ng) {
238         containerImage = trimDockerRegistry(containerImage)
239         getPodsTemplate := "--template={{range.items}}{{.metadata.name}} {{end}}"
240         // NB: kubectl adds the "exists" function to the standard template functions.
241         // This lets us check to see if the "running" entry exists for each of the containers
242         // we care about. Exists will never return an error and it's safe to check a chain of
243         // things, any one of which may not exist. In the below template, all of info,
244         // containername, and running might be nil, so the normal index function isn't very
245         // helpful.
246         // This template is unit-tested in kubectl, so if you change it, update the unit test.
247         // You can read about the syntax here: http://golang.org/pkg/text/template/.
248         getContainerStateTemplate := fmt.Sprintf(`--template={{if (exists . "status" "containerStatuses")}}{{range .status.containerStatuses}}{{if (and (    eq .name "%s") (exists . "state" "running"))}}true{{end}}{{end}}{{end}}`, containername)
249
250         getImageTemplate := fmt.Sprintf(`--template={{if (exists . "status" "containerStatuses")}}{{range .status.containerStatuses}}{{if eq .name "%s"}}    {{.image}}{{end}}{{end}}{{end}}`, containername)
251
252         By(fmt.Sprintf("waiting for all containers in %s pods to come up.", testname)) //testname should be selector
253 waitLoop:
254         for start := time.Now(); time.Since(start) < PodStartTimeout; time.Sleep(5 * time.Second) {
255                 getPodsOutput := RunKubectlOrDie("get", "pods", "-o", "template", getPodsTemplate, "-l", testname, fmt.Sprintf("--namespace=%v", ns))
256                 pods := strings.Fields(getPodsOutput)
257                 if numPods := len(pods); numPods != replicas {
258                         By(fmt.Sprintf("Replicas for %s: expected=%d actual=%d", testname, replicas, numPods))
259                         continue
260                 }
261                 var runningPods []string
262                 for _, podID := range pods {
263                         running := RunKubectlOrDie("get", "pods", podID, "-o", "template", getContainerStateTemplate, fmt.Sprintf("--namespace=%v", ns))
264                         if running != "true" {
265                                 Logf("%s is created but not running", podID)
266                                 continue waitLoop
267                         }
268
269                         currentImage := RunKubectlOrDie("get", "pods", podID, "-o", "template", getImageTemplate, fmt.Sprintf("--namespace=%v", ns))
270                         currentImage = trimDockerRegistry(currentImage)
271                         if currentImage != containerImage {
272                                 Logf("%s is created but running wrong image; expected: %s, actual: %s", podID, containerImage, currentImage)
273                                 continue waitLoop
274                         }
275
276                         // Call the generic validator function here.
277                         // This might validate for example, that (1) getting a url works and (2) url is serving correct content.
278                         if err := validator(c, podID); err != nil {
279                                 Logf("%s is running right image but validator function failed: %v", podID, err)
280                                 continue waitLoop
281                         }
282
283                         Logf("%s is verified up and running", podID)
284                         runningPods = append(runningPods, podID)
285                 }
286                 // If we reach here, then all our checks passed.
287                 if len(runningPods) == replicas {
288                         return
289                 }
290         }
291         // Reaching here means that one of more checks failed multiple times.  Assuming its not a race condition, something is broken.
292         Failf("Timed out after %v seconds waiting for %s pods to reach valid state", PodStartTimeout.Seconds(), testname)
293 }
oomichi commented 5 years ago

以下のようにimage 名が想定のものと異なるため、エラーとなった模様。 CPUアーキテクチャ名がイメージ名についている。 以前も似たようなことがあったなぁ。

Nov 14 02:13:01.926: INFO: update-demo-nautilus-rxzx5 is created but running wrong image;
expected: gcr.io/kubernetes-e2e-test-images/nautilus:1.0,
  actual: gcr.io/kubernetes-e2e-test-images/nautilus-amd64:1.0
oomichi commented 5 years ago

想定イメージ名 nautilusImage

  92         nautilusImage = imageutils.GetE2EImage(imageutils.Nautilus)

test/utils/image/manifest.go

Nautilus                 = ImageConfig{e2eRegistry, "nautilus", "1.0"}

GCP上はアーキテクチャ名のあり・無し両方のイメージがある。 capture

oomichi commented 5 years ago

手元で deployment を作成した場合は Image 名は指定したものになっている。

$ kubectl run test01 --image=gcr.io/kubernetes-e2e-test-images/nautilus:1.0
deployment.apps/test01 created
$ kubectl describe pod test01-69fc696fbf-7ncmn
Name:               test01-69fc696fbf-7ncmn
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               113-node01/192.168.1.103
Start Time:         Wed, 14 Nov 2018 02:41:40 +0000
Labels:             pod-template-hash=69fc696fbf
                    run=test01
Annotations:        <none>
Status:             Running
IP:                 10.244.1.108
Controlled By:      ReplicaSet/test01-69fc696fbf
Containers:
  test01:
    Container ID:   docker://ee5a65809f02b263e02d3dbdb84f3334470cbd85858c65795ca5c48409009b45
    Image:          gcr.io/kubernetes-e2e-test-images/nautilus:1.0
    Image ID:       docker://sha256:e0c9cf0dd72a7e2cafa14652c0152afcfaa773bb692e64c05f6ff5e34cecae33
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 14 Nov 2018 02:41:42 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wnjjs (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-wnjjs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-wnjjs
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                 Message
  ----    ------     ----  ----                 -------
  Normal  Scheduled  21s   default-scheduler    Successfully assigned default/test01-69fc696fbf-7ncmn to 113-node01
  Normal  Pulled     19s   kubelet, 113-node01  Container image "gcr.io/kubernetes-e2e-test-images/nautilus:1.0" already present on machine
  Normal  Created    19s   kubelet, 113-node01  Created container
  Normal  Started    19s   kubelet, 113-node01  Started container
oomichi commented 5 years ago

テスト上の pod 作成処理

nautilus = substituteImageName(string(generated.ReadOrDie(filepath.Join(updateDemoRoot, "nautilus-rc.yaml.in"))))
..
framework.RunKubectlOrDieInput(nautilus, "create", "-f", "-", fmt.Sprintf("--namespace=%v", ns))

test/e2e/framework/util.go

2292 // RunKubectlOrDieInput is a convenience wrapper over kubectlBuilder that takes input to stdin
2293 func RunKubectlOrDieInput(data string, args ...string) string {
2294         return NewKubectlCommand(args...).WithStdinData(data).ExecOrDie()
2295 }

test/fixtures/doc-yaml/user-guide/update-demo/nautilus-rc.yaml.in

  1 apiVersion: v1
  2 kind: ReplicationController
  3 metadata:
  4   name: update-demo-nautilus
  5 spec:
  6   replicas: 2
  7   selector:
  8     name: update-demo
  9     version: nautilus
 10   template:
 11     metadata:
 12       labels:
 13         name: update-demo
 14         version: nautilus
 15     spec:
 16       containers:
 17       - image: {{.NautilusImage}}
 18         name: update-demo
 19         ports:
 20         - containerPort: 80
 21           protocol: TCP
oomichi commented 5 years ago

上記のマニフェストでやってみる。 → 再現せず。rc上もpod上も image は amd64 無しで正しく表示されている。

$ cat hoge.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: update-demo-nautilus
spec:
  replicas: 2
  selector:
    name: update-demo
    version: nautilus
  template:
    metadata:
      labels:
        name: update-demo
        version: nautilus
    spec:
      containers:
      - image: gcr.io/kubernetes-e2e-test-images/nautilus:1.0
        name: update-demo
        ports:
        - containerPort: 80
          protocol: TCP
$ kubectl create -f hoge.yaml
replicationcontroller/update-demo-nautilus created
$ kubectl describe rc update-demo-nautilus
Name:         update-demo-nautilus
Namespace:    default
Selector:     name=update-demo,version=nautilus
Labels:       name=update-demo
              version=nautilus
Annotations:  <none>
Replicas:     2 current / 2 desired
Pods Status:  2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  name=update-demo
           version=nautilus
  Containers:
   update-demo:
    Image:        gcr.io/kubernetes-e2e-test-images/nautilus:1.0
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From                    Message
  ----    ------            ----  ----                    -------
  Normal  SuccessfulCreate  3m    replication-controller  Created pod: update-demo-nautilus-4bc4q
  Normal  SuccessfulCreate  3m    replication-controller  Created pod: update-demo-nautilus-7q62d
$ kubectl get pods
NAME                                             READY     STATUS    RESTARTS   AGE
nginx-64f497f8fd-wtpzf                           1/1       Running   0          7d18h
standalone-cinder-provisioner-7d6594d789-vtcbv   1/1       Running   0          33d
update-demo-nautilus-4bc4q                       1/1       Running   0          22s
update-demo-nautilus-7q62d                       1/1       Running   0          22s
$ kubectl describe pod update-demo-nautilus-4bc4q
Name:               update-demo-nautilus-4bc4q
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               113-node01/192.168.1.103
Start Time:         Wed, 14 Nov 2018 03:07:26 +0000
Labels:             name=update-demo
                    version=nautilus
Annotations:        <none>
Status:             Running
IP:                 10.244.1.109
Controlled By:      ReplicationController/update-demo-nautilus
Containers:
  update-demo:
    Container ID:   docker://f805e2beff7bbe4b61b7c40ca22e387a0491efe3df59fb22a0fdef293e5f8a28
    Image:          gcr.io/kubernetes-e2e-test-images/nautilus:1.0
    Image ID:       docker://sha256:e0c9cf0dd72a7e2cafa14652c0152afcfaa773bb692e64c05f6ff5e34cecae33
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 14 Nov 2018 03:07:28 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wnjjs (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-wnjjs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-wnjjs
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                 Message
  ----    ------     ----  ----                 -------
  Normal  Scheduled  35s   default-scheduler    Successfully assigned default/update-demo-nautilus-4bc4q to 113-node01
  Normal  Pulled     34s   kubelet, 113-node01  Container image "gcr.io/kubernetes-e2e-test-images/nautilus:1.0" already present on machine
  Normal  Created    34s   kubelet, 113-node01  Created container
  Normal  Started    33s   kubelet, 113-node01  Started container
$ kubectl describe pod update-demo-nautilus-7q62d
Name:               update-demo-nautilus-7q62d
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               113-node02/192.168.1.112
Start Time:         Wed, 14 Nov 2018 03:07:26 +0000
Labels:             name=update-demo
                    version=nautilus
Annotations:        <none>
Status:             Running
IP:                 10.244.2.23
Controlled By:      ReplicationController/update-demo-nautilus
Containers:
  update-demo:
    Container ID:   docker://bf5e5a13e6f4fd3816a06b20c7f4503f0c1b99cf21e2476eaebea7c764304230
    Image:          gcr.io/kubernetes-e2e-test-images/nautilus:1.0
    Image ID:       docker://sha256:e0c9cf0dd72a7e2cafa14652c0152afcfaa773bb692e64c05f6ff5e34cecae33
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 14 Nov 2018 03:07:28 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wnjjs (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-wnjjs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-wnjjs
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                 Message
  ----    ------     ----  ----                 -------
  Normal  Scheduled  1m    default-scheduler    Successfully assigned default/update-demo-nautilus-7q62d to 113-node02
  Normal  Pulled     1m    kubelet, 113-node02  Container image "gcr.io/kubernetes-e2e-test-images/nautilus:1.0" already present on machine
  Normal  Created    1m    kubelet, 113-node02  Created container
  Normal  Started    1m    kubelet, 113-node02  Started container
oomichi commented 5 years ago

テスト実施中の状態を取ってみた → Image名に amd64 は含まれないが、テストは何処を見ている!?

$ kubectl get rc -n e2e-tests-kubectl-zvthv
NAME                   DESIRED   CURRENT   READY     AGE
update-demo-nautilus   2         2         2         16s
ubuntu@e2e:~$ kubectl -n e2e-tests-kubectl-zvthv describe rc update-demo-nautilus
Name:         update-demo-nautilus
Namespace:    e2e-tests-kubectl-zvthv
Selector:     name=update-demo,version=nautilus
Labels:       name=update-demo
              version=nautilus
Annotations:  <none>
Replicas:     2 current / 2 desired
Pods Status:  2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  name=update-demo
           version=nautilus
  Containers:
   update-demo:
    Image:        gcr.io/kubernetes-e2e-test-images/nautilus:1.0
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From                    Message
  ----    ------            ----  ----                    -------
  Normal  SuccessfulCreate  30s   replication-controller  Created pod: update-demo-nautilus-rhkp4
  Normal  SuccessfulCreate  29s   replication-controller  Created pod: update-demo-nautilus-nz7jm
$
$ kubectl -n e2e-tests-kubectl-zvthv get pods
NAME                         READY     STATUS    RESTARTS   AGE
update-demo-nautilus-nz7jm   1/1       Running   0          36s
update-demo-nautilus-rhkp4   1/1       Running   0          36s
$
$ kubectl -n e2e-tests-kubectl-zvthv describe pod update-demo-nautilus-nz7jm
Name:               update-demo-nautilus-nz7jm
Namespace:          e2e-tests-kubectl-zvthv
Priority:           0
PriorityClassName:  <none>
Node:               113-node02/192.168.1.112
Start Time:         Wed, 14 Nov 2018 03:14:29 +0000
Labels:             name=update-demo
                    version=nautilus
Annotations:        <none>
Status:             Running
IP:                 10.244.2.24
Controlled By:      ReplicationController/update-demo-nautilus
Containers:
  update-demo:
    Container ID:   docker://a5ff2d030d91f967c16c9bd2165c402c0e18e267f9e4ab52e59529f9d16c143d
    Image:          gcr.io/kubernetes-e2e-test-images/nautilus:1.0
    Image ID:       docker://sha256:e0c9cf0dd72a7e2cafa14652c0152afcfaa773bb692e64c05f6ff5e34cecae33
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 14 Nov 2018 03:14:31 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-btlp2 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-btlp2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-btlp2
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                 Message
  ----    ------     ----  ----                 -------
  Normal  Scheduled  47s   default-scheduler    Successfully assigned e2e-tests-kubectl-zvthv/update-demo-nautilus-nz7jm to 113-node02
  Normal  Pulled     45s   kubelet, 113-node02  Container image "gcr.io/kubernetes-e2e-test-images/nautilus:1.0" already present on machine
  Normal  Created    45s   kubelet, 113-node02  Created container
  Normal  Started    45s   kubelet, 113-node02  Started container
$
$ kubectl -n e2e-tests-kubectl-zvthv describe pod update-demo-nautilus-rhkp4
Name:               update-demo-nautilus-rhkp4
Namespace:          e2e-tests-kubectl-zvthv
Priority:           0
PriorityClassName:  <none>
Node:               113-node01/192.168.1.103
Start Time:         Wed, 14 Nov 2018 03:14:29 +0000
Labels:             name=update-demo
                    version=nautilus
Annotations:        <none>
Status:             Running
IP:                 10.244.1.110
Controlled By:      ReplicationController/update-demo-nautilus
Containers:
  update-demo:
    Container ID:   docker://6722659e58ccc522f8a01ce064628e15b05c751c61e2c42d74f7d3474f16645f
    Image:          gcr.io/kubernetes-e2e-test-images/nautilus:1.0
    Image ID:       docker://sha256:e0c9cf0dd72a7e2cafa14652c0152afcfaa773bb692e64c05f6ff5e34cecae33
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 14 Nov 2018 03:14:31 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-btlp2 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-btlp2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-btlp2
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                 Message
  ----    ------     ----  ----                 -------
  Normal  Scheduled  59s   default-scheduler    Successfully assigned e2e-tests-kubectl-zvthv/update-demo-nautilus-rhkp4 to 113-node01
  Normal  Pulled     57s   kubelet, 113-node01  Container image "gcr.io/kubernetes-e2e-test-images/nautilus:1.0" already present on machine
  Normal  Created    57s   kubelet, 113-node01  Created container
  Normal  Started    57s   kubelet, 113-node01  Started container
oomichi commented 5 years ago

Podのイメージ名取得処理が特殊? 以下のようなコマンドでとっている模様

kubectl get pods <pod-id> -o --template={{if (exists . "status" "containerStatuses")}}{{range .status.containerStatuses}}{{if eq .name "update-demo"}}{{.image}}{{end}}{{end}}{{end}} -n <namespace>

templateオプションはgo の標準的なものを使っているみたい

--template='': Template string or path to template file to use when -o=go-template, -o=go-template-file.
The template format is golang templates [http://golang.org/pkg/text/template/#pkg-overview].
oomichi commented 5 years ago

新たに e2e テスト用の VM を作って実行してみたが、再現してしまった。 たぶん、k8s クラスタ側の問題だと思う。 根本原因を dig ったほうが良さそう。

oomichi commented 5 years ago

本問題の予想再現シナリオ

  1. 旧バージョンで e2e conformance test を実行
  2. バージョンアップを実施
  3. 新バージョンで e2e conformance test を実行
oomichi commented 5 years ago

v.1.12.x から v.13.1 でも再現するか試してみる。

# apt-get install -y kubelet=1.12.4-00 kubeadm=1.12.4-00 kubectl=1.12.4-00 kubernetes-cni
  1. v1.12.4 での e2e テスト 188 項目、全て通った。1時間53分
  2. アップグレードを実施 https://github.com/oomichi/try-kubernetes/issues/67 完了
  3. v1.13.1 での e2e テストを実施、問題なし。1時間57分