oomichi / try-kubernetes

[sig-storage] CSI Volumes CSI plugin test using CSI driver: hostPath #53

Closed oomichi closed 5 years ago

oomichi commented 5 years ago

Summary

STEP: deploying csi hostpath driver
STEP: Creating a CSI service account for hostpath
STEP: Binding cluster roles [system:csi-external-attacher system:csi-external-provisioner csi-driver-registrar] to the CSI service account csi-hostpath-service-account
Sep 28 23:59:40.743: INFO: Deleting pod "csi-pod" in namespace "e2e-tests-csi-mock-plugin-tbmdn"
[It] should provision storage
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/csi_volumes.go:96
STEP: creating a StorageClass e2e-tests-csi-mock-plugin-tbmdn-sc
STEP: creating a claim
Sep 28 23:59:50.859: INFO: Waiting up to 5m0s for PersistentVolumeClaim pvc-q84nb to have phase Bound
Sep 28 23:59:50.861: INFO: PersistentVolumeClaim pvc-q84nb found but phase is Pending instead of Bound.
Sep 28 23:59:52.863: INFO: PersistentVolumeClaim pvc-q84nb found but phase is Pending instead of Bound.
Sep 28 23:59:54.865: INFO: PersistentVolumeClaim pvc-q84nb found but phase is Pending instead of Bound.
Sep 28 23:59:56.868: INFO: PersistentVolumeClaim pvc-q84nb found but phase is Pending instead of Bound.
Sep 28 23:59:58.870: INFO: PersistentVolumeClaim pvc-q84nb found but phase is Pending instead of Bound.
...
Sep 29 00:04:49.731: INFO: PersistentVolumeClaim pvc-q84nb found but phase is Pending instead of Bound.
Sep 29 00:04:51.731: INFO: deleting claim "e2e-tests-csi-mock-plugin-tbmdn"/"pvc-q84nb"
Sep 29 00:04:51.735: INFO: deleting storage class e2e-tests-csi-mock-plugin-tbmdn-sc
[AfterEach] CSI plugin test using CSI driver: hostPath
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/csi_volumes.go:92
...
STEP: Destroying namespace "e2e-tests-csi-mock-plugin-tbmdn" for this suite.
Sep 29 00:05:03.920: INFO: Waiting up to 30s for server preferred namespaced resources to be successfully discovered
Sep 29 00:05:03.981: INFO: namespace: e2e-tests-csi-mock-plugin-tbmdn, resource: bindings, ignored listing per whitelist
Sep 29 00:05:03.983: INFO: namespace e2e-tests-csi-mock-plugin-tbmdn deletion completed in 6.074559175s

• Failure [331.436 seconds]
[sig-storage] CSI Volumes
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/utils/framework.go:22
  CSI plugin test using CSI driver: hostPath
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/csi_volumes.go:82
    should provision storage [It]
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/csi_volumes.go:96

    Expected error:
        <*errors.errorString | 0xc4204a7e80>: {
            s: "PersistentVolumeClaim pvc-q84nb not in phase Bound within 5m0s",
        }
        PersistentVolumeClaim pvc-q84nb not in phase Bound within 5m0s
    not to have occurred

    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/volume_provisioning.go:92
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSep 29 00:05:03.984: INFO: Running AfterSuite actions on all node
Sep 29 00:05:03.984: INFO: Running AfterSuite actions on node 1

Summarizing 1 Failure:

[Fail] [sig-storage] CSI Volumes CSI plugin test using CSI driver: hostPath [It] should provision storage
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/volume_provisioning.go:92

Ran 1 of 998 Specs in 337.721 seconds
FAIL! -- 0 Passed | 1 Failed | 0 Pending | 997 Skipped --- FAIL: TestE2E (337.75s)
oomichi commented 5 years ago
test/e2e/storage/csi_volumes.go

 96                         It("should provision storage", func() {
 97                                 t := driver.createStorageClassTest(node)
 98                                 claim := newClaim(t, ns.GetName(), "")
 99                                 class := newStorageClass(t, ns.GetName(), "")
100                                 claim.Spec.StorageClassName = &class.ObjectMeta.Name
101                                 testDynamicProvisioning(t, cs, claim, class)
102                         })

test/e2e/storage/volume_provisioning.go

  68 func testDynamicProvisioning(t storageClassTest, client clientset.Interface, claim *v1.PersistentVolumeClaim, class *storage.StorageClass) *v1.PersistentVolume {
  69         var err error
  70         if class != nil {
  71                 By("creating a StorageClass " + class.Name)
  72                 class, err = client.StorageV1().StorageClasses().Create(class)
  73                 Expect(err).NotTo(HaveOccurred())
  74                 defer func() {
  75                         framework.Logf("deleting storage class %s", class.Name)
  76                         framework.ExpectNoError(client.StorageV1().StorageClasses().Delete(class.Name, nil))
  77                 }()
  78         }
  79
  80         By("creating a claim")
  81         claim, err = client.CoreV1().PersistentVolumeClaims(claim.Namespace).Create(claim)
  82         Expect(err).NotTo(HaveOccurred())
  83         defer func() {
  84                 framework.Logf("deleting claim %q/%q", claim.Namespace, claim.Name)
  85                 // typically this claim has already been deleted
  86                 err = client.CoreV1().PersistentVolumeClaims(claim.Namespace).Delete(claim.Name, nil)
  87                 if err != nil && !apierrs.IsNotFound(err) {
  88                         framework.Failf("Error deleting claim %q. Error: %v", claim.Name, err)
  89                 }
  90         }()
  91         err = framework.WaitForPersistentVolumeClaimPhase(v1.ClaimBound, client, claim.Namespace, claim.Name, framework.Poll, framework.ClaimProvisionTimeout)
← the timeout hits here: framework.ClaimProvisionTimeout is the 5m0s seen in the log above, and the failure is reported by the check at volume_provisioning.go:92, just after this excerpt.
oomichi commented 5 years ago

State of the PVC

$ kubectl describe pvc pvc-kpvng -n e2e-tests-csi-mock-plugin-f8756
Name:          pvc-kpvng
Namespace:     e2e-tests-csi-mock-plugin-f8756
StorageClass:  e2e-tests-csi-mock-plugin-f8756-sc
Status:        Pending
Volume:
Labels:        <none>
Annotations:   volume.beta.kubernetes.io/storage-provisioner=csi-hostpath
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
Events:
  Type    Reason                Age                From                         Message
  ----    ------                ----               ----                         -------
  Normal  ExternalProvisioning  10s (x3 over 20s)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "csi-hostpath" or manually created by system administrator

State of the StorageClass

$ kubectl describe storageclass e2e-tests-csi-mock-plugin-f8756-sc
Name:                  e2e-tests-csi-mock-plugin-f8756-sc
IsDefaultClass:        No
Annotations:           <none>
Provisioner:           csi-hostpath
Parameters:            <none>
AllowVolumeExpansion:  <unset>
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>
oomichi commented 5 years ago

Get an understanding of what the test is meant to do: test/e2e/storage/csi_volumes.go

oomichi commented 5 years ago

Judging from the logs, the cause is that the provisioner "csi-hostpath" specified in the StorageClass does not exist in the environment. In the Cinder Standalone case, the StorageClass specified

provisioner: openstack.org/standalone-cinder

a string hard-coded in the OpenStack external cloud-provider. So where does "csi-hostpath" come from? test/e2e/storage/csi_objects.go contains what looks like the corresponding external-provisioner.

194 func csiHostPathPod(
195         client clientset.Interface,
196         config framework.VolumeTestConfig,
197         teardown bool,
198         f *framework.Framework,
199         sa *v1.ServiceAccount,
200 ) *v1.Pod {
201         podClient := client.CoreV1().Pods(config.Namespace)
202
203         priv := true
204         mountPropagation := v1.MountPropagationBidirectional
205         hostPathType := v1.HostPathDirectoryOrCreate
206         pod := &v1.Pod{
207                 ObjectMeta: metav1.ObjectMeta{
208                         Name:      config.Prefix + "-pod",
209                         Namespace: config.Namespace,
210                         Labels: map[string]string{
211                                 "app": "hostpath-driver",
212                         },
213                 },
214                 Spec: v1.PodSpec{
215                         ServiceAccountName: sa.GetName(),
216                         NodeName:           config.ServerNodeName,
217                         RestartPolicy:      v1.RestartPolicyNever,
218                         Containers: []v1.Container{
219                                 {
220                                         Name:            "external-provisioner",
221                                         Image:           csiContainerImage("csi-provisioner"),
222                                         ImagePullPolicy: v1.PullAlways,
223                                         Args: []string{
224                                                 "--v=5",
225                                                 "--provisioner=csi-hostpath",
226                                                 "--csi-address=/csi/csi.sock",
227                                         },

This csiHostPathPod function is called from createCSIDriver in e2e/storage/csi_volumes.go, and createCSIDriver runs before each test in BeforeEach.

 87                         BeforeEach(func() {
 88                                 driver = curInitCSIDriver(f, config)
 89                                 driver.createCSIDriver()
 90                         })

Does that mean the creation of this Pod is failing?

oomichi commented 5 years ago

csiHostPathPod is built rather crudely: it first unconditionally deletes the CSI driver Pod (in case one was left over) and only then creates a new one. The deletion leaves the log line below (★), but there is no log at all for the creation. Adding one would be a good first step; a sketch follows the log excerpt.

STEP: Creating the CSI driver registrar cluster role
[BeforeEach] CSI plugin test using CSI driver: hostPath
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/csi_volumes.go:87
STEP: deploying csi hostpath driver
STEP: Creating a CSI service account for hostpath
STEP: Binding cluster roles [system:csi-external-attacher system:csi-external-provisioner csi-driver-registrar] to the CSI service account csi-hostpath-service-account
Sep 27 21:43:54.259: INFO: Deleting pod "csi-pod" in namespace "e2e-tests-csi-mock-plugin-8pfph"
★ the deletion log noted above
[It] should provision storage
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/csi_volumes.go:96
STEP: creating a StorageClass e2e-tests-csi-mock-plugin-8pfph-sc
STEP: creating a claim
Sep 27 21:44:32.316: INFO: Waiting up to 5m0s for PersistentVolumeClaim pvc-28m5j to have phase Bound
Sep 27 21:44:32.338: INFO: PersistentVolumeClaim pvc-28m5j found but phase is Pending instead of Bound.
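
A minimal sketch of such a log line (hypothetical, not the actual code; it assumes the creation in csiHostPathPod goes through podClient.Create right after the deletion step):

// Log the creation attempt the same way the deletion is logged,
// so a failed create is visible in the e2e output.
framework.Logf("Creating pod %q in namespace %q", pod.GetName(), config.Namespace)
if _, err := podClient.Create(pod); err != nil {
        framework.Failf("Failed to create pod %q: %v", pod.GetName(), err)
}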
oomichi commented 5 years ago
$ kubectl get pods -n e2e-tests-csi-mock-plugin-m2q7c
NAME      READY     STATUS    RESTARTS   AGE
csi-pod   1/4       Error     0          1m
$ kubectl describe pod csi-pod -n e2e-tests-csi-mock-plugin-m2q7c
Name:               csi-pod
Namespace:          e2e-tests-csi-mock-plugin-m2q7c
Priority:           0
PriorityClassName:  <none>
Node:               113-node01/192.168.1.103
Start Time:         Tue, 02 Oct 2018 18:45:05 +0000
Labels:             app=hostpath-driver
Annotations:        <none>
Status:             Running
IP:                 10.244.1.198
Containers:
  external-provisioner:
    Container ID:  docker://a26d1d0aabfe485e3c1308f2e5205b2dae1537c0fba09345ae48bb028b787f7c
    Image:         quay.io/k8scsi/csi-provisioner:v0.2.1
    Image ID:      docker://sha256:9af643765c4918cb3659f1048c9d0692f40a2f79a3c3ec8a97d99c3df119d591
    Port:          <none>
    Host Port:     <none>
    Args:
      --v=5
      --provisioner=csi-hostpath
      --csi-address=/csi/csi.sock
    State:          Running
      Started:      Tue, 02 Oct 2018 18:45:09 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /csi from socket-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from csi-hostpath-service-account-token-v5gvm (ro)
  driver-registrar:
    Container ID:  docker://985555b62478fc2e5543d219038dad02f1b4b9d4b218285deb74f0f056a46d3d
    Image:         quay.io/k8scsi/driver-registrar:v0.2.0
    Image ID:      docker://sha256:6d377cda6069c60d3371db9a2fbe9f320f72be30146297d87fda47ecb50c20b6
    Port:          <none>
    Host Port:     <none>
    Args:
      --v=5
      --csi-address=/csi/csi.sock
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 02 Oct 2018 18:45:10 +0000
      Finished:     Tue, 02 Oct 2018 18:46:10 +0000
    Ready:          False
    Restart Count:  0
    Environment:
      KUBE_NODE_NAME:   (v1:spec.nodeName)
    Mounts:
      /csi from socket-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from csi-hostpath-service-account-token-v5gvm (ro)
  external-attacher:
    Container ID:  docker://66ac54a64c44484a6a8e0735af55a62b4cae013cd61ddc0ffe008efb9bc9177c
    Image:         quay.io/k8scsi/csi-attacher:v0.2.0
    Image ID:      docker://sha256:b185a6b14ad40fad089ba85fa691fbca6e3cbf543c5a0f151f94857a1050c352
    Port:          <none>
    Host Port:     <none>
    Args:
      --v=5
      --csi-address=$(ADDRESS)
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 02 Oct 2018 18:45:11 +0000
      Finished:     Tue, 02 Oct 2018 18:46:11 +0000
    Ready:          False
    Restart Count:  0
    Environment:
      ADDRESS:  /csi/csi.sock
    Mounts:
      /csi from socket-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from csi-hostpath-service-account-token-v5gvm (ro)
  hostpath-driver:
    Container ID:  docker://894255e702a4b837ff878c8e0b6eeb64b600ed1ad442c60ee3f736e838e85629
    Image:         quay.io/k8scsi/hostpathplugin:v0.2.0
    Image ID:      docker://sha256:5e3cc25175a780dc7c24f2cff404ceaa7bdb8ad62db6b81556a6d203a0f787d3
    Port:          <none>
    Host Port:     <none>
    Args:
      --v=5
      --endpoint=$(CSI_ENDPOINT)
      --nodeid=$(KUBE_NODE_NAME)
    State:          Terminated
      Reason:       ContainerCannotRun
      Message:      linux mounts: Path /var/lib/kubelet/pods is mounted on / but it is not a shared mount.
      Exit Code:    128
      Started:      Tue, 02 Oct 2018 18:45:12 +0000
      Finished:     Tue, 02 Oct 2018 18:45:12 +0000
    Ready:          False
    Restart Count:  0
    Environment:
      CSI_ENDPOINT:    unix:///csi/csi.sock
      KUBE_NODE_NAME:   (v1:spec.nodeName)
    Mounts:
      /csi from socket-dir (rw)
      /var/lib/kubelet/pods from mountpoint-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from csi-hostpath-service-account-token-v5gvm (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  socket-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins/csi-hostpath
    HostPathType:  DirectoryOrCreate
  mountpoint-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/pods
    HostPathType:  DirectoryOrCreate
  csi-hostpath-service-account-token-v5gvm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  csi-hostpath-service-account-token-v5gvm
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason   Age   From                 Message
  ----     ------   ----  ----                 -------
  Normal   Pulling  2m    kubelet, 113-node01  pulling image "quay.io/k8scsi/csi-provisioner:v0.2.1"
  Normal   Pulled   2m    kubelet, 113-node01  Successfully pulled image "quay.io/k8scsi/csi-provisioner:v0.2.1"
  Normal   Created  2m    kubelet, 113-node01  Created container
  Normal   Started  2m    kubelet, 113-node01  Started container
  Normal   Pulling  2m    kubelet, 113-node01  pulling image "quay.io/k8scsi/driver-registrar:v0.2.0"
  Normal   Pulled   2m    kubelet, 113-node01  Successfully pulled image "quay.io/k8scsi/driver-registrar:v0.2.0"
  Normal   Created  2m    kubelet, 113-node01  Created container
  Normal   Started  2m    kubelet, 113-node01  Started container
  Normal   Pulling  2m    kubelet, 113-node01  pulling image "quay.io/k8scsi/csi-attacher:v0.2.0"
  Normal   Pulled   2m    kubelet, 113-node01  Successfully pulled image "quay.io/k8scsi/csi-attacher:v0.2.0"
  Normal   Created  2m    kubelet, 113-node01  Created container
  Normal   Started  2m    kubelet, 113-node01  Started container
  Normal   Pulling  2m    kubelet, 113-node01  pulling image "quay.io/k8scsi/hostpathplugin:v0.2.0"
  Normal   Pulled   2m    kubelet, 113-node01  Successfully pulled image "quay.io/k8scsi/hostpathplugin:v0.2.0"
  Normal   Created  2m    kubelet, 113-node01  Created container
  Warning  Failed   2m    kubelet, 113-node01  Error: failed to start container "hostpath-driver": Error response from daemon: linux mounts: Path /var/lib/kubelet/pods is mounted on / but it is not a shared mount.
oomichi commented 5 years ago

Error: failed to start container "hostpath-driver": Error response from daemon: linux mounts: Path /var/lib/kubelet/pods is mounted on / but it is not a shared mount.

I ran the following on 113-node01 (the only worker node):

$ sudo mount -o bind /var/lib/kubelet /var/lib/kubelet
$ sudo mount --make-shared /var/lib/kubelet

The error shown by kubectl describe pod changed as follows:

  Warning  Failed   1m    kubelet, 113-node01  Error: failed to start container "hostpath-driver": Error response from daemon: linux mounts: Path /var/lib/kubelet/pods is mounted on /var/lib/kubelet but it is not a shared mount.

The final --make-shared does not seem to be taking effect. The likely reason is that dockerd itself runs with MountFlags=slave (see below), so its mount namespace only receives host mounts as slaves, and the containers it starts never see /var/lib/kubelet as a shared mount regardless of what is done on the host afterwards.
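
One way to confirm what the host mount namespace reports (on a correctly configured node I would expect PROPAGATION to come back as shared):

$ findmnt -o TARGET,PROPAGATION /var/lib/kubelet
TARGET           PROPAGATION
/var/lib/kubelet shared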

/etc/systemd/system/multi-user.target.wants/docker.service

[Service] 
- MountFlags=slave
+ MountFlags=shared

Rebooted the system to apply the change. After that, the test passed.
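
For the record, a full reboot is probably not strictly required: reloading the systemd units and restarting Docker should apply the new MountFlags as well.

$ sudo systemctl daemon-reload
$ sudo systemctl restart docker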