kubernetes-csi / external-provisioner

Sidecar container that watches Kubernetes PersistentVolumeClaim objects and triggers CreateVolume/DeleteVolume against a CSI endpoint
Apache License 2.0

Pod is not created under selected zone of Volume #150

Closed leakingtapan closed 5 years ago

leakingtapan commented 5 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug /kind feature

What happened: I am testing dynamic provisioning with the EBS CSI driver with delayed binding. Most of the time the pod is created in the same zone as the volume. On one occasion, however, pod creation failed because the pod was scheduled in a different zone from the volume's zone.

What you expected to happen: Volume and pod should always be created under the same topology domain with volume scheduling enabled.
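As a rough illustration of that expectation, here is a minimal sketch of the accessibility rule: a node can use a volume if its labels match every segment of at least one of the volume's accessible topology terms. The function name, the dict-based representation, and the node labels in the usage note are assumptions made for illustration; this is not the scheduler's or the provisioner's actual code.

```python
def node_can_access(accessible_topology, node_labels):
    """Return True if the node satisfies at least one topology term of the volume.

    accessible_topology: list of dicts, one per term, mapping segment key -> value.
    node_labels: dict of the node's topology labels (hypothetical input shape).
    An empty list is treated as "no topology constraint".
    """
    if not accessible_topology:
        return True
    return any(
        # Every segment in a term must match the node's label for that key.
        all(node_labels.get(key) == value for key, value in segments.items())
        for segments in accessible_topology
    )
```

For example, with a volume whose accessible topology is [{"com.amazon.aws.csi.ebs/zone": "us-east-1a"}], a node labelled with zone us-east-1a passes the check, while a node labelled with any other zone (an assumed value, since the node's zone is not shown in this report) does not.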

How to reproduce it (as minimally and precisely as possible): Non-deterministic so far

Anything else we need to know?: Provisioner log:

I1015 20:30:34.630580       1 controller.go:991] provision "default/late-claim" class "late-sc": started
I1015 20:30:34.643414       1 controller.go:121] GRPC call: /csi.v0.Identity/GetPluginCapabilities
I1015 20:30:34.643430       1 controller.go:122] GRPC request:
I1015 20:30:34.643634       1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"late-claim", UID:"1d44c823-d0b9-11e8-81f1-0a75e9a76798", APIVersion:"v1", ResourceVersion:"1694", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/late-claim"
I1015 20:30:34.644379       1 controller.go:124] GRPC response: capabilities:<service:<type:CONTROLLER_SERVICE > > capabilities:<service:<type:ACCESSIBILITY_CONSTRAINTS > >
I1015 20:30:34.644443       1 controller.go:125] GRPC error: <nil>
I1015 20:30:34.644453       1 controller.go:121] GRPC call: /csi.v0.Controller/ControllerGetCapabilities
I1015 20:30:34.644459       1 controller.go:122] GRPC request:
I1015 20:30:34.645083       1 controller.go:124] GRPC response: capabilities:<rpc:<type:CREATE_DELETE_VOLUME > > capabilities:<rpc:<type:PUBLISH_UNPUBLISH_VOLUME > >
I1015 20:30:34.645139       1 controller.go:125] GRPC error: <nil>
I1015 20:30:34.645151       1 controller.go:121] GRPC call: /csi.v0.Identity/GetPluginInfo
I1015 20:30:34.645190       1 controller.go:122] GRPC request:
I1015 20:30:34.645621       1 controller.go:124] GRPC response: name:"com.amazon.aws.csi.ebs" vendor_version:"0.0.1"
I1015 20:30:34.645658       1 controller.go:125] GRPC error: <nil>
I1015 20:30:34.661737       1 controller.go:428] CreateVolumeRequest {Name:pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798 CapacityRange:required_bytes:4294967296  VolumeCapabilities:[mount:<> access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[] ControllerCreateSecrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1a" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1b" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1c" > >  XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1015 20:30:34.661862       1 controller.go:121] GRPC call: /csi.v0.Controller/CreateVolume
I1015 20:30:34.661868       1 controller.go:122] GRPC request: name:"pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" capacity_range:<required_bytes:4294967296 > volume_capabilities:<mount:<> access_mode:<mode:SINGLE_NODE_WRITER > > accessibility_requirements:<requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1a" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1b" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1c" > > >
I1015 20:30:34.841760       1 leaderelection.go:227] successfully renewed lease default/com.amazon.aws.csi.ebs
I1015 20:30:35.114422       1 controller.go:124] GRPC response: volume:<capacity_bytes:4294967296 id:"vol-0c696d140008a61a8" accessible_topology:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1a" > > >
I1015 20:30:35.114527       1 controller.go:125] GRPC error: <nil>
I1015 20:30:35.114540       1 controller.go:484] create volume rep: {CapacityBytes:4294967296 Id:vol-0c696d140008a61a8 Attributes:map[] ContentSource:<nil> AccessibleTopology:[segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1a" > ] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1015 20:30:35.114631       1 controller.go:546] successfully created PV {GCEPersistentDisk:nil AWSElasticBlockStore:nil HostPath:nil Glusterfs:nil NFS:nil RBD:nil ISCSI:nil Cinder:nil CephFS:nil FC:nil Flocker:nil FlexVolume:nil AzureFile:nil VsphereVolume:nil Quobyte:nil AzureDisk:nil PhotonPersistentDisk:nil PortworxVolume:nil ScaleIO:nil Local:nil StorageOS:nil CSI:&CSIPersistentVolumeSource{Driver:com.amazon.aws.csi.ebs,VolumeHandle:vol-0c696d140008a61a8,ReadOnly:false,FSType:ext4,VolumeAttributes:map[string]string{storage.kubernetes.io/csiProvisionerIdentity: 1539635296092-8081-com.amazon.aws.csi.ebs,},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,}}
I1015 20:30:35.114740       1 controller.go:1091] provision "default/late-claim" class "late-sc": volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" provisioned
I1015 20:30:35.114760       1 controller.go:1105] provision "default/late-claim" class "late-sc": trying to save persistentvvolume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798"
I1015 20:30:35.134870       1 controller.go:1112] provision "default/late-claim" class "late-sc": persistentvolume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" saved
I1015 20:30:35.134930       1 controller.go:1153] provision "default/late-claim" class "late-sc": succeeded
I1015 20:30:35.135246       1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"late-claim", UID:"1d44c823-d0b9-11e8-81f1-0a75e9a76798", APIVersion:"v1", ResourceVersion:"1694", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798

EBS Driver Log:

I1015 20:28:16.294223       1 driver.go:52] Driver: com.amazon.aws.csi.ebs
I1015 20:28:16.294360       1 mount_linux.go:199] Detected OS without systemd
I1015 20:28:16.294928       1 driver.go:107] Listening for connections on address: &net.UnixAddr{Name:"/var/lib/csi/sockets/pluginproxy/csi.sock", Net:"unix"}
I1015 20:30:34.644708       1 controller.go:175] ControllerGetCapabilities: called with args &csi.ControllerGetCapabilitiesRequest{XXX_NoUnkeyedLiteral:struct {}{}, XXX_unrecognized:[]uint8(nil), XXX_sizecache:0}
I1015 20:30:34.662445       1 controller.go:31] CreateVolume: called with args &csi.CreateVolumeRequest{Name:"pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798", CapacityRange:(*csi.CapacityRange)(0xc0001ab560), VolumeCapabilities:[]*csi.VolumeCapability{(*csi.VolumeCapability)(0xc0001b0b80)}, Parameters:map[string]string(nil), ControllerCreateSecrets:map[string]string(nil), VolumeContentSource:(*csi.VolumeContentSource)(nil), AccessibilityRequirements:(*csi.TopologyRequirement)(0xc0001d78b0), XXX_NoUnkeyedLiteral:struct {}{}, XXX_unrecognized:[]uint8(nil), XXX_sizecache:0}

POD event:

Name:               app
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               ip-172-20-127-156.ec2.internal/172.20.127.156
Start Time:         Mon, 15 Oct 2018 13:30:35 -0700
Labels:             <none>
Annotations:        kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container app
Status:             Pending
IP:
Containers:
  app:
    Container ID:
    Image:         centos
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      while true; do echo $(date -u) >> /data/out.txt; sleep 5; done
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        100m
    Environment:  <none>
    Mounts:
      /data from persistent-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rw8jc (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  persistent-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  late-claim
    ReadOnly:   false
  default-token-rw8jc:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-rw8jc
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason              Age   From                     Message
  ----     ------              ----  ----                     -------
  Normal   Scheduled           18s   default-scheduler        Successfully assigned default/app to ip-172-20-127-156.ec2.internal
  Warning  FailedAttachVolume  17s   attachdetach-controller  AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
           status code: 400, request id: 33c4a9cc-37d1-4e78-b37b-c9df81f659f9
  Warning  FailedAttachVolume  17s  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
           status code: 400, request id: 8baabab8-e0e1-4063-9107-ea86cb7c9fda
  Warning  FailedAttachVolume  16s  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
           status code: 400, request id: fff0faa0-df0e-4ad8-af27-0483267b09f7
  Warning  FailedAttachVolume  14s  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
           status code: 400, request id: 2b03cea9-1ccb-4f65-91f8-bca33dab29f1
  Warning  FailedAttachVolume  10s  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
           status code: 400, request id: 8b1129ab-1289-493a-a02b-981aa9d9478f
  Warning  FailedAttachVolume  2s  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-1d44c823-d0b9-11e8-81f1-0a75e9a76798" : rpc error: code = Internal desc = Could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": could not attach volume "vol-0c696d140008a61a8" to node "i-0bf114cd21779ff49": InvalidVolume.ZoneMismatch: The volume 'vol-0c696d140008a61a8' is not in the same availability zone as instance 'i-0bf114cd21779ff49'
           status code: 400, request id: 3a1f317d-8240-4f16-99ff-12982b1d673c
>> cat late-bind-sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: late-sc
provisioner: com.amazon.aws.csi.ebs
volumeBindingMode: WaitForFirstConsumer
>> cat late-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: late-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: late-sc
  resources:
    requests:
      storage: 4Gi
>> cat pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo $(date -u) >> /data/out.txt; sleep 5; done"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /data
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: late-claim

Environment:

leakingtapan commented 5 years ago

/cc msau42 /assign verult

I have seen this only once so far, and I am wondering how the preference is set in the topology requirement.

k8s-ci-robot commented 5 years ago

@leakingtapan: GitHub didn't allow me to assign the following users: verult.

Note that only kubernetes-csi members and repo collaborators can be assigned. For more information please see the contributor guide

msau42 commented 5 years ago

/assign @verult

It looks like topology preferences are not set even though WaitForFirstConsumer is enabled in the StorageClass.

@leakingtapan can you also list out the feature gates you set on Kubernetes and csi-external-provisioner?

k8s-ci-robot commented 5 years ago

@msau42: GitHub didn't allow me to assign the following users: verult.

Note that only kubernetes-csi members and repo collaborators can be assigned. For more information please see the contributor guide

msau42 commented 5 years ago

cc @verult @ddebroy

verult commented 5 years ago

ACK will take a look

verult commented 5 years ago

@leakingtapan in the reproductions where provisioning is successful, do you see a Preferred field in the CreateVolumeRequest() log (should be right after requisite)?
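One way to answer that question without reading the log by eye is a small parser for the CreateVolumeRequest line. This is a minimal sketch that assumes the text format shown in the provisioner logs in this thread and only handles terms with a single segment; the function names and the regular expression are illustrative, not part of external-provisioner.

```python
import re

# Matches one single-segment topology term in the provisioner's log text, e.g.
#   requisite:<segments:<key:"example/zone" value:"zone-a" > >
TERM_RE = re.compile(
    r'(requisite|preferred):<segments:<key:"([^"]+)" value:"([^"]+)"\s*>\s*>'
)


def topology_terms(log_line):
    """Return the (key, value) pairs found per term kind, in log order."""
    terms = {"requisite": [], "preferred": []}
    for kind, key, value in TERM_RE.findall(log_line):
        terms[kind].append((key, value))
    return terms


def has_preferred(log_line):
    """True if the logged request carries at least one preferred term."""
    return bool(topology_terms(log_line)["preferred"])
```

Running has_preferred over each CreateVolumeRequest line from the provisioner log shows quickly which requests were sent with only requisite terms.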

leakingtapan commented 5 years ago

Here is the log where it succeeds:

I1015 21:36:01.390494       1 controller.go:991] provision "default/late-claim" class "late-sc": started
I1015 21:36:01.405928       1 controller.go:121] GRPC call: /csi.v0.Identity/GetPluginCapabilities
I1015 21:36:01.405941       1 controller.go:122] GRPC request:
I1015 21:36:01.406243       1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"late-claim", UID:"4a34733f-d0c2-11e8-81f1-0a75e9a76798", APIVersion:"v1", ResourceVersion:"19029", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/late-claim"
I1015 21:36:01.406605       1 controller.go:124] GRPC response: capabilities:<service:<type:CONTROLLER_SERVICE > > capabilities:<service:<type:ACCESSIBILITY_CONSTRAINTS > >
I1015 21:36:01.406643       1 controller.go:125] GRPC error: <nil>
I1015 21:36:01.406667       1 controller.go:121] GRPC call: /csi.v0.Controller/ControllerGetCapabilities
I1015 21:36:01.406674       1 controller.go:122] GRPC request:
I1015 21:36:01.407000       1 controller.go:124] GRPC response: capabilities:<rpc:<type:CREATE_DELETE_VOLUME > > capabilities:<rpc:<type:PUBLISH_UNPUBLISH_VOLUME > >
I1015 21:36:01.407033       1 controller.go:125] GRPC error: <nil>
I1015 21:36:01.407045       1 controller.go:121] GRPC call: /csi.v0.Identity/GetPluginInfo
I1015 21:36:01.407050       1 controller.go:122] GRPC request:
I1015 21:36:01.407312       1 controller.go:124] GRPC response: name:"com.amazon.aws.csi.ebs" vendor_version:"0.0.1"
I1015 21:36:01.407358       1 controller.go:125] GRPC error: <nil>
I1015 21:36:01.415984       1 controller.go:428] CreateVolumeRequest {Name:pvc-4a34733f-d0c2-11e8-81f1-0a75e9a76798 CapacityRange:required_bytes:4294967296  VolumeCapabilities:[mount:<> access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[] ControllerCreateSecrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1c" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1a" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1b" > >  XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1015 21:36:01.416051       1 controller.go:121] GRPC call: /csi.v0.Controller/CreateVolume
I1015 21:36:01.416057       1 controller.go:122] GRPC request: name:"pvc-4a34733f-d0c2-11e8-81f1-0a75e9a76798" capacity_range:<required_bytes:4294967296 > volume_capabilities:<mount:<> access_mode:<mode:SINGLE_NODE_WRITER > > accessibility_requirements:<requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1c" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1a" > > requisite:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1b" > > >
I1015 21:36:01.774456       1 controller.go:124] GRPC response: volume:<capacity_bytes:4294967296 id:"vol-0f1667fc673302e81" accessible_topology:<segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1c" > > >
I1015 21:36:01.774530       1 controller.go:125] GRPC error: <nil>
I1015 21:36:01.774543       1 controller.go:484] create volume rep: {CapacityBytes:4294967296 Id:vol-0f1667fc673302e81 Attributes:map[] ContentSource:<nil> AccessibleTopology:[segments:<key:"com.amazon.aws.csi.ebs/zone" value:"us-east-1c" > ] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1015 21:36:01.774576       1 controller.go:546] successfully created PV {GCEPersistentDisk:nil AWSElasticBlockStore:nil HostPath:nil Glusterfs:nil NFS:nil RBD:nil ISCSI:nil Cinder:nil CephFS:nil FC:nil Flocker:nil FlexVolume:nil AzureFile:nil VsphereVolume:nil Quobyte:nil AzureDisk:nil PhotonPersistentDisk:nil PortworxVolume:nil ScaleIO:nil Local:nil StorageOS:nil CSI:&CSIPersistentVolumeSource{Driver:com.amazon.aws.csi.ebs,VolumeHandle:vol-0f1667fc673302e81,ReadOnly:false,FSType:ext4,VolumeAttributes:map[string]string{storage.kubernetes.io/csiProvisionerIdentity: 1539635296092-8081-com.amazon.aws.csi.ebs,},ControllerPublishSecretRef:nil,NodeStageSecretRef:nil,NodePublishSecretRef:nil,}}
I1015 21:36:01.774624       1 controller.go:1091] provision "default/late-claim" class "late-sc": volume "pvc-4a34733f-d0c2-11e8-81f1-0a75e9a76798" provisioned
I1015 21:36:01.774657       1 controller.go:1105] provision "default/late-claim" class "late-sc": trying to save persistentvvolume "pvc-4a34733f-d0c2-11e8-81f1-0a75e9a76798"
I1015 21:36:01.794182       1 controller.go:1112] provision "default/late-claim" class "late-sc": persistentvolume "pvc-4a34733f-d0c2-11e8-81f1-0a75e9a76798" saved
I1015 21:36:01.794214       1 controller.go:1153] provision "default/late-claim" class "late-sc": succeeded
I1015 21:36:01.794300       1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"late-claim", UID:"4a34733f-d0c2-11e8-81f1-0a75e9a76798", APIVersion:"v1", ResourceVersion:"19029", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-4a34733f-d0c2-11e8-81f1-0a75e9a76798

It didn't contain preferred either.

leakingtapan commented 5 years ago

can you also list out the feature gates you set on Kubernetes and csi-external-provisioner?

I enabled CSIDriverRegistry and CSINodeInfo for kubelet and kube-apiserver.

Args for external provisioner:

          image: quay.io/k8scsi/csi-provisioner:v0.4.0
          args:
            - --provisioner=com.amazon.aws.csi.ebs
            - --csi-address=$(ADDRESS)
            - --v=5
verult commented 5 years ago

That's strange... after creating the PVC and the pod, do you see a selected-node annotation in the PVC object?
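A quick way to check for the annotation is to read it out of the PVC's JSON. This is a minimal sketch that assumes the annotation key is volume.kubernetes.io/selected-node and that the input is the output of kubectl get pvc late-claim -o json; the function name is made up for illustration.

```python
import json

# Annotation the scheduler is expected to set on a delayed-binding PVC
# once a node has been chosen for the pod (key assumed as noted above).
SELECTED_NODE_ANNOTATION = "volume.kubernetes.io/selected-node"


def selected_node(pvc_json):
    """Return the selected-node annotation value from a PVC's JSON, or None."""
    pvc = json.loads(pvc_json)
    # "annotations" may be missing or null on a PVC that has none.
    annotations = pvc.get("metadata", {}).get("annotations") or {}
    return annotations.get(SELECTED_NODE_ANNOTATION)
```

A return value of None means the annotation is absent, in which case the provisioner has no selected node from which to derive a preferred topology.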

leakingtapan commented 5 years ago

Unfortunately, this information is no longer available because I didn't describe the PVC when the bug happened. After that I retried pod creation and everything worked again. I will need to reproduce the bug in order to find out.

leakingtapan commented 5 years ago

In the working case, though, selected-node does show up with the correctly assigned node.

msau42 commented 5 years ago

To more easily repro the issue, you can try creating multiple PVCs in one Pod

msau42 commented 5 years ago

(maybe like 10+ PVCs)
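The suggestion above can be sketched as a small generator that emits ten PVCs and one Pod mounting all of them, as a single JSON List that kubectl can apply. The claim names, volume names, mount paths, and pod name are assumptions; the StorageClass name, size, and image are taken from the manifests earlier in this issue.

```python
import json


def multi_pvc_manifest(count=10, storage_class="late-sc", size="4Gi"):
    """Build a v1 List holding `count` PVCs and one Pod that mounts them all."""
    claims, volumes, mounts = [], [], []
    for i in range(count):
        claim_name = f"late-claim-{i}"  # hypothetical naming scheme
        claims.append({
            "apiVersion": "v1",
            "kind": "PersistentVolumeClaim",
            "metadata": {"name": claim_name},
            "spec": {
                "accessModes": ["ReadWriteOnce"],
                "storageClassName": storage_class,
                "resources": {"requests": {"storage": size}},
            },
        })
        volumes.append({
            "name": f"storage-{i}",
            "persistentVolumeClaim": {"claimName": claim_name},
        })
        mounts.append({"name": f"storage-{i}", "mountPath": f"/data/{i}"})

    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "app-multi"},  # hypothetical pod name
        "spec": {
            "containers": [{
                "name": "app",
                "image": "centos",
                "command": ["/bin/sh"],
                "args": ["-c", "while true; do sleep 5; done"],
                "volumeMounts": mounts,
            }],
            "volumes": volumes,
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": claims + [pod]}


if __name__ == "__main__":
    print(json.dumps(multi_pvc_manifest(), indent=2))
```

The printed JSON can be piped to kubectl apply -f - to create everything in one step; with more claims per pod, a request that lacks a preferred topology has more chances to land a volume in the wrong zone.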

verult commented 5 years ago

I did a test with the latest of k8s release-1.12 branch (cluster started with AllAlpha=true) and external-provisioner 0.4.0 (with --feature-gates=Topology=true), and the provisioner did call the CSI driver with preferred:

I1018 22:30:45.278889       1 controller.go:432] CreateVolumeRequest {Name:pvc-74359b21-d325-11e8-a631-42010a800002 CapacityRange:required_bytes:6442450944  VolumeCapabilities:[mount:<> access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:pd-standard] ControllerCreateSecrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:requisite:<segments:<key:"com.google.topology/zone" value:"us-central1-b" > > preferred:<segments:<key:"com.google.topology/zone" value:"us-central1-b" > >  XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}

external-attacher and driver-registrar are also at 0.4.0. This is a single-zone cluster.

Steps:

StorageClass:

apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: csi-gce-pd
provisioner: com.google.csi.gcepd
parameters:
  type: pd-standard
volumeBindingMode: WaitForFirstConsumer

PVC:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: podpvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: csi-gce-pd
  resources:
    requests:
      storage: 6Gi

Pod:

apiVersion: v1
kind: Pod
metadata:
  name: web-server
spec:
  containers:
   - name: web-server
     image: nginx
     volumeMounts:
       - mountPath: /var/lib/www/html
         name: mypvc
  volumes:
   - name: mypvc
     persistentVolumeClaim:
       claimName: podpvc
       readOnly: false

BTW looks like the csi-provisioner 0.4.0 image was updated 2 days ago...

msau42 commented 5 years ago

@verult do you mean 0.4.1?

msau42 commented 5 years ago

Can you try a multi-zone cluster?

leakingtapan commented 5 years ago

After upgrading to v0.4.1 for provisioner/attacher/registrar, I can see preferred too.

msau42 commented 5 years ago

Thanks, can we close this issue then?

leakingtapan commented 5 years ago

I haven't seen this issue again. /close

k8s-ci-robot commented 5 years ago

@leakingtapan: Closing this issue.
