openebs / mayastor

Dynamically provision Stateful Persistent Replicated Cluster-wide Fabric Volumes & Filesystems for Kubernetes, provisioned from an optimized NVMe SPDK backend data storage stack.
Apache License 2.0

Unable to attach or mount volumes on AWS EKS #306

Closed FestivalBobcats closed 4 years ago

FestivalBobcats commented 4 years ago

Describe the bug

This affects both v0.1.0 and v0.2.0. Following the steps outlined in the deploy README, I'm able to create the necessary Mayastor services, but volumes fail to mount with the following errors:

```
MountVolume.MountDevice failed for volume "pvc-fb3242d3-4c69-4ddd-a429-21a50e2ade76" : rpc error: code = Internal desc = Failed to mount ext4 fs on /dev/sda with opts "" to /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-fb3242d3-4c69-4ddd-a429-21a50e2ade76/globalmount: No such device (os error 19)
Unable to attach or mount volumes: unmounted volumes=[ms-volume], unattached volumes=[ms-volume default-token-6qqk4]: timed out waiting for the condition
```

To Reproduce

```bash
cat <<EOF | kubectl apply -f -
apiVersion: "openebs.io/v1alpha1"
kind: MayastorPool
metadata:
  name: pool-on-node-1
  namespace: mayastor
spec:
  node: $NODE_NAME
  disks: ["$DISK_NAME"]
EOF
```

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: mayastor-iscsi
parameters:
  repl: '1'
  protocol: 'iscsi'
provisioner: io.openebs.csi-mayastor
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ms-volume-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: mayastor-iscsi
---
kind: Pod
apiVersion: v1
metadata:
  name: fio
spec:
  volumes:
    - name: ms-volume
      persistentVolumeClaim:
        claimName: ms-volume-claim
  containers:
    - name: fio
      image: dmonakhov/alpine-fio
      args:
        - sleep
        - "1000000"
      volumeMounts:
        - mountPath: "/volume"
          name: ms-volume
```

Expected behavior

The volume mounts, the container is created, and the Pod starts.

OS info (please complete the following information):

jkryl commented 4 years ago

Hi @FestivalBobcats! Thanks for submitting the ticket. Before we try to reproduce your issue on AWS (we don't regularly test on AWS, although we plan to in the future), could you share the log file from the mayastor-grpc container in the mayastor pod? This container is responsible for mounting iSCSI volumes. Also, if there are any error or warning messages in the other Mayastor log files (the moac and mayastor pods being the most important ones), it would be good to send them our way. As a workaround, you could try NBD instead of iSCSI (a different storage class).
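
For reference, a minimal sketch of such an NBD-based StorageClass, mirroring the iSCSI one from the reproduction steps above; it assumes the `protocol` parameter of this Mayastor release accepts `nbd` (check the deploy examples shipped with your version):

```bash
# Hypothetical NBD StorageClass: identical to the mayastor-iscsi class
# above except for the name and the protocol parameter. Verify that
# 'nbd' is a supported protocol value for your Mayastor release.
cat <<EOF | kubectl apply -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: mayastor-nbd
parameters:
  repl: '1'
  protocol: 'nbd'
provisioner: io.openebs.csi-mayastor
EOF
```

A PVC would then set `storageClassName: mayastor-nbd` instead of `mayastor-iscsi`.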

FestivalBobcats commented 4 years ago

@jkryl

mayastor-grpc:

```
[2020-07-04T15:58:08Z INFO  mayastor_agent] Agent bound to CSI at /csi/csi.sock
[2020-07-04T15:58:11Z DEBUG mayastor_agent::identity] GetPluginInfo request (io.openebs.csi-mayastor:0.1)
[2020-07-04T15:58:12Z DEBUG mayastor_agent::node] NodeGetInfo request: ID=mayastor://ip-10-0-1-198.us-west-2.compute.internal/10.0.1.198:10124, max volumes=0
[2020-07-04T16:01:07Z DEBUG mayastor_agent::node] NodeGetCapabilities request: [GetVolumeStats, StageUnstageVolume]
[2020-07-04T16:01:07Z DEBUG mayastor_agent::node] Staging volume ddae463c-d2f8-42aa-9626-7abc6045de08 to /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-ddae463c-d2f8-42aa-9626-7abc6045de08/globalmount
[2020-07-04T16:01:07Z DEBUG mayastor_agent::node] URI is iscsi://10.0.1.198:3260/iqn.2019-05.io.openebs:nexus-ddae463c-d2f8-42aa-9626-7abc6045de08/0
[2020-07-04T16:01:07Z DEBUG mayastor_agent::node::iscsiutil] Using /bin/mayastor-iscsiadm for iscsiadm
[2020-07-04T16:01:09Z DEBUG mayastor_agent::node] device_path is /dev/sda for uri iscsi://10.0.1.198:3260/iqn.2019-05.io.openebs:nexus-ddae463c-d2f8-42aa-9626-7abc6045de08/0
[2020-07-04T16:01:09Z DEBUG mayastor_agent::format] Formatting device /dev/sda with a ext4 filesystem
[2020-07-04T16:01:09Z INFO  mayastor_agent::format] Device /dev/sda formatted with ext4 filesystem
[2020-07-04T16:01:09Z DEBUG mayastor_agent::mount] Mounting /dev/sda ...
[2020-07-04T16:01:10Z DEBUG mayastor_agent::node] NodeGetCapabilities request: [GetVolumeStats, StageUnstageVolume]
[2020-07-04T16:01:10Z DEBUG mayastor_agent::node] Staging volume ddae463c-d2f8-42aa-9626-7abc6045de08 to /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-ddae463c-d2f8-42aa-9626-7abc6045de08/globalmount
[2020-07-04T16:01:10Z DEBUG mayastor_agent::node] URI is iscsi://10.0.1.198:3260/iqn.2019-05.io.openebs:nexus-ddae463c-d2f8-42aa-9626-7abc6045de08/0
[2020-07-04T16:01:10Z DEBUG mayastor_agent::node] device_path is /dev/sda for uri iscsi://10.0.1.198:3260/iqn.2019-05.io.openebs:nexus-ddae463c-d2f8-42aa-9626-7abc6045de08/0
[2020-07-04T16:01:10Z INFO  mayastor_agent::format] Skipping format: device /dev/sda contains a preexisting ext4 filesystem
[2020-07-04T16:01:10Z DEBUG mayastor_agent::mount] Mounting /dev/sda ...
[2020-07-04T16:01:11Z DEBUG mayastor_agent::node] NodeGetCapabilities request: [GetVolumeStats, StageUnstageVolume]
[2020-07-04T16:01:11Z DEBUG mayastor_agent::node] Staging volume ddae463c-d2f8-42aa-9626-7abc6045de08 to /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-ddae463c-d2f8-42aa-9626-7abc6045de08/globalmount
[2020-07-04T16:01:11Z DEBUG mayastor_agent::node] URI is iscsi://10.0.1.198:3260/iqn.2019-05.io.openebs:nexus-ddae463c-d2f8-42aa-9626-7abc6045de08/0
[2020-07-04T16:01:11Z DEBUG mayastor_agent::node] device_path is /dev/sda for uri iscsi://10.0.1.198:3260/iqn.2019-05.io.openebs:nexus-ddae463c-d2f8-42aa-9626-7abc6045de08/0
[2020-07-04T16:01:11Z INFO  mayastor_agent::format] Skipping format: device /dev/sda contains a preexisting ext4 filesystem

moac csi-provisioner:

```
I0704 15:57:37.929928       1 feature_gate.go:243] feature gates: &{map[Topology:true]}
I0704 15:57:37.929985       1 csi-provisioner.go:107] Version: v1.6.0-0-g321fa5c1c
I0704 15:57:37.930003       1 csi-provisioner.go:121] Building kube configs for running in cluster...
I0704 15:57:37.936333       1 connection.go:153] Connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
W0704 15:57:47.936487       1 connection.go:172] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
I0704 15:57:56.075797       1 common.go:111] Probing CSI driver for readiness
I0704 15:57:56.077416       1 csi-provisioner.go:163] Detected CSI driver io.openebs.csi-mayastor
W0704 15:57:56.077425       1 metrics.go:142] metrics endpoint will not be started because `metrics-address` was not specified.
I0704 15:57:56.079159       1 controller.go:709] Using saving PVs to API server in background
I0704 15:57:56.179333       1 controller.go:799] Starting provisioner controller io.openebs.csi-mayastor_moac-5cfb86798-kjzzf_a9369a2a-6c43-4db9-a468-c7a0663efb52!
I0704 15:57:56.179384       1 clone_controller.go:58] Starting CloningProtection controller
I0704 15:57:56.179525       1 clone_controller.go:74] Started CloningProtection controller
I0704 15:57:56.179963       1 volume_store.go:97] Starting save volume queue
I0704 15:57:56.281885       1 controller.go:848] Started provisioner controller io.openebs.csi-mayastor_moac-5cfb86798-kjzzf_a9369a2a-6c43-4db9-a468-c7a0663efb52!
I0704 16:01:02.781734       1 controller.go:1284] provision "default/ms-volume-claim" class "mayastor-iscsi": started
W0704 16:01:02.785339       1 topology.go:343] No topology keys found on any node
I0704 16:01:02.786413       1 event.go:281] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"ms-volume-claim", UID:"ddae463c-d2f8-42aa-9626-7abc6045de08", APIVersion:"v1", ResourceVersion:"8998", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/ms-volume-claim"
I0704 16:01:02.857431       1 controller.go:726] successfully created PV pvc-ddae463c-d2f8-42aa-9626-7abc6045de08 for PVC ms-volume-claim and csi volume name ddae463c-d2f8-42aa-9626-7abc6045de08
I0704 16:01:02.857637       1 controller.go:1392] provision "default/ms-volume-claim" class "mayastor-iscsi": volume "pvc-ddae463c-d2f8-42aa-9626-7abc6045de08" provisioned
I0704 16:01:02.857796       1 controller.go:1409] provision "default/ms-volume-claim" class "mayastor-iscsi": succeeded
I0704 16:01:02.868133       1 controller.go:1284] provision "default/ms-volume-claim" class "mayastor-iscsi": started
I0704 16:01:02.872878       1 event.go:281] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"ms-volume-claim", UID:"ddae463c-d2f8-42aa-9626-7abc6045de08", APIVersion:"v1", ResourceVersion:"8998", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-ddae463c-d2f8-42aa-9626-7abc6045de08
I0704 16:01:02.887073       1 controller.go:1293] provision "default/ms-volume-claim" class "mayastor-iscsi": persistentvolume "pvc-ddae463c-d2f8-42aa-9626-7abc6045de08" already exists, skipping
```

moac:

```
(node:1) Warning: mayastor.proto not found in any of the include paths ./node_modules/protobufjs
Jul 04 15:57:54.983 info [csi]: CSI server listens at /var/lib/csi/sockets/pluginproxy/csi.sock
Jul 04 15:57:54.988 debug [nats]: Connecting to NATS at "nats" ...
Jul 04 15:57:54.995 debug: Loading openAPI spec from the server
Jul 04 15:57:55.020 info [nats]: Connected to NATS message bus at "nats"
Jul 04 15:57:55.095 info [csi-node-operator]: Initializing node operator
Jul 04 15:57:55.107 info [watcher]: node watcher sync completed
Jul 04 15:57:55.108 info [node-operator]: Initializing node operator
Jul 04 15:57:55.118 info [node-operator]: Created CRD MayastorNode
Jul 04 15:57:55.155 info [watcher]: node watcher sync completed
Jul 04 15:57:55.156 info [pool-operator]: Initializing pool operator
Jul 04 15:57:55.174 debug [watcher]: Ignoring stale pool object event
Jul 04 15:57:55.174 info [watcher]: pool watcher sync completed
Jul 04 15:57:55.174 info [volume-operator]: Initializing volume operator
Jul 04 15:57:55.182 info [volume-operator]: Created CRD MayastorVolume
Jul 04 15:57:55.210 info [watcher]: volume watcher sync completed
Jul 04 15:57:55.215 info [api]: API server listening on port 4000
Jul 04 15:57:55.215 info: MOAC is up and ready to 🚀
Jul 04 15:57:55.327 debug [csi]: probe request (ready=true)
Jul 04 15:57:55.330 debug [csi]: getPluginInfo request (name=io.openebs.csi-mayastor, version=0.1)
Jul 04 15:57:55.333 debug [csi]: getPluginCapabilities request: CONTROLLER_SERVICE, VOLUME_ACCESSIBILITY_CONSTRAINTS
Jul 04 15:57:55.338 debug [csi]: get capabilities request: CREATE_DELETE_VOLUME, PUBLISH_UNPUBLISH_VOLUME, LIST_VOLUMES, GET_CAPACITY
Jul 04 15:57:55.341 debug [csi]: get capabilities request: CREATE_DELETE_VOLUME, PUBLISH_UNPUBLISH_VOLUME, LIST_VOLUMES, GET_CAPACITY
Jul 04 15:57:56.076 debug [csi]: probe request (ready=true)
Jul 04 15:57:56.077 debug [csi]: getPluginInfo request (name=io.openebs.csi-mayastor, version=0.1)
Jul 04 15:57:56.077 debug [csi]: getPluginCapabilities request: CONTROLLER_SERVICE, VOLUME_ACCESSIBILITY_CONSTRAINTS
Jul 04 15:57:56.078 debug [csi]: get capabilities request: CREATE_DELETE_VOLUME, PUBLISH_UNPUBLISH_VOLUME, LIST_VOLUMES, GET_CAPACITY
Jul 04 15:58:11.515 warn [csi-node-operator]: Unknown mayastor node "ip-10-0-2-143.us-west-2.compute.internal" removed
Jul 04 15:58:11.520 info [node]: new mayastor node "ip-10-0-2-143.us-west-2.compute.internal" with endpoint "10.0.2.143:10124"
Jul 04 15:58:11.522 debug [node]: Syncing the node "ip-10-0-2-143.us-west-2.compute.internal"
Jul 04 15:58:11.526 info [node-operator]: Creating node resource "ip-10-0-2-143.us-west-2.compute.internal"
Jul 04 15:58:11.528 info [registry]: mayastor on node "ip-10-0-2-143.us-west-2.compute.internal" and endpoint "10.0.2.143:10124" just joined
Jul 04 15:58:11.541 debug [node]: The node "ip-10-0-2-143.us-west-2.compute.internal" was successfully synced
Jul 04 15:58:11.541 warn [node-operator]: Wanted to update state of node resource "ip-10-0-2-143.us-west-2.compute.internal" that disappeared
Jul 04 15:58:11.543 debug [node-operator]: Updating status of node resource "ip-10-0-2-143.us-west-2.compute.internal"
Jul 04 15:58:11.553 debug [watcher]: Ignoring stale node object event
Jul 04 15:58:11.709 warn [csi-node-operator]: Unknown mayastor node "ip-10-0-2-172.us-west-2.compute.internal" removed
Jul 04 15:58:11.714 info [node]: new mayastor node "ip-10-0-2-172.us-west-2.compute.internal" with endpoint "10.0.2.172:10124"
Jul 04 15:58:11.714 debug [node]: Syncing the node "ip-10-0-2-172.us-west-2.compute.internal"
Jul 04 15:58:11.715 info [node-operator]: Creating node resource "ip-10-0-2-172.us-west-2.compute.internal"
Jul 04 15:58:11.716 info [registry]: mayastor on node "ip-10-0-2-172.us-west-2.compute.internal" and endpoint "10.0.2.172:10124" just joined
Jul 04 15:58:11.724 debug [node]: The node "ip-10-0-2-172.us-west-2.compute.internal" was successfully synced
Jul 04 15:58:11.724 warn [node-operator]: Wanted to update state of node resource "ip-10-0-2-172.us-west-2.compute.internal" that disappeared
Jul 04 15:58:11.727 debug [node-operator]: Updating status of node resource "ip-10-0-2-172.us-west-2.compute.internal"
Jul 04 15:58:11.737 debug [watcher]: Ignoring stale node object event
Jul 04 15:58:12.114 warn [csi-node-operator]: Unknown mayastor node "ip-10-0-1-198.us-west-2.compute.internal" removed
Jul 04 15:58:12.119 info [node]: new mayastor node "ip-10-0-1-198.us-west-2.compute.internal" with endpoint "10.0.1.198:10124"
Jul 04 15:58:12.119 debug [node]: Syncing the node "ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 15:58:12.120 info [node-operator]: Creating node resource "ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 15:58:12.121 info [registry]: mayastor on node "ip-10-0-1-198.us-west-2.compute.internal" and endpoint "10.0.1.198:10124" just joined
Jul 04 15:58:12.126 debug [node]: The node "ip-10-0-1-198.us-west-2.compute.internal" was successfully synced
Jul 04 15:58:12.127 warn [node-operator]: Wanted to update state of node resource "ip-10-0-1-198.us-west-2.compute.internal" that disappeared
Jul 04 15:58:12.132 debug [node-operator]: Updating status of node resource "ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 15:58:12.139 debug [watcher]: Ignoring stale node object event
Jul 04 15:59:11.545 debug [node]: Syncing the node "ip-10-0-2-143.us-west-2.compute.internal"
Jul 04 15:59:11.548 debug [node]: The node "ip-10-0-2-143.us-west-2.compute.internal" was successfully synced
Jul 04 15:59:11.725 debug [node]: Syncing the node "ip-10-0-2-172.us-west-2.compute.internal"
Jul 04 15:59:11.795 debug [node]: The node "ip-10-0-2-172.us-west-2.compute.internal" was successfully synced
Jul 04 15:59:12.127 debug [node]: Syncing the node "ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 15:59:12.131 debug [node]: The node "ip-10-0-1-198.us-west-2.compute.internal" was successfully synced
Jul 04 15:59:41.704 debug [pool-operator]: Updating properties of pool resource "pool-on-node-1"
Jul 04 15:59:41.713 debug [node]: Creating pool "pool-on-node-1@ip-10-0-1-198.us-west-2.compute.internal" ...
Jul 04 15:59:41.716 debug [watcher]: Ignoring stale pool object event
Jul 04 15:59:43.531 info [node]: Created pool "pool-on-node-1@ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 15:59:43.536 debug [pool]: Adding pool "pool-on-node-1" to the list of pools on "ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 15:59:43.536 debug [pool-operator]: Received "new" event for pool "pool-on-node-1"
Jul 04 15:59:43.537 debug [pool-operator]: Updating properties of pool resource "pool-on-node-1"
Jul 04 15:59:43.547 debug [watcher]: Ignoring stale pool object event
Jul 04 16:00:11.553 debug [node]: Syncing the node "ip-10-0-2-143.us-west-2.compute.internal"
Jul 04 16:00:11.555 debug [node]: The node "ip-10-0-2-143.us-west-2.compute.internal" was successfully synced
Jul 04 16:00:11.795 debug [node]: Syncing the node "ip-10-0-2-172.us-west-2.compute.internal"
Jul 04 16:00:11.798 debug [node]: The node "ip-10-0-2-172.us-west-2.compute.internal" was successfully synced
Jul 04 16:00:12.132 debug [node]: Syncing the node "ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 16:00:12.137 warn [pool]: Unexpected disk change of the pool "pool-on-node-1@ip-10-0-1-198.us-west-2.compute.internal" from /dev/nvme0n1 to /dev/nvme0n1
Jul 04 16:00:12.137 debug [node]: The node "ip-10-0-1-198.us-west-2.compute.internal" was successfully synced
Jul 04 16:01:02.796 debug [csi]: Request to create volume "pvc-ddae463c-d2f8-42aa-9626-7abc6045de08" with size 1073741824 (limit 0)
Jul 04 16:01:02.798 info [volume-operator]: Creating volume resource "ddae463c-d2f8-42aa-9626-7abc6045de08"
Jul 04 16:01:02.800 debug [volume]: Creating the volume "ddae463c-d2f8-42aa-9626-7abc6045de08"
Jul 04 16:01:02.800 debug [pool]: Creating replica "ddae463c-d2f8-42aa-9626-7abc6045de08" on the pool "pool-on-node-1@ip-10-0-1-198.us-west-2.compute.internal" ...
Jul 04 16:01:02.808 info [pool]: Created replica "ddae463c-d2f8-42aa-9626-7abc6045de08" on the pool "pool-on-node-1@ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 16:01:02.811 debug [replica]: Adding "ddae463c-d2f8-42aa-9626-7abc6045de08" to the list of replicas for the pool "pool-on-node-1@ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 16:01:02.811 debug [volume]: Replica "ddae463c-d2f8-42aa-9626-7abc6045de08@pool-on-node-1" attached to the volume
Jul 04 16:01:02.813 warn [volume-operator]: Wanted to update state of volume resource "ddae463c-d2f8-42aa-9626-7abc6045de08" that disappeared
Jul 04 16:01:02.814 debug [node]: Creating nexus "ddae463c-d2f8-42aa-9626-7abc6045de08@ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 16:01:02.817 debug [volume-operator]: Creating volume "ddae463c-d2f8-42aa-9626-7abc6045de08" in response to "new" resource event
Jul 04 16:01:02.818 debug [volume-operator]: Updating status of volume resource "ddae463c-d2f8-42aa-9626-7abc6045de08"
Jul 04 16:01:02.837 debug [watcher]: Ignoring stale volume object event
Jul 04 16:01:02.842 info [node]: Created nexus "ddae463c-d2f8-42aa-9626-7abc6045de08@ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 16:01:02.848 debug [nexus]: Adding "ddae463c-d2f8-42aa-9626-7abc6045de08" to the nexus list of node "ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 16:01:02.848 debug [volume]: Nexus "ddae463c-d2f8-42aa-9626-7abc6045de08@ip-10-0-1-198.us-west-2.compute.internal" attached to the volume
Jul 04 16:01:02.849 debug [volume]: Volume "ddae463c-d2f8-42aa-9626-7abc6045de08" enters FSA in pending state
Jul 04 16:01:02.849 info [volume]: Volume "ddae463c-d2f8-42aa-9626-7abc6045de08" is healthy
Jul 04 16:01:02.850 debug [volume-operator]: Updating status of volume resource "ddae463c-d2f8-42aa-9626-7abc6045de08"
Jul 04 16:01:02.851 info [volume]: Volume "ddae463c-d2f8-42aa-9626-7abc6045de08" with size 1073741824 was created
Jul 04 16:01:02.872 debug [watcher]: Ignoring stale volume object event
Jul 04 16:01:03.729 debug [csi]: Request to publish volume "ddae463c-d2f8-42aa-9626-7abc6045de08" on "mayastor://ip-10-0-1-198.us-west-2.compute.internal/10.0.1.198:10124"
Jul 04 16:01:03.729 info [nexus]: Publishing nexus "ddae463c-d2f8-42aa-9626-7abc6045de08@ip-10-0-1-198.us-west-2.compute.internal" with protocol=iscsi ...
Jul 04 16:01:03.733 info [nexus]: Nexus "ddae463c-d2f8-42aa-9626-7abc6045de08@ip-10-0-1-198.us-west-2.compute.internal" is published at "iscsi://10.0.1.198:3260/iqn.2019-05.io.openebs:nexus-ddae463c-d2f8-42aa-9626-7abc6045de08/0"
Jul 04 16:01:03.734 debug [volume]: Volume "ddae463c-d2f8-42aa-9626-7abc6045de08" enters FSA in healthy state
Jul 04 16:01:03.734 debug [volume-operator]: Updating status of volume resource "ddae463c-d2f8-42aa-9626-7abc6045de08"
Jul 04 16:01:03.736 debug [csi]: "ddae463c-d2f8-42aa-9626-7abc6045de08" published, got uri iscsi://10.0.1.198:3260/iqn.2019-05.io.openebs:nexus-ddae463c-d2f8-42aa-9626-7abc6045de08/0
Jul 04 16:01:03.736 info [csi]: Published volume "ddae463c-d2f8-42aa-9626-7abc6045de08, share protocol: "iscsi"
Jul 04 16:01:03.748 debug [watcher]: Ignoring stale volume object event
Jul 04 16:01:11.557 debug [node]: Syncing the node "ip-10-0-2-143.us-west-2.compute.internal"
Jul 04 16:01:11.584 debug [node]: The node "ip-10-0-2-143.us-west-2.compute.internal" was successfully synced
Jul 04 16:01:11.798 debug [node]: Syncing the node "ip-10-0-2-172.us-west-2.compute.internal"
Jul 04 16:01:11.801 debug [node]: The node "ip-10-0-2-172.us-west-2.compute.internal" was successfully synced
Jul 04 16:01:12.138 debug [node]: Syncing the node "ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 16:01:12.141 warn [pool]: Unexpected disk change of the pool "pool-on-node-1@ip-10-0-1-198.us-west-2.compute.internal" from /dev/nvme0n1 to /dev/nvme0n1
Jul 04 16:01:12.141 debug [pool-operator]: Received "mod" event for pool "pool-on-node-1"
Jul 04 16:01:12.142 debug [pool-operator]: Updating properties of pool resource "pool-on-node-1"
Jul 04 16:01:12.143 debug [node]: The node "ip-10-0-1-198.us-west-2.compute.internal" was successfully synced
Jul 04 16:01:12.154 debug [watcher]: Ignoring stale pool object event
Jul 04 16:02:11.585 debug [node]: Syncing the node "ip-10-0-2-143.us-west-2.compute.internal"
Jul 04 16:02:11.587 debug [node]: The node "ip-10-0-2-143.us-west-2.compute.internal" was successfully synced
Jul 04 16:02:11.801 debug [node]: Syncing the node "ip-10-0-2-172.us-west-2.compute.internal"
Jul 04 16:02:11.804 debug [node]: The node "ip-10-0-2-172.us-west-2.compute.internal" was successfully synced
Jul 04 16:02:12.145 debug [node]: Syncing the node "ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 16:02:12.148 warn [pool]: Unexpected disk change of the pool "pool-on-node-1@ip-10-0-1-198.us-west-2.compute.internal" from /dev/nvme0n1 to /dev/nvme0n1
Jul 04 16:02:12.148 debug [node]: The node "ip-10-0-1-198.us-west-2.compute.internal" was successfully synced
Jul 04 16:03:11.592 debug [node]: Syncing the node "ip-10-0-2-143.us-west-2.compute.internal"
Jul 04 16:03:11.594 debug [node]: The node "ip-10-0-2-143.us-west-2.compute.internal" was successfully synced
Jul 04 16:03:11.804 debug [node]: Syncing the node "ip-10-0-2-172.us-west-2.compute.internal"
Jul 04 16:03:11.807 debug [node]: The node "ip-10-0-2-172.us-west-2.compute.internal" was successfully synced
Jul 04 16:03:12.149 debug [node]: Syncing the node "ip-10-0-1-198.us-west-2.compute.internal"
Jul 04 16:03:12.153 warn [pool]: Unexpected disk change of the pool "pool-on-node-1@ip-10-0-1-198.us-west-2.compute.internal" from /dev/nvme0n1 to /dev/nvme0n1
```

blaisedias commented 4 years ago

Hi, I am unable to reproduce this failure on AWS. I am using a t3.medium instance with a single node. I attached a 10GB EBS volume to the node, where it shows up as /dev/nvme1n1. I successfully provisioned a volume using Mayastor and ran fio.
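
In case anyone wants to repeat this check, one way to exercise the volume is to exec into the fio test pod from the reproduction steps (the pod only sleeps, so fio has to be invoked manually); the exact fio flags below are illustrative, not necessarily the ones used in this run:

```bash
# Short mixed random read/write test against the Mayastor-backed mount
# inside the fio pod; flag values are illustrative.
kubectl exec -it fio -- \
  fio --name=benchtest --filename=/volume/test --size=500m \
      --direct=1 --rw=randrw --ioengine=libaio --bs=4k \
      --iodepth=16 --numjobs=1 --time_based --runtime=60
```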

For completeness, this is the mayastor-grpc log from the AWS node. A Mayastor iSCSI volume was created, fio was run, the volume was destroyed, and another Mayastor iSCSI volume was created:

```
[2020-07-07T14:38:26Z INFO  mayastor_agent] Agent bound to CSI at /csi/csi.sock
[2020-07-07T14:38:29Z DEBUG mayastor_agent::identity] GetPluginInfo request (io.openebs.csi-mayastor:0.1)
[2020-07-07T14:38:29Z DEBUG mayastor_agent::node] NodeGetInfo request: ID=mayastor://ip-192-168-49-11.eu-west-2.compute.internal/192.168.49.11:10124, max volumes=0
[2020-07-07T14:42:15Z DEBUG mayastor_agent::node] NodeGetCapabilities request: [GetVolumeStats, StageUnstageVolume]
[2020-07-07T14:42:15Z DEBUG mayastor_agent::node] Staging volume f9d999e4-1629-4836-8702-463882b08dec to /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-f9d999e4-1629-4836-8702-463882b08dec/globalmount
[2020-07-07T14:42:15Z DEBUG mayastor_agent::node] URI is iscsi://192.168.49.11:3260/iqn.2019-05.io.openebs:nexus-f9d999e4-1629-4836-8702-463882b08dec/0
[2020-07-07T14:42:15Z DEBUG mayastor_agent::node::iscsiutil] Using /bin/mayastor-iscsiadm for iscsiadm
[2020-07-07T14:42:17Z DEBUG mayastor_agent::node] device_path is /dev/sda for uri iscsi://192.168.49.11:3260/iqn.2019-05.io.openebs:nexus-f9d999e4-1629-4836-8702-463882b08dec/0
[2020-07-07T14:42:17Z DEBUG mayastor_agent::format] Formatting device /dev/sda with a ext4 filesystem
[2020-07-07T14:42:17Z INFO  mayastor_agent::format] Device /dev/sda formatted with ext4 filesystem
[2020-07-07T14:42:17Z DEBUG mayastor_agent::mount] Mounting /dev/sda ...
[2020-07-07T14:42:17Z INFO  mayastor_agent::mount] Mounted ext4 fs on /dev/sda with opts "" to /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-f9d999e4-1629-4836-8702-463882b08dec/globalmount
[2020-07-07T14:42:17Z DEBUG mayastor_agent::node] NodeGetCapabilities request: [GetVolumeStats, StageUnstageVolume]
[2020-07-07T14:42:17Z DEBUG mayastor_agent::mount] Mounting /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-f9d999e4-1629-4836-8702-463882b08dec/globalmount ...
[2020-07-07T14:42:17Z INFO  mayastor_agent::mount] Mounted ext4 fs on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-f9d999e4-1629-4836-8702-463882b08dec/globalmount with opts "rw" to /var/lib/kubelet/pods/847f1ed0-90de-40e9-bd5f-2ff39ddbb953/volumes/kubernetes.io~csi/pvc-f9d999e4-1629-4836-8702-463882b08dec/mount
[2020-07-07T14:42:17Z INFO  mayastor_agent::node] Published volume f9d999e4-1629-4836-8702-463882b08dec
[2020-07-07T14:42:27Z DEBUG mayastor_agent::node] NodeGetCapabilities request: [GetVolumeStats, StageUnstageVolume]
[2020-07-07T14:43:02Z DEBUG mayastor_agent::node] Unmount volume f9d999e4-1629-4836-8702-463882b08dec at /var/lib/kubelet/pods/847f1ed0-90de-40e9-bd5f-2ff39ddbb953/volumes/kubernetes.io~csi/pvc-f9d999e4-1629-4836-8702-463882b08dec/mount...
[2020-07-07T14:43:02Z DEBUG mayastor_agent::mount] Unmounting /var/lib/kubelet/pods/847f1ed0-90de-40e9-bd5f-2ff39ddbb953/volumes/kubernetes.io~csi/pvc-f9d999e4-1629-4836-8702-463882b08dec/mount (flags=FORCE) ...
[2020-07-07T14:43:02Z INFO  mayastor_agent::mount] Filesystem at /var/lib/kubelet/pods/847f1ed0-90de-40e9-bd5f-2ff39ddbb953/volumes/kubernetes.io~csi/pvc-f9d999e4-1629-4836-8702-463882b08dec/mount has been unmounted
[2020-07-07T14:43:02Z INFO  mayastor_agent::node] Unpublished volume f9d999e4-1629-4836-8702-463882b08dec at /var/lib/kubelet/pods/847f1ed0-90de-40e9-bd5f-2ff39ddbb953/volumes/kubernetes.io~csi/pvc-f9d999e4-1629-4836-8702-463882b08dec/mount
[2020-07-07T14:43:02Z DEBUG mayastor_agent::node] NodeGetCapabilities request: [GetVolumeStats, StageUnstageVolume]
[2020-07-07T14:43:02Z DEBUG mayastor_agent::node] Unstaging volume f9d999e4-1629-4836-8702-463882b08dec at /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-f9d999e4-1629-4836-8702-463882b08dec/globalmount
[2020-07-07T14:43:02Z DEBUG mayastor_agent::node] unstage: is iSCSI device path is /dev/sda
[2020-07-07T14:43:02Z DEBUG mayastor_agent::node] unstage: device_path /dev/sda NexusIscsi
[2020-07-07T14:43:02Z DEBUG mayastor_agent::mount] Unmounting /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-f9d999e4-1629-4836-8702-463882b08dec/globalmount (flags=DETACH) ...
[2020-07-07T14:43:02Z INFO  mayastor_agent::mount] Filesystem at /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-f9d999e4-1629-4836-8702-463882b08dec/globalmount has been unmounted
[2020-07-07T14:43:02Z DEBUG mayastor_agent::node] unstage: iscsi detach /dev/sda
[2020-07-07T14:55:07Z DEBUG mayastor_agent::node] NodeGetCapabilities request: [GetVolumeStats, StageUnstageVolume]
[2020-07-07T14:55:07Z DEBUG mayastor_agent::node] Staging volume 903c7ba0-20cc-483f-9058-f96bae73349a to /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-903c7ba0-20cc-483f-9058-f96bae73349a/globalmount
[2020-07-07T14:55:07Z DEBUG mayastor_agent::node] URI is iscsi://192.168.49.11:3260/iqn.2019-05.io.openebs:nexus-903c7ba0-20cc-483f-9058-f96bae73349a/0
[2020-07-07T14:55:09Z DEBUG mayastor_agent::node] device_path is /dev/sda for uri iscsi://192.168.49.11:3260/iqn.2019-05.io.openebs:nexus-903c7ba0-20cc-483f-9058-f96bae73349a/0
[2020-07-07T14:55:09Z DEBUG mayastor_agent::format] Formatting device /dev/sda with a ext4 filesystem
[2020-07-07T14:55:09Z INFO  mayastor_agent::format] Device /dev/sda formatted with ext4 filesystem
[2020-07-07T14:55:09Z DEBUG mayastor_agent::mount] Mounting /dev/sda ...
[2020-07-07T14:55:09Z INFO  mayastor_agent::mount] Mounted ext4 fs on /dev/sda with opts "" to /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-903c7ba0-20cc-483f-9058-f96bae73349a/globalmount
[2020-07-07T14:55:09Z DEBUG mayastor_agent::node] NodeGetCapabilities request: [GetVolumeStats, StageUnstageVolume]
[2020-07-07T14:55:09Z DEBUG mayastor_agent::mount] Mounting /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-903c7ba0-20cc-483f-9058-f96bae73349a/globalmount ...
[2020-07-07T14:55:09Z INFO  mayastor_agent::mount] Mounted ext4 fs on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-903c7ba0-20cc-483f-9058-f96bae73349a/globalmount with opts "rw" to /var/lib/kubelet/pods/e0785269-d6f0-4c5f-828c-6012dde7dde5/volumes/kubernetes.io~csi/pvc-903c7ba0-20cc-483f-9058-f96bae73349a/mount
[2020-07-07T14:55:09Z INFO  mayastor_agent::node] Published volume 903c7ba0-20cc-483f-9058-f96bae73349a
```

Can you check the details of the Mayastor pool? For the cluster I used, I get:

```
kubectl -n mayastor get msp -o yaml
apiVersion: v1
items:
- apiVersion: openebs.io/v1alpha1
  kind: MayastorPool
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"openebs.io/v1alpha1","kind":"MayastorPool","metadata":{"annotations":{},"name":"pool-ip-192-168-49-11.eu-west-2.compute.internal","namespace":"mayastor"},"spec":{"disks":["/dev/nvme1n1"],"node":"ip-192-168-49-11.eu-west-2.compute.internal"}}
    creationTimestamp: "2020-07-07T14:39:55Z"
    generation: 1
    name: pool-ip-192-168-49-11.eu-west-2.compute.internal
    namespace: mayastor
    resourceVersion: "5723"
    selfLink: /apis/openebs.io/v1alpha1/namespaces/mayastor/mayastorpools/pool-ip-192-168-49-11.eu-west-2.compute.internal
    uid: 0dbc0889-75f1-44bc-8326-1c954a165398
  spec:
    disks:
    - /dev/nvme1n1
    node: ip-192-168-49-11.eu-west-2.compute.internal
  status:
    capacity: 10724835328
    reason: ""
    state: online
    used: 67108864
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
blaisedias commented 4 years ago

I have also tried with an m5ad.large instance, and it worked for me.

FestivalBobcats commented 4 years ago

Turns out I was referencing the wrong device name in my storage pool. Apologies for the trouble!
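
For future readers who hit the same symptom, a quick way to catch this class of mistake is to compare the block devices actually present on the node with the disk referenced by the MayastorPool. The commands below are a sketch using names from this thread (the node access method, pool name, and device are placeholders):

```bash
# On the storage node (e.g. via SSH or a debug pod), list block devices
# and confirm the intended data disk exists under the expected name.
lsblk -d -o NAME,SIZE,TYPE,MOUNTPOINT

# Compare against the disk recorded in the MayastorPool resource; the
# pool should report state "online" and the same device path.
kubectl -n mayastor get msp pool-on-node-1 -o yaml
```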