embercsi / ember-csi-operator

Operator to create/configure/manage Ember CSI Driver atop Kubernetes/OpenShift
Apache License 2.0

StorageBackend vSphere Failed as Operator in OKD 4.5 #110

Open creativie opened 4 years ago

creativie commented 4 years ago

Environment: OKD 4.5 beta (Fedora CoreOS), Ember-CSI operator.

When I add a new storage backend with the vSphere driver, all pods start failing:

Error: container create failed: time="2020-06-24T06:40:26Z" level=warning msg="exit status 1" time="2020-06-24T06:40:26Z" level=error msg="container_linux.go:349: starting container process caused \"process_linux.go:449: container init caused \\\"rootfs_linux.go:58: mounting \\\\\\\"/etc/localtime\\\\\\\" to rootfs \\\\\\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged\\\\\\\" at \\\\\\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged/usr/share/zoneinfo/UTC\\\\\\\" caused \\\\\\\"not a directory\\\\\\\"\\\"\"" container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"/etc/localtime\\\" to rootfs \\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged\\\" at \\\"/var/lib/containers/storage/overlay/c87616da1d0f51f436eacf9e97bc4622c0285aad28edbcc08a1ec7283d7f930c/merged/usr/share/zoneinfo/UTC\\\" caused \\\"not a directory\\\"\""

After I deleted the '/etc/localtime' volume from the StatefulSet and DaemonSet, the pods started successfully.
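For anyone applying the same workaround to an exported manifest, here is a minimal Python sketch of the edit. It assumes the pod template has already been loaded into a plain dict (for example from the StatefulSet's or DaemonSet's `spec.template`). The function name `strip_localtime` is hypothetical and not part of the operator. Volumes are matched by their `hostPath` path rather than by name, since the names the operator generates are not shown in this issue.

```python
import copy

LOCALTIME_PATH = "/etc/localtime"


def strip_localtime(pod_template):
    """Return a copy of a pod template dict without the /etc/localtime
    hostPath volume and without any volumeMounts that reference it."""
    tpl = copy.deepcopy(pod_template)  # leave the caller's dict untouched
    spec = tpl.get("spec")
    if not spec:
        return tpl

    volumes = spec.get("volumes", [])
    # Names of the volumes backed by the host's /etc/localtime.
    doomed = {
        v["name"]
        for v in volumes
        if (v.get("hostPath") or {}).get("path") == LOCALTIME_PATH
    }
    spec["volumes"] = [v for v in volumes if v["name"] not in doomed]

    # Drop the matching mounts from every container in the pod.
    for key in ("initContainers", "containers"):
        for container in spec.get(key, []):
            if "volumeMounts" in container:
                container["volumeMounts"] = [
                    m for m in container["volumeMounts"]
                    if m["name"] not in doomed
                ]
    return tpl
```

The result would still have to be applied back to the cluster, and the operator may recreate the mount when it reconciles the objects, so this only illustrates the manual workaround.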

creativie commented 4 years ago

Log from Node

2020-06-25 09:54:51 vsphere INFO ember_csi.common [req-38afb5f9-64ff-4dc0-b7a1-c8a4ef25f46f] => GRPC NodeGetCapabilities
2020-06-25 09:54:51 vsphere INFO ember_csi.common [req-38afb5f9-64ff-4dc0-b7a1-c8a4ef25f46f] <= GRPC NodeGetCapabilities served in 0s
2020-06-25 09:54:51 vsphere INFO ember_csi.common [req-37df5339-fcc3-4909-ae42-894697c16397] => GRPC NodeStageVolume 9fa27601-931c-4573-af8d-d4dea0cb95b6
2020-06-25 09:55:24 vsphere ERROR ember_csi.common [req-37df5339-fcc3-4909-ae42-894697c16397] !! GRPC NodeStageVolume failed in 33s with Unexpected exception ()
    Traceback (most recent call last):
      File "/ember-csi/ember_csi/common.py", line 129, in dolog
        result = f(self, request, context)
      File "/ember-csi/ember_csi/common.py", line 224, in checker
        return f(self, request, context)
      File "/ember-csi/ember_csi/common.py", line 76, in wrapper
        return func(self, request, context)
      File "/ember-csi/ember_csi/base.py", line 1024, in NodeStageVolume
        self._format_device(vol, fs_type, private_bind, context)
      File "/ember-csi/ember_csi/base.py", line 854, in _format_device
        errors=[1, 32], delay=2)
      File "/ember-csi/ember_csi/base.py", line 274, in sudo
        root_helper=self.root_helper)
      File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 424, in execute
        cmd=sanitized_cmd)
    ProcessExecutionError: Unexpected error while running command.
    Command: lsblk -nlfoFSTYPE /var/lib/ember-csi/vols/9fa27601-931c-4573-af8d-d4dea0cb95b6
    Exit code: 1
    Stdout: u''
    Stderr: u'lsblk: /var/lib/ember-csi/vols/9fa27601-931c-4573-af8d-d4dea0cb95b6: not a block device\n': ProcessExecutionError: Unexpected error while running command.
2020-06-25 09:55:24 vsphere ERROR grpc._server [req-37df5339-fcc3-4909-ae42-894697c16397] Exception calling application: Unexpected error while running command.
Command: lsblk -nlfoFSTYPE /var/lib/ember-csi/vols/9fa27601-931c-4573-af8d-d4dea0cb95b6
Exit code: 1
Stdout: u''
Stderr: u'lsblk: /var/lib/ember-csi/vols/9fa27601-931c-4573-af8d-d4dea0cb95b6: not a block device\n': ProcessExecutionError: Unexpected error while running command.
Command: lsblk -nlfoFSTYPE /var/lib/ember-csi/vols/9fa27601-931c-4573-af8d-d4dea0cb95b6
Exit code: 1
Stdout: u''
Stderr: u'lsblk: /var/lib/ember-csi/vols/9fa27601-931c-4573-af8d-d4dea0cb95b6: not a block device\n'

Akrog commented 4 years ago

These are two different bugs: one is an Operator bug (the /etc/localtime mount), and the other is a feature limitation of Ember-CSI, as described in Ember-CSI issue #168.
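In the node log above, the second failure surfaces when `lsblk` is run against a path under `/var/lib/ember-csi/vols/` that is not a block device. As a diagnostic aid only, a minimal Python sketch of checking a path before handing it to `lsblk` could look like this. The helper name `is_block_device` is hypothetical, and this is not how Ember-CSI itself handles the case.

```python
import os
import stat


def is_block_device(path):
    """Return True if `path` exists and is a block device.

    os.stat follows symlinks, so a symlink pointing at a block
    device is reported as a block device as well.
    """
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path, permission problem, etc.
        return False
    return stat.S_ISBLK(mode)
```

Running it on the node against the volume path from the log would be expected to return `False`, matching the `lsblk` error.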

The /etc/localtime error given by the system is misleading; the actual problem is that the host is missing the file:

$ ls /etc/localtime
ls: cannot access '/etc/localtime': No such file or directory

We'll work on fixing the /etc/localtime issue, since deleting the mount from the StatefulSet and DaemonSet is not a convenient workaround. The operator should offer a way to disable mounting the host file on systems that don't have it, even though the logs will then not have the right timestamps.
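To illustrate the kind of switch meant here, a rough Python sketch follows. Everything in it is an assumption made for illustration: the function name `localtime_volume`, the `mount_localtime` flag, and the volume name `localtime` do not come from the operator's code or API.

```python
LOCALTIME_PATH = "/etc/localtime"


def localtime_volume(mount_localtime=True):
    """Return (volumes, volume_mounts) for the optional host localtime mount.

    With mount_localtime=False both lists are empty, so the generated
    pod spec never references the host file.
    """
    if not mount_localtime:
        return [], []
    # Hypothetical volume name; the operator's real name may differ.
    volume = {"name": "localtime", "hostPath": {"path": LOCALTIME_PATH}}
    mount = {"name": "localtime", "mountPath": LOCALTIME_PATH, "readOnly": True}
    return [volume], [mount]
```

With the flag off, the two empty lists can be concatenated into the pod spec unconditionally, which keeps the rest of the manifest generation unchanged.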