stormshift / support

This repo should serve as a central source for reporting issues with stormshift
GNU General Public License v3.0

Unhealthy ODF in OCP4 #79

Closed. Javatar81 closed this issue 2 years ago

Javatar81 commented 2 years ago

noobaa-db-pg-0

Unable to attach or mount volumes: unmounted volumes=[db], unattached volumes=[noobaa-postgres-config-volume noobaa-postgres-initdb-sh-volume db kube-api-access-nbczv]: timed out waiting for the condition

MountVolume.MountDevice failed for volume "pvc-0096de38-19d3-46d4-91f5-c937ebe4e243" : rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0011-openshift-storage-0000000000000001-86730093-1d17-11ec-8524-0a580a830015 already exists
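For reading the Volume ID in the error above, here is a minimal sketch that splits it into its parts. It assumes the ID follows the layout I believe ceph-csi uses (4 hex digits of encoding version, 4 hex digits giving the cluster ID length, the cluster ID, a 16 hex digit pool ID, then a 36 character UUID). That layout and the function name `decode_volume_id` are assumptions, not something confirmed in this issue.

```python
import re

# Assumed layout (not confirmed by this issue):
#   <version:4 hex>-<cluster id length:4 hex>-<cluster id>-<pool id:16 hex>-<uuid:36 chars>
PREFIX_RE = re.compile(r"^(?P<version>[0-9a-f]{4})-(?P<cluster_len>[0-9a-f]{4})-(?P<rest>.+)$")


def decode_volume_id(volume_id):
    """Split a CSI volume ID into its assumed fields (hypothetical helper)."""
    match = PREFIX_RE.match(volume_id)
    if match is None:
        raise ValueError("volume ID does not start with two 4-digit hex fields")

    cluster_len = int(match.group("cluster_len"), 16)
    rest = match.group("rest")
    cluster_id = rest[:cluster_len]
    tail = rest[cluster_len:]

    # The remainder should be "-<16 hex pool id>-<36 char uuid>".
    if len(tail) != 1 + 16 + 1 + 36 or tail[0] != "-" or tail[17] != "-":
        raise ValueError("unexpected pool ID / UUID section")

    return {
        "version": int(match.group("version"), 16),
        "cluster_id": cluster_id,
        "pool_id": int(tail[1:17], 16),
        "uuid": tail[18:],
    }


if __name__ == "__main__":
    # Volume ID copied from the MountDevice error in this issue.
    vid = "0001-0011-openshift-storage-0000000000000001-86730093-1d17-11ec-8524-0a580a830015"
    print(decode_volume_id(vid))
```

Under that assumption, the length field `0011` (17) matches the 17 characters of `openshift-storage`, which makes it easier to tell which cluster and pool the stuck operation belongs to.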

rook-ceph-osd

Generated from kubelet on [compute-2.ocp4.stormshift.coe.muc.redhat.com](https://console-openshift-console.apps.ocp4.stormshift.coe.muc.redhat.com/k8s/cluster/nodes/compute-2.ocp4.stormshift.coe.muc.redhat.com): Unable to attach or mount volumes: unmounted volumes=[ocs-deviceset-2-data-0-96rfj], unattached volumes=[ocs-deviceset-2-data-0-96rfj rook-ceph-crash run-udev ocs-deviceset-2-data-0-96rfj-bridge kube-api-access-sbv7f rook-data rook-config-override rook-ceph-log]: timed out waiting for the condition

MapVolume.EvalHostSymlinks failed for volume "local-pv-23676bf0" : lstat /dev/disk/by-id/nvme-INTEL_SSDPEDMD020T4_CVFT546100392P0EGN: no such file or directory
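The `lstat ... no such file or directory` part means the device path behind the local PV is missing on the node. A rough sketch of how that could be checked, assuming the script runs directly on the affected node (for example from a debug shell); the helper names `missing_device_path` and `device_present` are made up for illustration:

```python
import os
import re

# Matches the "lstat <path>: no such file or directory" fragment of the kubelet event.
LSTAT_RE = re.compile(r"lstat (?P<path>/\S+): no such file or directory")


def missing_device_path(message):
    """Return the device path named in an lstat error, or None if there is none."""
    match = LSTAT_RE.search(message)
    return match.group("path") if match else None


def device_present(path):
    """True if the path exists on this host; lexists also counts dangling symlinks."""
    return os.path.lexists(path)


if __name__ == "__main__":
    # Event text copied from this issue.
    event = (
        'MapVolume.EvalHostSymlinks failed for volume "local-pv-23676bf0" : '
        "lstat /dev/disk/by-id/nvme-INTEL_SSDPEDMD020T4_CVFT546100392P0EGN: "
        "no such file or directory"
    )
    path = missing_device_path(event)
    print(path, "present" if device_present(path) else "missing")
```

If the path is reported missing on the node, the problem sits below Kubernetes (the disk is not visible to the host at all) rather than in the PV or PVC definitions.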

Javatar81 commented 2 years ago

Found this related Rook issue: https://github.com/rook/rook/issues/4896

DanielFroehlich commented 2 years ago

Looks like we lost the NVMe PCI passthrough during the update, probably due to the RHEV cluster version update. Re-attached the NVMe devices to the VMs using the RHEV console:

image

VMs are starting, let's see if this helps...

DanielFroehlich commented 2 years ago

image

LGTM!