✅ Trust bundle is well configured (oc get proxy -o yaml)
$ oc logs -n openshift-cluster-csi-drivers deploy/ovirt-csi-driver-controller -c csi-attacher | tail
Found 2 pods, using pod/ovirt-csi-driver-controller-7cd6cb7dbd-swr4r
I0215 10:12:08.759614 1 csi_handler.go:279] Detaching "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740"
I0215 10:12:08.800175 1 csi_handler.go:231] Error processing "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740": failed to detach: rpc error: code = Unknown desc = failed finding disk attachments: failed to get disk attachment by disk 10a6f4ab-59c9-464c-8ad7-8108880a974b for VM 2319536d-8e41-4f3a-b375-e6edefeb8316, error: Post "https://rhev.stormshift.coe.muc.redhat.com/ovirt-engine/sso/oauth/token": x509: certificate signed by unknown authority
I0215 10:15:41.139750 1 csi_handler.go:279] Detaching "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740"
I0215 10:15:41.206202 1 csi_handler.go:231] Error processing "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740": failed to detach: rpc error: code = Unknown desc = failed finding disk attachments: failed to get disk attachment by disk 10a6f4ab-59c9-464c-8ad7-8108880a974b for VM 2319536d-8e41-4f3a-b375-e6edefeb8316, error: Post "https://rhev.stormshift.coe.muc.redhat.com/ovirt-engine/sso/oauth/token": x509: certificate signed by unknown authority
I0215 10:20:41.209763 1 csi_handler.go:279] Detaching "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740"
I0215 10:20:41.294554 1 csi_handler.go:231] Error processing "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740": failed to detach: rpc error: code = Unknown desc = failed finding disk attachments: failed to get disk attachment by disk 10a6f4ab-59c9-464c-8ad7-8108880a974b for VM 2319536d-8e41-4f3a-b375-e6edefeb8316, error: Post "https://rhev.stormshift.coe.muc.redhat.com/ovirt-engine/sso/oauth/token": x509: certificate signed by unknown authority
I0215 10:22:08.761003 1 csi_handler.go:279] Detaching "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740"
I0215 10:22:08.895899 1 csi_handler.go:231] Error processing "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740": failed to detach: rpc error: code = Unknown desc = failed finding disk attachments: failed to get disk attachment by disk 10a6f4ab-59c9-464c-8ad7-8108880a974b for VM 2319536d-8e41-4f3a-b375-e6edefeb8316, error: Post "https://rhev.stormshift.coe.muc.redhat.com/ovirt-engine/sso/oauth/token": x509: certificate signed by unknown authority
I0215 10:25:41.295317 1 csi_handler.go:279] Detaching "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740"
I0215 10:25:41.364480 1 csi_handler.go:231] Error processing "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740": failed to detach: rpc error: code = Unknown desc = failed finding disk attachments: failed to get disk attachment by disk 10a6f4ab-59c9-464c-8ad7-8108880a974b for VM 2319536d-8e41-4f3a-b375-e6edefeb8316, error: Post "https://rhev.stormshift.coe.muc.redhat.com/ovirt-engine/sso/oauth/token": x509: certificate signed by unknown authority
$
The problem is the cloud provider settings:
$ oc get secret ovirt-credentials -n openshift-cluster-csi-drivers -o jsonpath="{.data.ovirt_ca_bundle}" | base64 -d | openssl x509 -noout -issuer -subject
issuer=C = US, O = stormshift.coe.muc.redhat.com, CN = rhev.stormshift.coe.muc.redhat.com.26039
subject=C = US, O = stormshift.coe.muc.redhat.com, CN = rhev.stormshift.coe.muc.redhat.com.26039
$
$ echo | openssl s_client -connect rhev.stormshift.coe.muc.redhat.com:443 2>/dev/null| openssl x509 -noout -subject -issuer
subject=O = Red Hat, OU = SolutionArchitectsDach, CN = *.stormshift.coe.muc.redhat.com
issuer=O = Red Hat, OU = prod, CN = Certificate Authority
$
I have to update the cloud-provider CA bundle:
$ BUNDLE=$(oc get cm user-ca-bundle -n openshift-config -o jsonpath="{.data.ca-bundle\.crt}" | base64 -w0 )
$ kubectl patch secret -n kube-system ovirt-credentials --type='json' -p="[{\"op\" : \"replace\" ,\"path\" : \"/data/ovirt_ca_bundle\" ,\"value\" : \"$BUNDLE\"}]"
secret/ovirt-credentials patched
$
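The patch step above can be wrapped in a small helper so that the JSON quoting does not have to be escaped by hand. This is only a sketch: `build_ca_patch` is a hypothetical name, and it assumes the same ConfigMap and Secret as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc/kubectl): read a PEM bundle on stdin
# and print a JSON patch that replaces /data/ovirt_ca_bundle.
build_ca_patch() {
    # Some base64 implementations wrap their output; strip the newlines
    # so the value is a single line, like "base64 -w0" produces.
    bundle=$(base64 | tr -d '\n')
    printf '[{"op":"replace","path":"/data/ovirt_ca_bundle","value":"%s"}]\n' "$bundle"
}

# Usage sketch, assuming the same objects as in the commands above:
#   PATCH=$(oc get cm user-ca-bundle -n openshift-config \
#             -o jsonpath='{.data.ca-bundle\.crt}' | build_ca_patch)
#   oc patch secret -n kube-system ovirt-credentials --type=json -p "$PATCH"
```

Base64 output contains only characters that are safe inside a JSON string, so no further escaping is needed here.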
$ oc get secret ovirt-credentials -n openshift-cluster-csi-drivers -o jsonpath="{.data.ovirt_ca_bundle}" | base64 -d | openssl x509 -noout -issuer -subject
issuer=C = US, ST = North Carolina, L = Raleigh, O = "Red Hat, Inc.", OU = Red Hat IT, CN = Red Hat IT Root CA, emailAddress = infosec@redhat.com
subject=C = US, ST = North Carolina, L = Raleigh, O = "Red Hat, Inc.", OU = Red Hat IT, CN = Red Hat IT Root CA, emailAddress = infosec@redhat.com
$
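Note that `openssl x509` reads only the first certificate from its input, so the check above shows only the first entry of the bundle. A minimal sketch for counting how many certificates a bundle holds follows; `count_certs` is a hypothetical helper name, and the commented usage assumes the same secret as above.

```shell
#!/bin/sh
# Hypothetical helper: count the PEM certificates in a bundle read on stdin.
count_certs() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps that from looking like a failure.
    grep -c -e '-----BEGIN CERTIFICATE-----' || true
}

# Usage sketch, assuming the same secret as in the check above:
#   oc get secret ovirt-credentials -n openshift-cluster-csi-drivers \
#      -o jsonpath='{.data.ovirt_ca_bundle}' | base64 -d | count_certs
#
# A common idiom to print subject and issuer of every certificate in a
# bundle file (bundle.pem is a placeholder name):
#   openssl crl2pkcs7 -nocrl -certfile bundle.pem | openssl pkcs7 -print_certs -noout
```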
$ oc delete pods -n openshift-cluster-csi-drivers -l app=ovirt-csi-driver-controller --wait=false
pod "ovirt-csi-driver-controller-7cd6cb7dbd-89g7x" deleted
pod "ovirt-csi-driver-controller-7cd6cb7dbd-8jqmb" deleted
$ oc logs -n openshift-cluster-csi-drivers deploy/ovirt-csi-driver-controller -c csi-attacher -f
Found 2 pods, using pod/ovirt-csi-driver-controller-7cd6cb7dbd-wzhzq
I0215 10:40:45.894832 1 main.go:99] Version: v4.9.0-202111151318.p0.g0a1737c.assembly.stream-0-gd002fb1-dirty
I0215 10:40:48.103344 1 common.go:111] Probing CSI driver for readiness
I0215 10:40:48.185931 1 main.go:155] CSI driver name: "csi.ovirt.org"
I0215 10:40:48.186623 1 main.go:181] ServeMux listening at "localhost:8203"
I0215 10:40:48.191908 1 main.go:230] CSI driver supports ControllerPublishUnpublish, using real CSI handler
I0215 10:40:48.197642 1 leaderelection.go:248] attempting to acquire leader lease openshift-cluster-csi-drivers/external-attacher-leader-csi-ovirt-org...
I0215 10:40:48.407360 1 leaderelection.go:258] successfully acquired lease openshift-cluster-csi-drivers/external-attacher-leader-csi-ovirt-org
I0215 10:40:48.408393 1 leader_election.go:205] became leader, starting
I0215 10:40:48.408515 1 controller.go:128] Starting CSI attacher
I0215 10:40:48.509847 1 csi_handler.go:279] Detaching "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740"
I0215 10:40:48.680278 1 csi_handler.go:587] Detached "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740"
I0215 10:40:48.843786 1 csi_handler.go:279] Detaching "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740"
I0215 10:40:48.932696 1 csi_handler.go:587] Detached "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740"
I0215 10:40:48.963317 1 csi_handler.go:286] Failed to save detach error to "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740": volumeattachments.storage.k8s.io "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740" not found
I0215 10:40:48.963470 1 csi_handler.go:231] Error processing "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740": failed to detach: could not mark as detached: volumeattachments.storage.k8s.io "csi-42f00fa7db867ea339464072e81d182e02b313e87561dd5fd80e492bbf0c3740" not found
I0215 10:40:53.957529 1 csi_handler.go:248] Attaching "csi-ce8c06362d90cfdc0d2a48930b7f4b8c90670ad5136b84baf7c1b0e355400159"
I0215 10:40:54.727871 1 csi_handler.go:261] Attached "csi-ce8c06362d90cfdc0d2a48930b7f4b8c90670ad5136b84baf7c1b0e355400159"
I0215 10:40:54.953165 1 csi_handler.go:248] Attaching "csi-ce8c06362d90cfdc0d2a48930b7f4b8c90670ad5136b84baf7c1b0e355400159"
I0215 10:40:55.060104 1 csi_handler.go:261] Attached "csi-ce8c06362d90cfdc0d2a48930b7f4b8c90670ad5136b84baf7c1b0e355400159"
^C
$
SOLVED
MountVolume.MountDevice failed for volume "pvc-ac0c836d-e908-4a9d-8e6e-745bf505a01f" : rpc error: code = Unknown desc = failed finding disk attachments, error: failed to get disk attachment by disk 10a6f4ab-59c9-464c-8ad7-8108880a974b for VM 5579854e-5f70-4148-9d4b-24fcf7b46197, error: Post "https://rhev.stormshift.coe.muc.redhat.com/ovirt-engine/sso/oauth/token": x509: certificate signed by unknown authority
Not solved after all; it's hard to find all the affected components.
$ oc delete pods -n openshift-cluster-csi-drivers -l app=ovirt-csi-driver-node --wait=false
pod "ovirt-csi-driver-node-997nn" deleted
pod "ovirt-csi-driver-node-cb27c" deleted
pod "ovirt-csi-driver-node-fjr8p" deleted
pod "ovirt-csi-driver-node-hz64p" deleted
pod "ovirt-csi-driver-node-n4gsl" deleted
My pods start again. I'll leave the ticket open until I have found all components and restarted all pods :-( Maybe restarting the entire cluster is the easiest way.
Source:
https://access.redhat.com/solutions/5416491
Check secrets in namespaces:
# List of namespaces
$ oc -n openshift-cloud-credential-operator get credentialsrequest -o json | jq -c '.items[] | select(.status.provisioned) | .spec.secretRef'
{"name":"ovirt-credentials","namespace":"openshift-machine-api"}
{"name":"ovirt-credentials","namespace":"openshift-cluster-csi-drivers"}
$ oc get secret ovirt-credentials -n openshift-cluster-csi-drivers -o jsonpath="{.data.ovirt_ca_bundle}" | base64 -d | openssl x509 -noout -issuer -subject
issuer=C = US, ST = North Carolina, L = Raleigh, O = "Red Hat, Inc.", OU = Red Hat IT, CN = Red Hat IT Root CA, emailAddress = infosec@redhat.com
subject=C = US, ST = North Carolina, L = Raleigh, O = "Red Hat, Inc.", OU = Red Hat IT, CN = Red Hat IT Root CA, emailAddress = infosec@redhat.com
$ oc get secret ovirt-credentials -n openshift-machine-api -o jsonpath="{.data.ovirt_ca_bundle}" | base64 -d | openssl x509 -noout -issuer -subject
issuer=C = US, ST = North Carolina, L = Raleigh, O = "Red Hat, Inc.", OU = Red Hat IT, CN = Red Hat IT Root CA, emailAddress = infosec@redhat.com
subject=C = US, ST = North Carolina, L = Raleigh, O = "Red Hat, Inc.", OU = Red Hat IT, CN = Red Hat IT Root CA, emailAddress = infosec@redhat.com
$
$ oc delete pods -n openshift-machine-api --all --wait=false
pod "cluster-autoscaler-operator-584c78fbd8-lbgfl" deleted
pod "cluster-baremetal-operator-5f4ccb4899-hqpvj" deleted
pod "machine-api-controllers-7c8cc5b994-x4w4m" deleted
pod "machine-api-operator-648f59c644-x9qfj" deleted
$ oc delete pods -n openshift-cluster-csi-drivers --all --wait=false ...
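The per-namespace restarts above could be generated from the credentialsrequest list instead of being typed by hand. The following is a sketch only: `restart_cmds` is a hypothetical helper that prints the commands rather than running them, so they can be reviewed first, and the commented usage assumes `jq` is installed.

```shell
#!/bin/sh
# Hypothetical helper: read one namespace per line on stdin and print
# (not execute) the pod restart command for each of them.
restart_cmds() {
    while IFS= read -r ns; do
        # skip empty lines
        [ -n "$ns" ] || continue
        printf 'oc delete pods -n %s --all --wait=false\n' "$ns"
    done
}

# Usage sketch, assuming jq is available:
#   oc -n openshift-cloud-credential-operator get credentialsrequest -o json \
#     | jq -r '.items[] | select(.status.provisioned) | .spec.secretRef.namespace' \
#     | sort -u | restart_cmds
# Review the printed commands, then pipe them to "sh" to execute.
```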
Force kubecontrollermanager rollout:
$ oc patch kubecontrollermanager cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date )"'"}}' --type=merge
kubecontrollermanager.operator.openshift.io/cluster patched
$
LGTM
After the certificate exchange on our RHEV infrastructure (Ticket ID #56), the CSI driver no longer works.