Closed pratham-m closed 9 months ago
$ cat install-config.yaml
apiVersion: v1
baseDomain: psi.redhat.com
compute:
- architecture: ppc64le
  hyperthreading: Enabled
  name: worker
  replicas: 3
controlPlane:
  architecture: ppc64le
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  creationTimestamp: null
  name: ppc64le-qe53c
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.128.0/24
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  libvirt:
    network:
      if: tt2
pullSecret: .........
sshKey: ............
$ oc --namespace=openshift-machine-api get deployments
NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
cluster-autoscaler-operator          1/1     1            1           113d
cluster-baremetal-operator           1/1     1            1           113d
control-plane-machine-set-operator   1/1     1            1           113d
machine-api-controllers              1/1     1            1           113d
machine-api-operator                 1/1     1            1           113d
$ oc --namespace=openshift-machine-api logs deployments/machine-api-controllers --container=machine-controller
...
...
I0321 06:14:03.212356 1 controller.go:187] ppc64le-qe6a-mr77v-master-2: reconciling Machine
I0321 06:14:03.212429 1 actuator.go:224] Checking if machine ppc64le-qe6a-mr77v-master-2 exists.
I0321 06:14:03.214188 1 client.go:142] Created libvirt connection: 0xc000640e68
I0321 06:14:03.214738 1 client.go:317] Check if "ppc64le-qe6a-mr77v-master-2" domain exists
I0321 06:14:03.215200 1 client.go:158] Freeing the client pool
I0321 06:14:03.215357 1 client.go:164] Closing libvirt connection: 0xc000640e68
I0321 06:14:03.215807 1 controller.go:313] ppc64le-qe6a-mr77v-master-2: reconciling machine triggers idempotent update
I0321 06:14:03.215858 1 actuator.go:189] Updating machine ppc64le-qe6a-mr77v-master-2
I0321 06:14:03.218036 1 client.go:142] Created libvirt connection: 0xc000641158
I0321 06:14:03.218356 1 client.go:302] Lookup domain by name: "ppc64le-qe6a-mr77v-master-2"
I0321 06:14:03.218688 1 actuator.go:364] Updating status for ppc64le-qe6a-mr77v-master-2
I0321 06:14:03.220932 1 client.go:158] Freeing the client pool
I0321 06:14:03.221011 1 client.go:164] Closing libvirt connection: 0xc000641158
I0321 06:14:03.229977 1 controller.go:187] ppc64le-qe6a-mr77v-worker-0-65rjs: reconciling Machine
I0321 06:14:03.230057 1 actuator.go:224] Checking if machine ppc64le-qe6a-mr77v-worker-0-65rjs exists.
I0321 06:14:03.232083 1 client.go:142] Created libvirt connection: 0xc000818e18
I0321 06:14:03.232471 1 client.go:317] Check if "ppc64le-qe6a-mr77v-worker-0-65rjs" domain exists
I0321 06:14:03.232807 1 client.go:158] Freeing the client pool
I0321 06:14:03.232853 1 client.go:164] Closing libvirt connection: 0xc000818e18
I0321 06:14:03.233232 1 controller.go:313] ppc64le-qe6a-mr77v-worker-0-65rjs: reconciling machine triggers idempotent update
I0321 06:14:03.233271 1 actuator.go:189] Updating machine ppc64le-qe6a-mr77v-worker-0-65rjs
I0321 06:14:03.234835 1 client.go:142] Created libvirt connection: 0xc0008190d8
I0321 06:14:03.235255 1 client.go:302] Lookup domain by name: "ppc64le-qe6a-mr77v-worker-0-65rjs"
I0321 06:14:03.235568 1 actuator.go:364] Updating status for ppc64le-qe6a-mr77v-worker-0-65rjs
I0321 06:14:03.237846 1 client.go:158] Freeing the client pool
I0321 06:14:03.237968 1 client.go:164] Closing libvirt connection: 0xc0008190d8
I0321 06:14:03.248101 1 controller.go:187] ppc64le-qe6a-mr77v-worker-0-h7b2s: reconciling Machine
I0321 06:14:03.248126 1 actuator.go:224] Checking if machine ppc64le-qe6a-mr77v-worker-0-h7b2s exists.
I0321 06:14:03.250372 1 client.go:142] Created libvirt connection: 0xc000c54988
I0321 06:14:03.250726 1 client.go:317] Check if "ppc64le-qe6a-mr77v-worker-0-h7b2s" domain exists
I0321 06:14:03.251060 1 client.go:158] Freeing the client pool
I0321 06:14:03.251087 1 client.go:164] Closing libvirt connection: 0xc000c54988
I0321 06:14:03.251454 1 controller.go:313] ppc64le-qe6a-mr77v-worker-0-h7b2s: reconciling machine triggers idempotent update
I0321 06:14:03.251466 1 actuator.go:189] Updating machine ppc64le-qe6a-mr77v-worker-0-h7b2s
I0321 06:14:03.253286 1 client.go:142] Created libvirt connection: 0xc000c54c48
I0321 06:14:03.253599 1 client.go:302] Lookup domain by name: "ppc64le-qe6a-mr77v-worker-0-h7b2s"
I0321 06:14:03.253933 1 actuator.go:364] Updating status for ppc64le-qe6a-mr77v-worker-0-h7b2s
I0321 06:14:03.256177 1 client.go:158] Freeing the client pool
I0321 06:14:03.256196 1 client.go:164] Closing libvirt connection: 0xc000c54c48
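The excerpt above repeats the same reconcile cycle for each machine, which makes it hard to see at a glance which machines the controller is actually visiting. A minimal sketch for summarizing such output, assuming the log line layout shown above; the function name `summarize_reconciles` is hypothetical and not part of `oc` or the machine-api tooling:

```shell
# Hypothetical helper: reads machine-controller log text on stdin and prints
# "<count> <machine>" for every machine that has a "reconciling Machine" line.
summarize_reconciles() {
  awk '/: reconciling Machine$/ {
         name = $(NF - 2)       # field just before "reconciling Machine"
         sub(/:$/, "", name)    # drop the trailing colon
         count[name]++
       }
       END { for (name in count) print count[name], name }' | sort -k2
}

# Example use against a live cluster (same command as in the report):
#   oc --namespace=openshift-machine-api logs deployments/machine-api-controllers \
#     --container=machine-controller | summarize_reconciles
```

A machine that is expected but never shows up in the summary is not being reconciled at all, which may help narrow down where worker creation stalls.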
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
Since this happens on ppc64le, this is most likely https://issues.redhat.com/browse/OCPBUGS-17476, which is caused by a regression in SLOF. While this is being fixed, we can add a workaround in cluster-api-provider-libvirt.
We have also seen this on s390x before. We are stuck using a specific version of libvirt, libvirt-6.0.0-37.module+el8.5.0+12162+40884dd2, since anything later didn't seem to work for libvirt IPI installation.
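To check whether a host is on a libvirt newer than the 6.0.0 release mentioned in this thread, a small comparison can help. This is only a sketch: the helper name `libvirt_newer_than_known_good` and the `KNOWN_GOOD` variable are made up for illustration, it assumes a `sort` that supports `-V` (as in GNU coreutils), and it compares only the upstream version, not the RPM release suffix.

```shell
# Upstream libvirt version reported as working in this thread.
KNOWN_GOOD="6.0.0"

# Hypothetical helper: succeeds when the version passed as $1 is strictly
# newer than KNOWN_GOOD, using version-aware sorting.
libvirt_newer_than_known_good() {
  [ "$1" != "$KNOWN_GOOD" ] &&
    [ "$(printf '%s\n' "$1" "$KNOWN_GOOD" | sort -V | tail -n 1)" = "$1" ]
}

# Example use on a host with virsh installed:
#   if libvirt_newer_than_known_good "$(virsh --version)"; then
#     echo "libvirt is newer than $KNOWN_GOOD; the failure discussed here may apply"
#   fi
```

The check says nothing about whether a given newer version is actually affected; it only flags hosts that differ from the setup reported as working.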
/remove-lifecycle stale
Version
Platform: Libvirt IPI
What happened?
$ openshift-install create cluster --dir=$CLUSTER_DIR --log-level=debug
fails while waiting for the worker nodes to come up.
What you expected to happen?
OCP cluster creation should succeed and all worker nodes should come up. Expected output is as below:
How to reproduce it
Clone the repository and install pre-requisites as per https://github.com/openshift/installer/tree/master/docs/dev/libvirt#libvirt-howto
Anything else we need to know?
The issue is not specific to any OCP version and is reproducible on 4.12.x, 4.11.x, etc. The same steps work fine on RHEL 8.5 with virsh 6.0.0.
References
The issues below might not be related, but they seem to have a few similarities.