ocp-power-automation / ocp4-upi-kvm

OCP4 on KVM/Power
Apache License 2.0

Cannot run terraform apply again after successful deployment of OCP on KVM #76

Closed gitsridhar closed 3 years ago

gitsridhar commented 3 years ago

I am working on support for adding a node to OCP deployed on KVM (using the wrapper code from the ocs-upi-kvm project). After successfully deploying OCP on KVM, I would like to increase the worker node count, add a trigger for it, and so on. As a first step, I tried to re-run terraform apply, and it fails with the errors below:

terraform apply -var-file var.tfvars -var-file ~/site.tfvars -auto-approve -parallelism=3

Error: Error retrieving volume for disk: virError(Code=50, Domain=18, Message='Storage volume not found: no storage vol with matching path '/var/lib/libvirt/images/test-ocp4-6/disk-worker0.data-vdc'')

Error: Error retrieving volume for disk: virError(Code=50, Domain=18, Message='Storage volume not found: no storage vol with matching path '/var/lib/libvirt/images/test-ocp4-6/disk-worker1.data-vdc'')

Error: Error retrieving volume for disk: virError(Code=50, Domain=18, Message='Storage volume not found: no storage vol with matching path '/var/lib/libvirt/images/test-ocp4-6/disk-worker2.data-vdc'')

However, the image files are present on disk:

[test@nx123-ahv ~]$ sudo ls -la /var/lib/libvirt/images/test-ocp4-6
total 41172828
drwx--x--x. 2 root root 4096 Dec 18 14:51 .
drwx--x--x. 3 root root 4096 Dec 18 14:14 ..
-rw-r--r--. 1 qemu qemu 274877906944 Dec 18 14:51 disk-worker0.data-vdc
-rw-r--r--. 1 qemu qemu 274877906944 Dec 18 14:51 disk-worker1.data-vdc
-rw-r--r--. 1 qemu qemu 274877906944 Dec 18 14:51 disk-worker2.data-vdc
-rw-r--r--. 1 qemu qemu 3140288512 Dec 18 15:14 test-ocp4-6-bastion-vol
-rw-r--r--. 1 qemu qemu 1738604544 Dec 18 15:17 test-ocp4-6-bootstrap
-rw-r--r--. 1 qemu qemu 441 Dec 18 14:23 test-ocp4-6-bootstrap.ign
-rw-r--r--. 1 qemu qemu 6620053504 Dec 18 15:17 test-ocp4-6-master-0
-rw-r--r--. 1 qemu qemu 434 Dec 18 14:23 test-ocp4-6-master-0.ign
-rw-r--r--. 1 qemu qemu 6630080512 Dec 18 15:17 test-ocp4-6-master-1
-rw-r--r--. 1 qemu qemu 434 Dec 18 14:23 test-ocp4-6-master-1.ign
-rw-r--r--. 1 qemu qemu 6640435200 Dec 18 15:17 test-ocp4-6-master-2
-rw-r--r--. 1 qemu qemu 434 Dec 18 14:23 test-ocp4-6-master-2.ign
-rw-r--r--. 1 qemu qemu 2426012288 Dec 18 14:23 test-ocp4-6-rhcos-base-vol
-rw-r--r--. 1 qemu qemu 5722865664 Dec 18 15:17 test-ocp4-6-worker-0
-rw-r--r--. 1 qemu qemu 434 Dec 18 14:23 test-ocp4-6-worker-0.ign
-rw-r--r--. 1 qemu qemu 4729274368 Dec 18 15:17 test-ocp4-6-worker-1
-rw-r--r--. 1 qemu qemu 434 Dec 18 14:23 test-ocp4-6-worker-1.ign
-rw-r--r--. 1 qemu qemu 4493213696 Dec 18 15:17 test-ocp4-6-worker-2
-rw-r--r--. 1 qemu qemu 434 Dec 18 14:23 test-ocp4-6-worker-2.ign
[test@nx123-ahv ~]$
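For anyone triaging a similar report, here is a minimal sketch of cross-checking the paths named in the libvirt errors against the files actually on disk. The helper names `missing_volume_paths` and `split_by_presence` are hypothetical and not part of this repository or of libvirt; the regular expression assumes the error text has the same shape as the messages quoted above.

```python
import re

# Assumption: the error text follows the format quoted in this issue,
# i.e. "... no storage vol with matching path '<path>'".
_PATH_RE = re.compile(r"no storage vol with matching path '([^']+)'")


def missing_volume_paths(error_text):
    """Return every volume path named in 'Storage volume not found' errors."""
    return _PATH_RE.findall(error_text)


def split_by_presence(paths, files_on_disk):
    """Split paths into (present on disk, absent from disk)."""
    on_disk = set(files_on_disk)
    present = [p for p in paths if p in on_disk]
    absent = [p for p in paths if p not in on_disk]
    return present, absent


if __name__ == "__main__":
    image_dir = "/var/lib/libvirt/images/test-ocp4-6"
    errors = (
        "Error: Error retrieving volume for disk: virError(Code=50, Domain=18, "
        "Message='Storage volume not found: no storage vol with matching path "
        "'/var/lib/libvirt/images/test-ocp4-6/disk-worker0.data-vdc'')"
    )
    # File names taken from the directory listing in this report.
    listing = [image_dir + "/" + name for name in (
        "disk-worker0.data-vdc",
        "disk-worker1.data-vdc",
        "disk-worker2.data-vdc",
    )]
    present, absent = split_by_presence(missing_volume_paths(errors), listing)
    print("reported missing but present on disk:", present)
    print("reported missing and absent on disk:", absent)
```

A path that lands in the first list is one libvirt reports as missing even though the file exists, which is the situation described in this report.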

Is there a configuration step missing?

gitsridhar commented 3 years ago

I tried again with an increased number of workers in the variables file. A new VM was created, but it fails to boot. The virsh console shows:

[ 490.593053] ignition[770]: GET error: Get "http://192.168.88.2:8080/ignition/worker.ign": dial tcp 192.168.88.2:8080: connect: network is unreachable
[ ***] A start job is running for Ignition (fetch) (8min 12s / no limit)
[ 495.595159] ignition[770]: GET http://192.168.88.2:8080/ignition/worker.ign: attempt #102

The VM then dropped into the emergency shell:

Generating "/run/initramfs/rdsosreport.txt"

Entering emergency mode. Exit the shell to continue. Type "journalctl" to view system logs. You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot after mounting them and attach it to a bug report.

:/#

From the KVM host, I can fetch the same file successfully with wget:

wget http://192.168.88.2:8080/ignition/worker.ign

yussufsh commented 3 years ago

Closing as stale issue. Please re-open if you can still reproduce.