Attaching a Storage Domain to a Data Center fails to start SPM.
Version-Release number of selected component (if applicable):
4.4.10
How reproducible:
This is generally a Disaster Recovery scenario, but it is easier to reproduce on a single environment with 1 VDSM host.
Steps to Reproduce:
Configure a Data Center & Storage Domain.
I also had 2 non-running VMs and a few disks (some attached and some not), but IMHO it's irrelevant.
Put the Storage Domain in maintenance mode.
Detach the Storage Domain.
Attach the Storage Domain (the one detached in the previous step) back to the same Data Center.
Note: as mentioned, this is a DR scenario, and when moving to another environment the issue is less likely to occur. Having 2 VDSM hosts running in the same environment also reduces the chance of hitting the bug, because after the SD attach the SPM somehow moves from one host to another and there is no race. So to reproduce this bug it is better to work in the same environment, with 1 VDSM host.
Actual results:
Data Center & Storage Domain are down.
They are shown as up-and-running for a few seconds, but then turn red, and the Storage Domain is locked.
Engine & VDSM logs show errors.
There is a task on the VDSM (under /rhev/data-center///master/tasks/) that is not removed:
[root@vdsm1 ~]# cd /rhev/data-center/4dc0a377-4dd3-494c-ad18-7aa2008c43b1/ecac38cc-bd4b-47b1-be42-702adc810dd3/master/tasks/
[root@vdsm1 tasks]# ll
total 4
drwxr-xr-x. 2 vdsm kvm 4096 Mar 21 11:38 b813ea20-1886-439b-8f85-bfb41256ba3b
[root@vdsm1 tasks]# sudo tar -czvf b813ea20-1886-439b-8f85-bfb41256ba3b.tar.gz b813ea20-1886-439b-8f85-bfb41256ba3b
b813ea20-1886-439b-8f85-bfb41256ba3b/
b813ea20-1886-439b-8f85-bfb41256ba3b/b813ea20-1886-439b-8f85-bfb41256ba3b.recover.0
b813ea20-1886-439b-8f85-bfb41256ba3b/b813ea20-1886-439b-8f85-bfb41256ba3b.job.0
b813ea20-1886-439b-8f85-bfb41256ba3b/b813ea20-1886-439b-8f85-bfb41256ba3b.task
b813ea20-1886-439b-8f85-bfb41256ba3b/b813ea20-1886-439b-8f85-bfb41256ba3b.result
b813ea20-1886-439b-8f85-bfb41256ba3b/b813ea20-1886-439b-8f85-bfb41256ba3b.recover.1
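To check for leftover task directories without walking the tree by hand, something like the following could be used. This is only a minimal sketch: the function name `list_leftover_tasks` is hypothetical and not part of VDSM, and it assumes nothing beyond the layout visible in the listing above (one subdirectory per task under master/tasks/). The path to the tasks directory must be passed in by the caller.

```python
import sys
from pathlib import Path


def list_leftover_tasks(tasks_dir):
    """Return [(task_id, [file names])] for every subdirectory of tasks_dir.

    Hypothetical helper, not a VDSM API. It only reads the directory
    layout shown in this report: one directory per persisted task.
    """
    root = Path(tasks_dir)
    if not root.is_dir():
        # Nothing mounted or wrong path: report no tasks rather than fail.
        return []
    leftovers = []
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            files = sorted(p.name for p in entry.iterdir())
            leftovers.append((entry.name, files))
    return leftovers


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 list_tasks.py <path to master/tasks directory>
    for task_id, files in list_leftover_tasks(sys.argv[1]):
        print(task_id)
        for name in files:
            print("  " + name)
```

On the host from this report, an empty result after the attach would be the expected state; any task directory still listed is a candidate for the stale task described here.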
Expected results:
Data Center & Storage Domain should be green, up-and-running.
Additional info: Logs are attached to the original bz.
Original bz: https://bugzilla.redhat.com/2067173