ScaleComputing / HyperCoreAnsibleCollection

Official Ansible collection for Scale Computing SC//HyperCore (HC3) v1 API
GNU General Public License v3.0

vm module fails to remove disk #249

Closed: justinc1 closed this 1 year ago

justinc1 commented 1 year ago

Describe the bug

In CI run https://github.com/ScaleComputing/HyperCoreAnsibleCollection/actions/runs/5383096735/jobs/9769417342#step:9:49 the vm module failed to remove a disk from an existing VM.

VM demo-vm was running, and had one extra disk:

https://10.5.11.201/rest/v1/VirDomain/8c6196be-ddb5-4357-9783-50869dc60969
[{"uuid":"8c6196be-ddb5-4357-9783-50869dc60969","nodeUUID":"3dcb0c96-f013-4ccc-b639-33605ea78c44","name":"demo-vm","description":"demo-vm","operatingSystem":"os_other","state":"RUNNING","desiredDisposition":"RUNNING","console":{"type":"VNC","ip":"10.5.11.201","port":5902,"keymap":"en-us"},"mem":1073741824,"numVCPU":2,"blockDevs":[{"uuid":"98d5727b-fef7-4c96-8634-6abd6d5ff6b7","virDomainUUID":"8c6196be-ddb5-4357-9783-50869dc60969","type":"VIRTIO_DISK","cacheMode":"WRITETHROUGH","capacity":10737418240,"allocation":0,"physical":0,"shareUUID":"","path":"scribe/98d5727b-fef7-4c96-8634-6abd6d5ff6b7","slot":0,"name":"","disableSnapshotting":false,"tieringPriorityFactor":0,"mountPoints":[],"createdTimestamp":1681294172,"readOnly":false},{"uuid":"3aad4f9d-71d0-4be7-bd1d-d0ad593d4789","virDomainUUID":"8c6196be-ddb5-4357-9783-50869dc60969","type":"VIRTIO_DISK","cacheMode":"NONE","capacity":645922816,"allocation":645922816,"physical":0,"shareUUID":"","path":"scribe/3aad4f9d-71d0-4be7-bd1d-d0ad593d4789","slot":1,"name":"","disableSnapshotting":false,"tieringPriorityFactor":0,"mountPoints":[],"createdTimestamp":1685248572,"readOnly":false}],"netDevs":[{"uuid":"7b111f0c-77a1-4840-b425-63046e63f995","virDomainUUID":"8c6196be-ddb5-4357-9783-50869dc60969","type":"VIRTIO","macAddress":"7C:4C:58:6B:36:32","vlan":10,"connected":true,"ipv4Addresses":[]}],"stats":[],"created":0,"modified":0,"latestTaskTag":{"taskTag":"36199","progressPercent":0,"state":"ERROR","formattedDescription":"Delete block device %@ for Virtual Machine %@","descriptionParameters":["3aad4f9d","demo-vm"],"formattedMessage":"Unable to delete block device from VM '%@': Still in use","messageParameters":["demo-vm"],"objectUUID":"8c6196be-ddb5-4357-9783-50869dc60969","created":1687925214,"modified":1687925276,"completed":1687925276,"sessionID":"d4fa7269-caa0-4a0a-b5d8-85d8601e93c4","nodeUUIDs":["3dcb0c96-f013-4ccc-b639-33605ea78c44"]},"tags":"Xlab","bootDevices":["98d5727b-fef7-4c96-8634-6abd6d5ff6b7"],"uiState":"RUNNING","snapUUIDs":[],"snapshotSerialNumber":0,"replicationUUIDs":[],"sourceVirDomainUUID":"","snapshotListSerialNumber":0,"snapshotScheduleUUID":"","machineType":"scale-7.2","cpuType":"clusterBaseline-7.3","snapshotAllocationBlocks":0,"guestAgentState":"UNAVAILABLE","lastSeenRunningOnNodeUUID":"3dcb0c96-f013-4ccc-b639-33605ea78c44","isTransient":false,"affinityStrategy":{"strictAffinity":false,"preferredNodeUUID":"3dcb0c96-f013-4ccc-b639-33605ea78c44","backupNodeUUID":""},"vsdUUIDsToDelete":[],"cloudInitData":{"userData":"","metaData":""}}]

https://10.5.11.201/rest/v1/TaskTag/36199
[{"taskTag":"36199","progressPercent":0,"state":"ERROR","formattedDescription":"Delete block device %@ for Virtual Machine %@","descriptionParameters":["3aad4f9d","demo-vm"],"formattedMessage":"Unable to delete block device from VM '%@': Still in use","messageParameters":["demo-vm"],"objectUUID":"8c6196be-ddb5-4357-9783-50869dc60969","created":1687925214,"modified":1687925276,"completed":1687925276,"sessionID":"d4fa7269-caa0-4a0a-b5d8-85d8601e93c4","nodeUUIDs":["3dcb0c96-f013-4ccc-b639-33605ea78c44"]}]

To Reproduce

Steps to reproduce the behavior:

  1. Run prepare-examples.yml to create a new demo-vm
  2. Add an extra disk to demo-vm (see the API sketch after this list)
  3. Start demo-vm
  4. Run prepare-examples.yml again
  5. See error
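
For step 2, the extra disk can be added either with the vm_disk module or directly against the v1 API. A minimal sketch of the latter, with host, credentials, and capacity as placeholder assumptions:

```python
import requests

HOST = "https://10.5.11.201"          # test cluster from this report
AUTH = ("admin", "password")          # placeholder credentials
VM_UUID = "8c6196be-ddb5-4357-9783-50869dc60969"  # demo-vm

# Create one extra VIRTIO disk on demo-vm; capacity is in bytes and the
# payload shape follows the v1 API as used by the collection's client code
resp = requests.post(
    f"{HOST}/rest/v1/VirDomainBlockDevice",
    auth=AUTH,
    verify=False,  # test cluster has a self-signed certificate
    json={"virDomainUUID": VM_UUID, "type": "VIRTIO_DISK", "capacity": 1 * 1024**3},
)
print(resp.json())  # taskTag of the create task plus the new disk's createdUUID
```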

Expected behavior

The vm module should shut down demo-vm and remove the extra disk.

Also, the error message should include details from the failed task: taskTag, formattedDescription, formattedMessage, etc.
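
As an illustration, a hypothetical helper (not part of the collection) that expands the %@ placeholders in the TaskTag payload could produce a much more useful failure message:

```python
def format_task_error(task_status: dict) -> str:
    """Build a readable error message from a HyperCore TaskTag payload.

    `task_status` is one element of a /rest/v1/TaskTag/{tag} response.
    This helper is hypothetical, not part of the collection.
    """
    def expand(template: str, params: list) -> str:
        # HyperCore uses %@ placeholders filled from the parameter list
        for param in params:
            template = template.replace("%@", str(param), 1)
        return template

    description = expand(task_status.get("formattedDescription", ""),
                         task_status.get("descriptionParameters", []))
    message = expand(task_status.get("formattedMessage", ""),
                     task_status.get("messageParameters", []))
    return f"Task {task_status.get('taskTag')} failed: {description}: {message}"

# For TaskTag 36199 above this yields:
# Task 36199 failed: Delete block device 3aad4f9d for Virtual Machine demo-vm:
# Unable to delete block device from VM 'demo-vm': Still in use
```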


justinc1 commented 1 year ago

The problematic part of the code is in Disk.needs_reboot, the condition action == "delete" and self.type == "ide_cdrom" (https://github.com/ScaleComputing/HyperCoreAnsibleCollection/blob/main/plugins/module_utils/disk.py#L202). It triggers a VM shutdown before disk removal only when the disk type is ide_cdrom.
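
A paraphrase of that check, plus a sketch of a broader condition; method and attribute names follow the issue, and the actual code in disk.py differs in detail:

```python
# Paraphrase of the suspect logic in Disk.needs_reboot (disk.py#L202).
# Not the literal source - a sketch to show why VIRTIO deletes slip through.
class Disk:
    def __init__(self, type: str):
        self.type = type  # e.g. "virtio_disk" or "ide_cdrom"

    def needs_reboot(self, action: str) -> bool:
        # Current behavior: only an IDE CD-ROM delete forces a shutdown,
        # so deleting a VIRTIO_DISK is attempted on the running VM and
        # HyperCore 9.2.17 rejects it with "Still in use".
        return action == "delete" and self.type == "ide_cdrom"

    def needs_reboot_proposed(self, action: str) -> bool:
        # A possible fix: treat every disk delete as reboot-worthy,
        # since 9.2.x refuses online removal for VIRTIO disks too.
        return action == "delete"
```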

I tested via the management UI (HyperCore v9.2.17, on a NUC). The VM was booted, but there was no OS running (empty disks).

That time difference is suspicious. Maybe HyperCore would actually remove (VIRTIO) disks from a running VM if some condition were fulfilled?

In both cases a warning pops up:

Confirm Delete Block Device
Are you sure you want to delete the block device
0f513732 (101 GB)?

If this drive is in use by the guest OS it may not be removed until the next VM reboot.

If I force-shutdown and start the VM back up after the failed delete, the disks are still not removed. I'm confused about when that "delayed delete" would actually happen.

Update: I did the same test on 9.1.14 VSNS (https://10.5.11.200/):

So a VIRTIO disk can be removed from a running VM on HyperCore 9.1.14, but not on 9.2.17. I wonder if an IDE disk can be removed from a running VM on some other version? And how is HyperCore supposed to behave if we want to attach a new disk to a running VM - this should be possible without a reboot, right?
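
Until that is clarified, shutting the VM down before the delete appears to be the only sequence that works on both versions. Roughly, against the raw v1 API (endpoints as used by the collection's client code; auth, action types, and polling details are assumptions):

```python
import time
import requests

HOST = "https://10.5.11.201"          # test cluster from this report
AUTH = ("admin", "password")          # placeholder credentials
VM_UUID = "8c6196be-ddb5-4357-9783-50869dc60969"
DISK_UUID = "3aad4f9d-71d0-4be7-bd1d-d0ad593d4789"  # the extra disk

def wait_for_task(task_tag: str) -> None:
    # Poll the TaskTag until it leaves the QUEUED/RUNNING states
    while True:
        task = requests.get(f"{HOST}/rest/v1/TaskTag/{task_tag}",
                            auth=AUTH, verify=False).json()[0]
        if task["state"] == "ERROR":
            raise RuntimeError(task.get("formattedMessage", "task failed"))
        if task["state"] not in ("QUEUED", "RUNNING"):
            return
        time.sleep(2)

# 1. Shut the VM down (STOP is the hard variant, SHUTDOWN asks the guest)
resp = requests.post(f"{HOST}/rest/v1/VirDomain/action", auth=AUTH, verify=False,
                     json=[{"virDomainUUID": VM_UUID, "actionType": "STOP"}])
wait_for_task(resp.json()["taskTag"])

# 2. Delete the block device while the VM is off
resp = requests.delete(f"{HOST}/rest/v1/VirDomainBlockDevice/{DISK_UUID}",
                       auth=AUTH, verify=False)
wait_for_task(resp.json()["taskTag"])

# 3. Start the VM again
resp = requests.post(f"{HOST}/rest/v1/VirDomain/action", auth=AUTH, verify=False,
                     json=[{"virDomainUUID": VM_UUID, "actionType": "START"}])
wait_for_task(resp.json()["taskTag"])
```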