apache / cloudstack

Apache CloudStack is an open-source Infrastructure as a Service (IaaS) cloud computing platform
https://cloudstack.apache.org/
Apache License 2.0

Live migration from NFS to Linstor doesn't create a Linstor device, but a qcow2 file, in KVM/QEMU #8032

Open hmadra opened 11 months ago

hmadra commented 11 months ago
ISSUE TYPE

Bug Report

COMPONENT NAME

Storage Provider

CLOUDSTACK VERSION

4.18

CONFIGURATION

N/A

OS / ENVIRONMENT

AlmaLinux 8

SUMMARY

When live-migrating a running VM in QEMU/KVM from NFS to Linstor, CloudStack creates a qcow2 file in the /var/log/cloudstack/management/cloudstack directory on the target system instead of creating a DRBD resource there.

If the device holding that log directory does not have enough free space, the live migration fails:

Oct  3 20:13:00 host02 journal[44701]: End of file while reading data: Input/output error
Oct  3 20:13:00 host02 java[44876]: libvirt:  error : no connection driver available for lxc:///
Oct  3 20:13:00 host02 java[44876]: WARN  [kvm.resource.LibvirtConnection] (agentRequest-Handler-1:) (logid:0bbc2885) Can not find a connection for Instance i-2-42-VM. Assuming the default connection.
Oct  3 20:13:01 host02 journal[44701]: internal error: Child process (/usr/bin/qemu-img create -f qcow2 -o preallocation=falloc,compat=1.1 /var/log/cloudstack/management/cloudstack/adda192d-a71b-4b7d-8745-f27db392d9a5 20971520K) unexpected exit status 1: qemu-img: /var/log/cloudstack/management/cloudstack/adda192d-a71b-4b7d-8745-f27db392d9a5: Could not resize image: Failed to resize underlying file: Could not preallocate new data: No space left on device
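
The failure above comes from `qemu-img` preallocating the full image size (`20971520K`, i.e. 20 GiB, since the `K` suffix means KiB) in the log directory. As a minimal sketch for reproducing the condition, the following checks whether the filesystem holding a directory has at least that much free space. The function name `has_room_for_image` is hypothetical and not part of CloudStack; the path and size are copied from the log line.

```python
import os
import shutil


def has_room_for_image(directory: str, size_kib: int) -> bool:
    """Return True if the filesystem holding `directory` has at least
    `size_kib` KiB free (qemu-img's K suffix means KiB)."""
    free_bytes = shutil.disk_usage(directory).free
    return free_bytes >= size_kib * 1024


if __name__ == "__main__":
    # Directory and size copied from the failing qemu-img command in the log.
    target_dir = "/var/log/cloudstack/management/cloudstack"
    size_kib = 20971520  # 20 GiB
    if os.path.isdir(target_dir):
        print(has_room_for_image(target_dir, size_kib))
    else:
        print(f"{target_dir} does not exist on this machine")
```

This only explains why the migration fails on a small log partition; the underlying problem is that the file is created there at all. A qcow2 image also carries some metadata, so the real requirement is slightly above the raw size.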

However, if there is enough space, it creates a qcow2 file in the /var/log/cloudstack/management/cloudstack/ directory rather than creating a DRBD device.

Other DRBD volumes are working fine on the host. The host is connected to the Linstor controller, and other VMs that were created with Linstor as primary storage are operating normally.

[root@host02 ~]# ls -l /var/log/cloudstack/management/cloudstack/
total 20975620
-rw------- 1 root root 21478375424 Oct 3 20:30 46552b99-bd33-471f-897b-7f8de9065725
[root@host02 ~]#
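
The listing shows a regular file (`-rw-------`) where a DRBD block device would be expected. A small sketch for checking this programmatically on the target host is shown below; `describe_disk_path` is a hypothetical helper, and it uses only the standard `os.stat` and `stat` modules.

```python
import os
import stat


def describe_disk_path(path: str) -> str:
    """Classify a migration target path as a block device, a regular file, or other."""
    mode = os.stat(path).st_mode
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


if __name__ == "__main__":
    # Path copied from the ls output; adjust to the file present on your host.
    path = "/var/log/cloudstack/management/cloudstack/46552b99-bd33-471f-897b-7f8de9065725"
    if os.path.exists(path):
        print(describe_disk_path(path))
    else:
        print(f"{path} does not exist on this machine")
```

On the affected host, the path from the listing would be reported as a regular file, whereas a device such as /dev/drbd1019 would be reported as a block device.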

EXPECTED RESULTS

During live migration with storage migration, CloudStack should create a DRBD resource on the target host and move the disk image to the Linstor DRBD device.



hmadra commented 11 months ago

To add: on the management server, the logs show that Linstor is provisioning the DRBD resource:

2023-10-03 21:56:28,484 DEBUG [c.c.a.ApiServlet] (qtp63390-17:ctx-40f799d5 ctx-de9732df) (logid:808ef345) ===END=== 23.186.192.13 -- GET jobId=f380f3d1-1f32-4dbc-88d3-b03650bd931e&command=queryAsyncJobResult&response=json
2023-10-03 21:56:28,913 INFO [o.a.c.s.d.d.LinstorPrimaryDataStoreDriverImpl] (Work-Job-Executor-2:ctx-0889c5c8 job-863/job-864 ctx-a3083d5d) (logid:f380f3d1) Resource-group doesn't have any volume-groups, automatically assume partial mode.
2023-10-03 21:56:28,913 INFO [o.a.c.s.d.d.LinstorPrimaryDataStoreDriverImpl] (Work-Job-Executor-2:ctx-0889c5c8 job-863/job-864 ctx-a3083d5d) (logid:f380f3d1) Updated cs-93ebc78c-3259-475a-8ef1-b4a8c638bb97 DRBD auto verify algorithm to 'crct10dif-pclmul'
2023-10-03 21:56:28,913 INFO [o.a.c.s.d.d.LinstorPrimaryDataStoreDriverImpl] (Work-Job-Executor-2:ctx-0889c5c8 job-863/job-864 ctx-a3083d5d) (logid:f380f3d1) Tie breaker resource 'cs-93ebc78c-3259-475a-8ef1-b4a8c638bb97' created on DfltDisklessStorPool
2023-10-03 21:56:28,913 INFO [o.a.c.s.d.d.LinstorPrimaryDataStoreDriverImpl] (Work-Job-Executor-2:ctx-0889c5c8 job-863/job-864 ctx-a3083d5d) (logid:f380f3d1) Resource-definition property 'DrbdOptions/Resource/quorum' updated from 'null' to 'majority' by auto-quorum
2023-10-03 21:56:28,913 INFO [o.a.c.s.d.d.LinstorPrimaryDataStoreDriverImpl] (Work-Job-Executor-2:ctx-0889c5c8 job-863/job-864 ctx-a3083d5d) (logid:f380f3d1) Resource-definition property 'DrbdOptions/Resource/on-no-quorum' updated from 'null' to 'io-error' by auto-quorum
2023-10-03 21:56:29,556 INFO [o.a.c.s.d.d.LinstorPrimaryDataStoreDriverImpl] (Work-Job-Executor-2:ctx-0889c5c8 job-863/job-864 ctx-a3083d5d) (logid:f380f3d1) Linstor: Created drbd device: /dev/drbd1019
2023-10-03 21:56:29,569 DEBUG [o.a.c.s.d.d.LinstorPrimaryDataStoreDriverImpl] (Work-Job-Executor-2:ctx-0889c5c8 job-863/job-864 ctx-a3083d5d) (logid:f380f3d1) Linstor: handleQualityOfServiceForVolumeMigration
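
To confirm from the management-server log which DRBD device Linstor created, the relevant message can be pulled out with a regular expression. This is a rough sketch that assumes the message text stays exactly as in the log above ("Linstor: Created drbd device: ..."); the function name `created_drbd_devices` is made up for illustration.

```python
import re
from typing import List

# Matches the driver message seen in the management-server log.
DRBD_CREATED = re.compile(r"Linstor: Created drbd device: (/dev/drbd\d+)")

# One log line copied from the management-server output.
SAMPLE_LOG = (
    "2023-10-03 21:56:29,556 INFO [o.a.c.s.d.d.LinstorPrimaryDataStoreDriverImpl] "
    "(Work-Job-Executor-2:ctx-0889c5c8 job-863/job-864 ctx-a3083d5d) (logid:f380f3d1) "
    "Linstor: Created drbd device: /dev/drbd1019"
)


def created_drbd_devices(log_text: str) -> List[str]:
    """Return every DRBD device path reported as created in the log text."""
    return DRBD_CREATED.findall(log_text)


if __name__ == "__main__":
    print(created_drbd_devices(SAMPLE_LOG))
```

The device path found this way is what the migration target would be expected to point at.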

However, the migration is still initiated to a QCOW2 file in the /var/log/cloudstack/management/cloudstack directory on the target node:

2023-10-03 21:56:29,584 DEBUG [c.c.a.t.Request] (Work-Job-Executor-2:ctx-0889c5c8 job-863/job-864 ctx-a3083d5d) (logid:f380f3d1) Seq 2-6624232101908578321: Sending { Cmd , MgmtId: 90520734315846, via: 2(osaka02), Ver: v1, Flags: 100111, [{"com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id":"43","name":"i-2-43-VM","state":"Migrating","type":"User","cpus":"2","minSpeed":"2500","maxSpeed":"2500","minRam":"(2.00 GB) 2147483648","maxRam":"(2.00 GB) 2147483648","arch":"x86_64","os":"AlmaLinux 8.3","platformEmulator":"AlmaLinux 8.3","bootArgs":"","enableHA":"false","limitCpuUse":"true","enableDynamicallyScaleVm":"false","params":{"rootdisksize":"40","Message.ReservedCapacityFreed.Flag":"false"},"uuid":"f68f346e-f498-409a-af45-35dbf09a3f59","enterHardwareSetup":"false","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"43989694-7d5c-424b-892a-c229cea39fca","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"dc67b307-bcf1-3d54-9ea9-02d28c977092","name":"NFSTest","id":"2","poolType":"NetworkFilesystem","host":"172.29.4.24","path":"/mnt/primary","port":"2049","url":"NetworkFilesystem://172.29.4.24/mnt/primary/?ROLE=Primary&STOREUUID=dc67b307-bcf1-3d54-9ea9-02d28c977092","isManaged":"false"}},"name":"ROOT-43","size":"(40.00 GB) 42949672960","path":"77d15b12-f83c-44a5-bc01-9a1e62d7bd64","volumeId":"103","vmName":"i-2-43-VM","accountId":"2","format":"QCOW2","provisioningType":"THIN","poolId":"2","id":"103","deviceId":"0","bytesReadRate":"(0 bytes) 0","bytesWriteRate":"(0 bytes) 0","iopsReadRate":"(0 bytes) 0","iopsWriteRate":"(0 bytes) 0","hypervisorType":"KVM","directDownload":"false","deployAsIs":"false"}},"diskSeq":"0","path":"77d15b12-f83c-44a5-bc01-9a1e62d7bd64","type":"ROOT","_details":{"storageHost":"172.29.4.24","managed":"false","storagePort":"2049","storageMigrateSecretConsumer":"93ebc78c-3259-475a-8ef1-b4a8c638bb97","storage.pool.disk.wait":"60","volumeSize":"(40.00 GB) 
42949672960"}},{"data":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"id":"0","format":"ISO","accountId":"0","hvm":"false","bootable":"false","directDownload":"false","deployAsIs":"false"}},"diskSeq":"3","type":"ISO"}],"nics":[{"deviceId":"0","networkRateMbps":"200","defaultNic":"true","pxeDisable":"false","nicUuid":"45b4f0b0-7160-4edd-86b2-f879deeffda4","details":{"PromiscuousMode":"false","ForgedTransmits":"true","MacAddressChanges":"true","MacLearning":"false"},"dpdkEnabled":"false","uuid":"c0175816-848a-46a4-96ca-e8a3601957a0","ip":"x.x.x.x","netmask":"255.255.255.224","gateway":"x.x.x.x","mac":"1e:00:93:00:01:48","dns1":"8.8.4.4","dns2":"8.8.8.8","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://218","isolationUri":"vlan://218","isSecurityGroupEnabled":"false"}],"vcpuMaxLimit":"2","configDriveLocation":"SECONDARY","cpuQuotaPercentage":"0.89","guestOsDetails":{},"extraConfig":{}},"rollback":"false","wait":"0","bypassHostMaintenance":"false"}}] } 2023-10-03 21:56:29,672 DEBUG [c.c.a.t.Request] (AgentManager-Handler-3:null) (logid:) Seq 2-6624232101908578321: Processing: { Ans: , MgmtId: 90520734315846, via: 2, Ver: v1, Flags: 110, [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"dpdkInterfaceMapping":{},"result":"true","wait":"0","bypassHostMaintenance":"false"}}] } 2023-10-03 21:56:29,672 DEBUG [c.c.a.m.AgentAttache] (AgentManager-Handler-3:null) (logid:) Seq 2-6624232101908578321: No more commands found 2023-10-03 21:56:29,673 DEBUG [c.c.a.t.Request] (Work-Job-Executor-2:ctx-0889c5c8 job-863/job-864 ctx-a3083d5d) (logid:f380f3d1) Seq 2-6624232101908578321: Received: { Ans: , MgmtId: 90520734315846, via: 2(osaka02), Ver: v1, Flags: 110, { PrepareForMigrationAnswer } } 2023-10-03 21:56:29,702 DEBUG [c.c.a.t.Request] (Work-Job-Executor-2:ctx-0889c5c8 job-863/job-864 ctx-a3083d5d) (logid:f380f3d1) Seq 1-8287186264315133972: Sending { Cmd , MgmtId: 90520734315846, via: 1(osaka01), Ver: v1, Flags: 100111, 
[{"com.cloud.agent.api.MigrateCommand":{"vmName":"i-2-43-VM","destIp":"172.29.2.22","migrateStorage":{"77d15b12-f83c-44a5-bc01-9a1e62d7bd64":{"serialNumber":"77d15b12-f83c-44a5-bc01-9a1e62d7bd64"**,"diskType":"FILE","driverType":"QCOW2","source":"FILE","sourceText":"/var/log/cloudstack/management/cloudstack/93ebc78c-3259-475a-8ef1-b4a8c638bb97"**,"isSourceDiskOnStorageFileSystem":"false"}},"migrateStorageManaged":"false","migrateNonSharedInc":"false","autoConvergence":"false","isWindows":"false","vmTO":{"id":"43","name":"i-2-43-VM","state":"Migrating","type":"User","cpus":"2","minSpeed":"2500","maxSpeed":"2500","minRam":"(2.00 GB) 2147483648","maxRam":"(2.00 GB) 2147483648","arch":"x86_64","os":"AlmaLinux 8.3","platformEmulator":"AlmaLinux 8.3","bootArgs":"","enableHA":"false","limitCpuUse":"true","enableDynamicallyScaleVm":"false","params":{"rootdisksize":"40","Message.ReservedCapacityFreed.Flag":"false"},"uuid":"f68f346e-f498-409a-af45-35dbf09a3f59","enterHardwareSetup":"false","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"43989694-7d5c-424b-892a-c229cea39fca","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"dc67b307-bcf1-3d54-9ea9-02d28c977092","name":"NFSTest","id":"2","poolType":"NetworkFilesystem","host":"172.29.4.24","path":"/mnt/primary","port":"2049","url":"NetworkFilesystem://172.29.4.24/mnt/primary/?ROLE=Primary&STOREUUID=dc67b307-bcf1-3d54-9ea9-02d28c977092","isManaged":"false"}},"name":"ROOT-43","size":"(40.00 GB) 42949672960","path":"77d15b12-f83c-44a5-bc01-9a1e62d7bd64","volumeId":"103","vmName":"i-2-43-VM","accountId":"2","format":"QCOW2","provisioningType":"THIN","poolId":"2","id":"103","deviceId":"0","bytesReadRate":"(0 bytes) 0","bytesWriteRate":"(0 bytes) 0","iopsReadRate":"(0 bytes) 0","iopsWriteRate":"(0 bytes) 
0","hypervisorType":"KVM","directDownload":"false","deployAsIs":"false"}},"diskSeq":"0","path":"77d15b12-f83c-44a5-bc01-9a1e62d7bd64","type":"ROOT","_details":{"storageHost":"172.29.4.24","managed":"false","storagePort":"2049","storageMigrateSecretConsumer":"93ebc78c-3259-475a-8ef1-b4a8c638bb97","storage.pool.disk.wait":"60","volumeSize":"(40.00 GB) 42949672960"}},{"data":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"id":"0","format":"ISO","accountId":"0","hvm":"false","bootable":"false","directDownload":"false","deployAsIs":"false"}},"diskSeq":"3","type":"ISO"}],"nics":[{"deviceId":"0","networkRateMbps":"200","defaultNic":"true","pxeDisable":"false","nicUuid":"45b4f0b0-7160-4edd-86b2-f879deeffda4","details":{"PromiscuousMode":"false","ForgedTransmits":"true","MacAddressChanges":"true","MacLearning":"false"},"dpdkEnabled":"false","uuid":"c0175816-848a-46a4-96ca-e8a3601957a0","ip":"x.x.x.x","netmask":"255.255.255.224","gateway":"x.x.x.x","mac":"1e:00:93:00:01:48","dns1":"8.8.4.4","dns2":"8.8.8.8","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://218","isolationUri":"vlan://218","isSecurityGroupEnabled":"false"}],"vcpuMaxLimit":"2","configDriveLocation":"SECONDARY","cpuQuotaPercentage":"0.89","guestOsDetails":{},"extraConfig":{}},"executeInSequence":"true","migrateDiskInfoList":[{"serialNumber":"77d15b12-f83c-44a5-bc01-9a1e62d7bd64","diskType":"FILE","driverType":"QCOW2","source":"FILE","sourceText":"/var/log/cloudstack/management/cloudstack/93ebc78c-3259-475a-8ef1-b4a8c638bb97","isSourceDiskOnStorageFileSystem":"false"}],"dpdkInterfaceMapping":{},"vlanToPersistenceMap":{},"wait":"10800","bypassHostMaintenance":"false"}}] } 2023-10-03 21:56:29,724 DEBUG [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-2:ctx-a4e8e9b3) (logid:094dcb5b) HA health check task is running...
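
The key detail in the MigrateCommand above is that `migrateDiskInfoList` names a target with `"diskType":"FILE"` and a `sourceText` under the log directory, even though the file name matches the Linstor resource `cs-93ebc78c-3259-475a-8ef1-b4a8c638bb97` that was created as /dev/drbd1019. A sketch for spotting such entries is shown below. It assumes the JSON array has already been cut out of the log line, uses an abbreviated excerpt of the payload, and the helper name `file_backed_targets` is hypothetical.

```python
import json
from typing import List

# Abbreviated excerpt of the MigrateCommand payload from the log;
# only the fields used below are kept.
SAMPLE_PAYLOAD = """
[{"com.cloud.agent.api.MigrateCommand": {
    "vmName": "i-2-43-VM",
    "migrateDiskInfoList": [{
        "serialNumber": "77d15b12-f83c-44a5-bc01-9a1e62d7bd64",
        "diskType": "FILE",
        "driverType": "QCOW2",
        "source": "FILE",
        "sourceText": "/var/log/cloudstack/management/cloudstack/93ebc78c-3259-475a-8ef1-b4a8c638bb97"
    }]
}}]
"""


def file_backed_targets(payload: str) -> List[str]:
    """Return migration target paths that are declared as files outside /dev/."""
    targets = []
    for wrapper in json.loads(payload):
        for name, body in wrapper.items():
            if not name.endswith("MigrateCommand"):
                continue
            for disk in body.get("migrateDiskInfoList", []):
                path = disk.get("sourceText", "")
                if disk.get("diskType") == "FILE" and not path.startswith("/dev/"):
                    targets.append(path)
    return targets


if __name__ == "__main__":
    for path in file_backed_targets(SAMPLE_PAYLOAD):
        print("file-backed migration target:", path)
```

For a migration to Linstor storage, any path returned here is a sign that the command was built with a file target instead of the DRBD device.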