cc @shwstppr @DaanHoogland @GutoVeronezi @weizhouapache @harikrishna-patnala
I would have preferred at least a global setting to allow selecting between full snapshots (the historical/expected behaviour) and the new approach, which has broken snapshots in my env (and possibly for other users). Assuming we do full snapshots, I force-deleted an older template I couldn't find any references for, and ended up breaking all (recurring) snapshots taken since the 4.17/4.18 upgrade in my homelab env (I'm hoping my wife doesn't find this out; she'll be super pissed if any backup/data was lost).
If this issue is confirmed, I think it is a blocker. The volume snapshot should be a full clone of the VM's volume.
I checked the docs; we don't say we do incremental/linked snapshots for KVM, nor provide any other information for users: https://docs.cloudstack.apache.org/en/4.18.1.0/adminguide/storage.html#incremental-snapshots-and-backup
I think full snapshots were slow but really effective; we need to revert the changes or at least make this configurable via a global setting (defaulting to full snapshots without any backing file for KVM volume snapshots).
@weizhouapache yes, agree - I've confirmed this with my testing above against ACS 4.18.1.0 on Ubuntu 22.04 + local storage; anybody is welcome to test/confirm the same with NFS and other storages (the behaviour is storage agnostic and may affect a wide user community, with potential data loss like I incurred).
I'm able to reproduce this behaviour and the problem with snapshots having backing files.
2023-10-04 07:57:28,675 ERROR [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-19:ctx-9609c044 job-129/job-131 ctx-a74d034e) (logid:84378832) Unable to create volume [{"name":"ROOT-11","uuid":"62662375-984a-4ad2-be79-49e4a156133d"}] due to [com.cloud.utils.exception.CloudRuntimeException: Failed to copy /mnt/99ceb614-2662-38a9-843c-53b0d26b3515/f7f32a7d-5170-472a-9ca4-51556ff7d0f5.qcow2 to c57053ea-ea69-4c2c-846b-39e7a8dc2248].
2023-10-04 07:57:28,675 WARN [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-19:ctx-9609c044 job-129/job-131 ctx-a74d034e) (logid:84378832) Unable to contact resource.
com.cloud.exception.StorageUnavailableException: Resource [StoragePool:1] is unreachable: Unable to create volume [{"name":"ROOT-11","uuid":"62662375-984a-4ad2-be79-49e4a156133d"}] due to [com.cloud.utils.exception.CloudRuntimeException: Failed to copy /mnt/99ceb614-2662-38a9-843c-53b0d26b3515/f7f32a7d-5170-472a-9ca4-51556ff7d0f5.qcow2 to c57053ea-ea69-4c2c-846b-39e7a8dc2248].
at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.recreateVolume(VolumeOrchestrator.java:1814)
at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.prepare(VolumeOrchestrator.java:1923)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
at com.sun.proxy.$Proxy227.prepare(Unknown Source)
at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1266)
at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:5412)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
at com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:5536)
at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:620)
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:568)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Agent logs from host
2023-10-04 07:57:28,560 DEBUG [kvm.storage.KVMStorageProcessor] (agentRequest-Handler-1:null) (logid:84378832) Copying template to primary storage, template format is qcow2
2023-10-04 07:57:28,560 INFO [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) (logid:84378832) Trying to fetch storage pool 74659b4e-12be-388d-9877-2c992e20b732 from libvirt
2023-10-04 07:57:28,560 DEBUG [kvm.resource.LibvirtConnection] (agentRequest-Handler-1:null) (logid:84378832) Looking for libvirtd connection at: qemu:///system
2023-10-04 07:57:28,562 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) (logid:84378832) Successfully refreshed pool 74659b4e-12be-388d-9877-2c992e20b732 Capacity: (1.9685 TB) 2164372013056 Used: (253.98 GB) 272709451776 Available: (1.7205 TB) 1891662561280
2023-10-04 07:57:28,562 INFO [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) (logid:84378832) Trying to fetch storage pool 74659b4e-12be-388d-9877-2c992e20b732 from libvirt
2023-10-04 07:57:28,562 DEBUG [kvm.resource.LibvirtConnection] (agentRequest-Handler-1:null) (logid:84378832) Looking for libvirtd connection at: qemu:///system
2023-10-04 07:57:28,564 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) (logid:84378832) Successfully refreshed pool 74659b4e-12be-388d-9877-2c992e20b732 Capacity: (1.9685 TB) 2164372013056 Used: (253.98 GB) 272709451776 Available: (1.7205 TB) 1891662561280
2023-10-04 07:57:28,564 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) (logid:84378832) copyPhysicalDisk: disk size:(1.00 MB) 1048576, virtualsize:(50.00 MB) 52428800 format:qcow2
2023-10-04 07:57:28,564 INFO [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) (logid:84378832) Attempting to create volume c57053ea-ea69-4c2c-846b-39e7a8dc2248 (NetworkFilesystem) in pool 74659b4e-12be-388d-9877-2c992e20b732 with size (50.00 MB) 52428800
2023-10-04 07:57:28,564 DEBUG [kvm.resource.LibvirtConnection] (agentRequest-Handler-1:null) (logid:84378832) Looking for libvirtd connection at: qemu:///system
2023-10-04 07:57:28,565 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Executing: qemu-img create -o preallocation=off -f qcow2 /mnt/74659b4e-12be-388d-9877-2c992e20b732/c57053ea-ea69-4c2c-846b-39e7a8dc2248 52428800
2023-10-04 07:57:28,566 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Executing while with timeout : 3600000
2023-10-04 07:57:28,590 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Execution is successful.
2023-10-04 07:57:28,590 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Formatting '/mnt/74659b4e-12be-388d-9877-2c992e20b732/c57053ea-ea69-4c2c-846b-39e7a8dc2248', fmt=qcow2 size=52428800 encryption=off cluster_size=65536 preallocation='off' lazy_refcounts=off
2023-10-04 07:57:28,590 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Executing: qemu-img info /mnt/74659b4e-12be-388d-9877-2c992e20b732/c57053ea-ea69-4c2c-846b-39e7a8dc2248
2023-10-04 07:57:28,591 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Executing while with timeout : 3600000
2023-10-04 07:57:28,604 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Execution is successful.
2023-10-04 07:57:28,604 DEBUG [kvm.resource.LibvirtConnection] (agentRequest-Handler-1:null) (logid:84378832) Looking for libvirtd connection at: qemu:///system
2023-10-04 07:57:28,605 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Executing: qemu-img info /mnt/99ceb614-2662-38a9-843c-53b0d26b3515/f7f32a7d-5170-472a-9ca4-51556ff7d0f5.qcow2
2023-10-04 07:57:28,607 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Executing while with timeout : 3600000
2023-10-04 07:57:28,624 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Execution is successful.
2023-10-04 07:57:28,625 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Executing: qemu-img convert -O qcow2 /mnt/99ceb614-2662-38a9-843c-53b0d26b3515/f7f32a7d-5170-472a-9ca4-51556ff7d0f5.qcow2 /mnt/74659b4e-12be-388d-9877-2c992e20b732/c57053ea-ea69-4c2c-846b-39e7a8dc2248
2023-10-04 07:57:28,626 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Executing while with timeout : 10800000
2023-10-04 07:57:28,644 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) Exit value is 1
2023-10-04 07:57:28,644 DEBUG [utils.script.Script] (agentRequest-Handler-1:null) (logid:84378832) qemu-img: Could not open '/mnt/99ceb614-2662-38a9-843c-53b0d26b3515/f7f32a7d-5170-472a-9ca4-51556ff7d0f5.qcow2': Could not open backing file: Could not open '/mnt/760626fb-2450-3276-ad82-8a4cc91110b5/a9388628-b90d-41ec-917a-6008011e779c': No such file or directoryqemu-img: Could not open '/mnt/99ceb614-2662-38a9-843c-53b0d26b3515/f7f32a7d-5170-472a-9ca4-51556ff7d0f5.qcow2'
I did not see the problem if the snapshot file on the secondary store is rebased while backing it up:
diff --git a/scripts/storage/qcow2/managesnapshot.sh b/scripts/storage/qcow2/managesnapshot.sh
index 2daa768347..1c840d34d2 100755
--- a/scripts/storage/qcow2/managesnapshot.sh
+++ b/scripts/storage/qcow2/managesnapshot.sh
@@ -230,7 +230,13 @@ backup_snapshot() {
#Backup volume snapshot
cp "$disk" "${destPath}/${destName}"
ret_code=$?
-
+ if [ $ret_code -gt 0 ]
+ then
+ printf "Failed to copy $snapshotname for disk $disk to $destPath\n" >&2
+ return 2
+ fi
+ qemuimg_ret=$($qemu_img rebase -b "" "${destPath}/${destName}" 2>&1)
+ ret_code=$?
if [ $ret_code -gt 0 ]
then
printf "Failed to backup $snapshotname for disk $disk to $destPath\n" >&2
Thanks @shwstppr - but wouldn't that (potentially) cause a regression in what #5297 tried to do? Maybe @GutoVeronezi you can review and suggest.
Since #5297 has been there since ACS 4.17, reverting the behaviour would cause a (potential) regression for newer users/envs, but not fixing this issue causes problems for older envs/users who expect full volume snapshots (even the docs don't advise otherwise). We need to find some middle-ground solution.
/cc @weizhouapache @DaanHoogland @harikrishna-patnala @nvazquez @shwstppr @GutoVeronezi
@rohityadavcloud @shwstppr (cc @GutoVeronezi ) I think @shwstppr 's solution should be default and be disabled by a flag (old behaviour takes precedence)
+1 on this. The old behaviour has always been rock solid and I'm against changing it. Don't mind the new way as long as it's an opt-in setting.
All, here are some proposed actions, please review and advise your thoughts:
Hello guys, there is no need to revert the functionality or work around it; we just have to understand what is happening and fix it.
The problem here is that while taking the volume snapshot, we call the method copySnapshotToPrimaryStorageDir. This method copies the base file to another directory, to allow block-committing the delta right after taking the snapshot and avoid creating chains in the primary storage. However, copying a file with a backing store will also copy the reference; therefore, it causes the observed problem when the volume has not been consolidated yet (a process that is done during volume migration, for instance). If we create a template from the snapshot, and the deployment destination does not have access to the backing store, a problem will happen. A simple approach here would be to use qemu-img convert to consolidate the volume instead of copying it with the reference. I am already working on the fix for this one.
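To illustrate the difference, a rough sketch with placeholder paths (not the exact agent code):
# cp copies the qcow2 as-is, including its backing-file reference, so the copy
# still depends on the template that sits on primary storage
cp "<path to volume on primary storage>" "<snapshot copy path>"
# qemu-img convert writes a standalone qcow2, merging the backing store (template)
# into the result, so the copy has no external dependencies
qemu-img convert -O qcow2 "<path to volume on primary storage>" "<snapshot copy path>"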
Smoke and integration tests are indeed needed.
Guys, I created PR #8041 to address this situation; could you have a look?
Thanks @GutoVeronezi - what I've proposed isn't a workaround but a way to fix it correctly according to our documentation: we support full snapshots for KVM. Will your PR address this concern, specifically ensuring that volume snapshots are full snapshots and not linked to/dependent on templates?
The whole idea of having Primary and Secondary storage is to decouple the data on those two; otherwise, we would have just one huge storage.
From the conceptual and architectural POV, the data on Secondary storage should not/may not be linked to (dependent on) the data on Primary storage - that is why even XenServer incremental snapshots have the full file on the secondary storage, with delta files chained to this full one - they are still NOT dependent on the Primary storage.
Whatever solution is implemented, we must ensure that we do not break these basic principles - otherwise, I don't see the point of having Primary and Secondary storage as separate entities.
Also worth noting, these kinds of changes should never come without a global setting - we have introduced a (serious) change to how something has worked for 10 years, and we have not kept backward compatibility. Such changes should be implemented as optional (controllable via a global setting) and always default to the old behaviour, for backward-compatibility and safety reasons.
Fixed by #8041
@shwstppr cc @DaanHoogland @GutoVeronezi thanks for getting the issue fixed and PR merged, could you review if we must do the following:
I think even with the issue fixed, there might be many users with affected snapshots that have a backing file in production.
+1 with @rohityadavcloud , especially on 2 (add smoke tests)
do we need to create a new patch release (e.g. 4.17.2.1 or 4.18.1.1)?
Please also consider documenting how to fix problematic snapshots created in 4.17 and 4.18 (those which might be broken)
@rohityadavcloud @andrijapanicsb
Users will have a backing store in their volume snapshots in the following cases:
- ROOT volumes created from a template[^migrated].
Users will not have a backing store in their volume snapshots in the following cases:
- ROOT volumes created from an ISO;
- DATADISK volumes.
Following, there are two queries to help users identify snapshots with a backing store:
Identify snapshots that were not removed yet and were created from a volume that was created from a template:
SELECT s.uuid AS "Snapshot ID",
s.name AS "Snapshot Name",
s.created AS "Snapshot creation datetime",
img_s.uuid AS "Sec Storage ID",
img_s.name AS "Sec Storage Name",
ssr.install_path AS "Snapshot path on Sec Storage",
v.uuid AS "Volume ID",
v.name AS "Volume Name"
FROM cloud.snapshots s
INNER JOIN cloud.volumes v ON (v.id = s.volume_id)
INNER JOIN cloud.snapshot_store_ref ssr ON (ssr.snapshot_id = s.id
AND ssr.store_role = 'Image')
INNER JOIN cloud.image_store img_s ON (img_s.id = ssr.store_id)
WHERE s.removed IS NULL
AND v.template_id IS NOT NULL;
With that, one can use qemu-img info on the snapshot file to check if it has a backing store.
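For example (placeholder path), a snapshot that still carries a reference will show a backing file line:
qemu-img info "<path to snapshot on secondary storage>" | grep -i "backing file"
# e.g. backing file: /mnt/<primary pool uuid>/<template uuid>
# snapshots without such a line are already self-contained (full) snapshots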
For those snapshots that have a backing store, one can use the following query to check which template it is and in which storage pool it resides:
SELECT vt.uuid AS "Template ID",
vt.name AS "Template Name",
tsr.install_path AS "Template file on Pri Storage",
sp.uuid AS "Pri Storage ID",
sp.name AS "Pri Storage Name",
sp.`path` AS "Pri Storage Path",
sp.pool_type as "Pri Storage type"
FROM cloud.template_spool_ref tsr
INNER JOIN cloud.storage_pool sp ON (sp.id = tsr.pool_id AND sp.removed IS NULL)
INNER JOIN cloud.vm_template vt ON (vt.id = tsr.template_id)
WHERE tsr.install_path = "<template file in the snapshot backing store>";
After identifying the snapshots with a backing store and the related templates, one can mount the secondary storage on a host that has access to the template and use qemu-img convert on the snapshot to consolidate it:
qemu-img convert -O qcow2 -U --image-opts driver=qcow2,file.filename=<path to snapshot on secondary storage> <path to snapshot on secondary storage>-converted
Then, one can use qemu-img info to check the information of the consolidated file. After validating the consolidated file (e.g. verifying that a VM deployed from it boots), one can replace the original snapshot file with the consolidated file. These processes should be done carefully, in order to guarantee the health of the environment.
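Put together, the per-snapshot mitigation could look roughly like the sketch below (it assumes the secondary storage is mounted and the referenced template is reachable from the host; paths are placeholders):
SNAP="<path to snapshot on secondary storage>"
# 1. consolidate the snapshot into a standalone qcow2 next to the original
qemu-img convert -O qcow2 -U --image-opts driver=qcow2,file.filename="$SNAP" "$SNAP"-converted
# 2. verify the result: it should report no backing file and pass a consistency check
qemu-img info "$SNAP"-converted
qemu-img check "$SNAP"-converted
# 3. only after validating it (e.g. by deploying a test VM from it), replace the original
mv "$SNAP" "$SNAP".bak
mv "$SNAP"-converted "$SNAP"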
[^migrated]: While migrating a volume, ACS consolidates the volume. However, it depends on the type and protocol of the source and destination storage. For instance, an NFS to NFS migration will not consolidate the volume; on the other hand, an NFS to SharedMountPoint migration will consolidate the volume. Therefore, in some cases users might not see the problem with snapshots from ROOT volumes created from templates and then migrated.
@GutoVeronezi are you looking into adding a smoke test?
@rohityadavcloud @GutoVeronezi @DaanHoogland @harikrishna-patnala should we keep this open till we can have a smoke test added and a release note change/PR?
(By all means, yes, we cannot just close the issue without proper smoke tests that ensure we don't hit this critical issue again)
(+1)
@shwstppr I am planning to add them this week.
Excellent, thanks @GutoVeronezi @weizhouapache @shwstppr @andrijapanicsb for moving this forward.
Update:
@shwstppr I am quite swamped with some other activities and I might not be able to fully work on them this week; feel free to work on them if it is your will.
PR #8104 that adds the smoke test has been merged. @harikrishna-patnala cc @DaanHoogland can you please look into adding documentation so we can close this ticket?
This got closed (again :facepalm: ) with the smoke test PR merge. Documentation work is still remaining.
Thanks @shwstppr feel free to close it if there's a doc PR etc ready.
@harikrishna-patnala were you going to compose a doc PR?
created a doc PR to add note, https://github.com/apache/cloudstack-documentation/pull/367
@shwstppr , with the doc PR open we can close this, can't we?
Closing this now; https://github.com/apache/cloudstack-documentation/pull/367 adds the documentation note.
@GutoVeronezi , I saw the mitigation procedure in the release notes. We have hundreds of VMs affected by this linked-snapshot issue, so rather than converting them, I'd like to ask: could we just leave them as they are and tell the user not to delete any affected template instead? Is there any potential issue if we keep them as-is and proceed with the 4.19 upgrade?
Hello @levindecaro
If you keep the snapshots as they are (so that they get replaced over time), you have to bear in mind 2 points:
With that, your snapshots should work fine until they get replaced with new and consistent snapshots (on 4.19).
If you have any other doubt, just ping me.
[^cluster-wide]: for instance, if you have a snapshot of a volume on a cluster-wide primary storage, create a template from it, and try to deploy it in another cluster, you will face errors.
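If you keep the affected snapshots, one way to list which templates are still referenced as backing files (and therefore must not be deleted) is to scan the snapshot files on secondary storage, for example (a rough sketch; the mount point and the snapshots/<account id>/<volume id> layout are assumptions about the environment):
# print each snapshot qcow2 on secondary storage together with its backing file, if any
for f in /mnt/secondary/snapshots/*/*/*; do
    backing=$(qemu-img info "$f" 2>/dev/null | grep -i "^backing file")
    [ -n "$backing" ] && echo "$f -> $backing"
done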
@GutoVeronezi thank you for your explanation
Hit a regression/bug when creating a template from a KVM volume snapshot. I think the regression was potentially introduced in https://github.com/apache/cloudstack/pull/5297/files#diff-4f8be8d5f205bb11975c14fa213c5d733abbe5da2a4c3c6bcd829d9688ebcff2L229
KVM volume snapshots have historically been full snapshots, but with the above change this seems to have changed, which is causing regressions. Steps to reproduce the issue (my env is 4.18.1 + Ubuntu 22.04 KVM + local storage):
Further, on a KVM host where the original/base template was seeded, I could deploy the VM, but upon live migrating it (with storage) to another host it creates the following: a. This is the (local storage) volume, which has a backing image:
b. But the backing file (template) is tiny (193k) and has no further backing file:
ISSUE TYPE
COMPONENT NAME
CLOUDSTACK VERSION
CONFIGURATION
ACS 4.18.1 Adv zone with local storage and NFS secondary storage, adv networking