apache / cloudstack

Apache CloudStack is an opensource Infrastructure as a Service (IaaS) cloud computing platform
https://cloudstack.apache.org/
Apache License 2.0

Snapshot chain - garbage collection not working as expected #9501

Open nxsbi opened 1 month ago

nxsbi commented 1 month ago
ISSUE TYPE
COMPONENT NAME
UI
CLOUDSTACK VERSION
4.19.0.1
CONFIGURATION

XCP-Ng backend

OS / ENVIRONMENT

NA

SUMMARY

With the XenServer/XCP-ng backend, when a chain of snapshots exists, old snapshot files are not cleaned up correctly. Some snapshots in the middle are destroyed and their files deleted, but older ones continue to exist.

STEPS TO REPRODUCE
I have not found a sure way to reproduce this issue. Here is what I have, though: set up an XCP-ng backend, version 8.2. Set snapshot.delta.max = 7, and set up a volume with a daily snapshot policy and Keep = 3.
Over time, you will start seeing this discrepancy.
EXPECTED RESULTS
The Keep = 3 setting should be respected: apart from the chain required by the last 3 snapshots, all old snapshot files should be physically deleted.
ACTUAL RESULTS
Old files continue to exist.
Example (snapshot_store_ref view):
![image](https://github.com/user-attachments/assets/77bb64c1-af5a-4502-889f-e7e3eadfb6f3)

And the physical file view:
I have Keep = 3, so I expect the Aug 6 and Aug 7 files to exist.
I also expect the Aug 5 file to exist (Keep = 3), but since the Aug 5 file is part of a chain (judging by file size), all files from Jul 30 to Aug 5 need to be kept.
However, the Jul 3, 4 and 6 files should not exist. These files also show as "Ready" in the snapshot_store_ref table above.

![image](https://github.com/user-attachments/assets/a6e42815-a134-442d-bd15-32c839439b72)
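
The expectation described above can be sketched in a few lines of Python. This is only an illustration of the retention reasoning, not CloudStack's actual cleanup code: the function name `snapshots_to_retain` and the `parent` mapping are hypothetical, and the parent links in the example are assumptions (the report infers the Jul 30 to Aug 5 chain from file sizes only, and the Jul 3, 4 and 6 files are treated here as unrelated to the current chain).

```python
from typing import Dict, List, Optional, Set


def snapshots_to_retain(
    order: List[str], parent: Dict[str, Optional[str]], keep: int
) -> Set[str]:
    """Return the snapshots whose files must stay on disk.

    order  -- snapshot names, oldest first
    parent -- snapshot -> snapshot its delta is based on (None for a full one)
    keep   -- number of most recent snapshots the policy retains
    """
    retained: Set[str] = set()
    newest = order[-keep:] if keep > 0 else []
    for name in newest:
        # Walk back through the delta chain until a full snapshot is reached.
        current: Optional[str] = name
        while current is not None and current not in retained:
            retained.add(current)
            current = parent.get(current)
    return retained


# Dates taken from the report; the parent links are ASSUMED for illustration.
order = ["Jul 3", "Jul 4", "Jul 6", "Jul 30", "Jul 31", "Aug 1",
         "Aug 2", "Aug 3", "Aug 4", "Aug 5", "Aug 6", "Aug 7"]
parent = {
    "Jul 3": None, "Jul 4": None, "Jul 6": None,   # assumed unrelated leftovers
    "Jul 30": None,                                # assumed full snapshot
    "Jul 31": "Jul 30", "Aug 1": "Jul 31", "Aug 2": "Aug 1",
    "Aug 3": "Aug 2", "Aug 4": "Aug 3", "Aug 5": "Aug 4",
    "Aug 6": None,                                 # assumed new full snapshot
    "Aug 7": "Aug 6",
}

retained = snapshots_to_retain(order, parent, keep=3)
leftover = [name for name in order if name not in retained]
```

Under these assumptions, `retained` covers Jul 30 through Aug 7, and `leftover` contains exactly the Jul 3, Jul 4 and Jul 6 files that I would expect to be deleted.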
nxsbi commented 1 month ago

Re-uploading the screenshots, as they did not show up as desired.

I have Keep = 3, so I expect the Aug 6 and Aug 7 files to exist. I also expect the Aug 5 file to exist (Keep = 3), but since the Aug 5 file is part of a chain (judging by file size), all files from Jul 30 to Aug 5 need to be kept. However, the Jul 3, 4 and 6 files should not exist. These files also show as "Ready" in the snapshot_store_ref table.

So the issue is: why were the snapshots for Jul 5 and those between Jul 7 and Jul 29 destroyed, but not the ones for Jul 3, 4 and 6?

PS - I did notice that the created date in snapshot_store_ref shows as Jul 23 for many of the days. Not sure why...

File view: image

snapshot_store_ref view image

shwstppr commented 1 month ago

@nxsbi can you please check if this is not due to https://github.com/apache/cloudstack/issues/9446 and try 4.19.1.1 release?

nxsbi commented 1 month ago

I read through #9446. This is a separate issue: in that one, the whole chain is getting deleted, while in my issue, files are not getting deleted.

nxsbi commented 1 week ago

@shwstppr @DaanHoogland - Is there a way to trigger the garbage collection (to clear up deleted snapshot files) manually? Can you provide info on how to trigger and monitor?

DaanHoogland commented 1 week ago

@nxsbi , I am not sure. Can you delete those snapshots from the UI?

nxsbi commented 1 week ago

@DaanHoogland - The old snapshots don't show up in the web interface. The UI only shows the 3 snapshots (based on the snapshot policy setting). The snapshot_store_ref table shows many old snapshots as "Ready".

To do the cleanup manually, I update the snapshots table for the ones I want to delete by setting status = 'BackedUp', and then delete them from the UI. Even after doing this, snapshot_store_ref still shows them as Ready, so I have to run a manual update on that table as well.
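
To find the affected rows before touching anything, a read-only check along these lines may help. This is a minimal sketch against an in-memory SQLite stand-in, not the real CloudStack database: only the table names `snapshots` and `snapshot_store_ref`, the `status` column and the 'BackedUp' and 'Ready' values come from what I described above. The `id`, `snapshot_id` and `state` column names, the 'Destroyed' value and the helper `stale_ready_refs` are assumptions for illustration, so check them against your own schema.

```python
import sqlite3


def stale_ready_refs(conn: sqlite3.Connection) -> list:
    """List snapshot ids that are still 'Ready' on the store although the
    snapshot row itself is no longer 'BackedUp' (read-only query)."""
    query = """
        SELECT r.snapshot_id
        FROM snapshot_store_ref AS r
        JOIN snapshots AS s ON s.id = r.snapshot_id
        WHERE r.state = 'Ready' AND s.status <> 'BackedUp'
        ORDER BY r.snapshot_id
    """
    return [row[0] for row in conn.execute(query)]


# Toy data with an ASSUMED, simplified schema (not the real CloudStack one).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE snapshots (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE snapshot_store_ref (snapshot_id INTEGER, state TEXT);
    INSERT INTO snapshots VALUES (1, 'Destroyed'), (2, 'Destroyed'), (3, 'BackedUp');
    INSERT INTO snapshot_store_ref VALUES (1, 'Ready'), (2, 'Destroyed'), (3, 'Ready');
""")

stale = stale_ready_refs(conn)
```

In this toy data only snapshot 1 is reported: it is still 'Ready' in the store table while its snapshot row is no longer 'BackedUp', which is the kind of mismatch I am seeing.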

DaanHoogland commented 1 week ago

@nxsbi , I'm trying to reproduce this and see if there is a bug we can fix or a facility we can build to deal with the fallout. I'll let you know how I fare.