Open stryan opened 2 years ago
Testing on 1.20rc1 shows this issue is mostly resolved. After upgrading both clusters and re-adding the remote, I was able to ship a snapshot to the remote cluster. However, I still ended up with stuck snapshots when there was an issue contacting the remote cluster. I had to restart the controller on the source cluster before I could abort the backup and remove the backup snapshot.
Glad to hear 1.20.0rc1 works better; we reworked the abort mechanism to make it more stable.
It would be helpful to have more details about what exactly happened when you ended up with a backup that could not be aborted. Was the target cluster offline from the beginning, or was there a network hiccup at the start of the backup shipment or during the data transfer?
Ah, actually, looking through my scrollback, the backup attempt that created phantom snapshots was due to me using "backup create" instead of "backup ship" for a Linstor-to-Linstor backup. It might not have been a network issue at all. I'll see if I can recreate it, but otherwise it might just need better safeguards against trying to create a backup on a Linstor cluster. I will admit it's a bit confusing that S3 backups and Linstor backups are both done through the backup command but work differently (i.e. backup list only works on S3 targets).
Hi all, I've set up a Linstor test environment and run into some issues with shipping snapshots to a secondary cluster. For testing I did the following steps:
backup ship remote-ls03 testgroup_vol1 testgroup_vol1_remote
This errored out in a similar fashion to #303: I ended up with backups that I couldn't delete, and running
backup abort
reported success regardless of what happened. I read through #303, re-added the remotes with specified cluster IDs, and tried again with a separate resource, to no avail. Now I have two sets of frozen backups with the snapshots living on one node, ls01. Since I can't delete the snapshots or abort the backup, I ran
node lost ls01
on the source controller, then re-added it. This has led me to the current state, where I now have two backup snapshots not attached to any node. I am unable to either abort the backup or delete the snapshots by hand. Both snapshots only show up on the source cluster and are in state "successful", i.e.:
Is there any way for me to remove these phantom snapshots without having to reinitialize the cluster?
Environment: Ubuntu 22.04, Linstor installed from PPAs
Linstor version: 1.19.1-1ppa1~jammy1
DRBD: 9.1.11
Source cluster: 1 combined node and two satellite nodes, with linstor-gateway running
Remote cluster: 1 combined node