simondeziel closed this issue 2 months ago
In fact, looking at the snap225 section of the backup.yaml (added post-recovery), it seems that during normal operations LXD doesn't save the right expires_at field in the backup.yaml file.
Here's an easy reproducer for the previous comment where I said the backup.yaml didn't contain the expires_at field:
$ lxc launch images:alpine/edge c1 -c snapshots.expiry=1d
$ lxc snapshot c1
$ sudo LD_LIBRARY_PATH=/snap/lxd/current/lib/:/snap/lxd/current/lib/x86_64-linux-gnu/ nsenter --mount=/run/snapd/ns/lxd.mnt sed -n '/^volume_snapshots:$/,$ p' /var/snap/lxd/common/lxd/storage-pools/default/containers/c1/backup.yaml
volume_snapshots:
- name: snap0
  description: ""
  content_type: filesystem
  created_at: 2024-08-23T20:38:04.166424212Z
  expires_at: 0001-01-01T00:00:00Z
  config:
    volatile.uuid: 80b83e4b-482b-49a4-b83e-4e965ce51265
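Both expiry values live in the same backup.yaml, so the mismatch can also be spotted in one pass with grep instead of sed (same nsenter incantation as above; the expected matches are an inference from the file layout shown later in this thread):

$ sudo LD_LIBRARY_PATH=/snap/lxd/current/lib/:/snap/lxd/current/lib/x86_64-linux-gnu/ nsenter --mount=/run/snapd/ns/lxd.mnt grep -n expires_at /var/snap/lxd/common/lxd/storage-pools/default/containers/c1/backup.yaml
# should print two matches: the real expiry under the snapshots: section
# and the zeroed 0001-01-01T00:00:00Z one under volume_snapshots: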
Meanwhile, LXD itself is clearly aware of the snapshot expiry:
$ lxc info c1 | sed -n '/^Snapshots:/,$ p'
Snapshots:
+-------+----------------------+----------------------+----------+
| NAME | TAKEN AT | EXPIRES AT | STATEFUL |
+-------+----------------------+----------------------+----------+
| snap0 | 2024/08/23 16:38 EDT | 2024/08/24 16:38 EDT | NO |
+-------+----------------------+----------------------+----------+
It appears that the snapshot expiry date is correct, which is why LXD is aware of the snapshot (see the snippet below). However, lxd recover uses the volume_snapshots expiry, which is zeroed. Furthermore, volume snapshot expiry is set differently from instance snapshot expiry (lxc storage volume set default container/c1 snapshots.expiry=1d). I'm not sure what the intended behaviour is for volume snapshot expiry dates, i.e. should they match the instance snapshot expiry dates? Or should lxd recover look at the instance snapshot expiry date rather than the volume snapshot expiry date? cc @tomponline
$ sudo LD_LIBRARY_PATH=/snap/lxd/current/lib/:/snap/lxd/current/lib/x86_64-linux-gnu/ nsenter --mount=/run/snapd/ns/lxd.mnt sed -n '/^snapshots:$/,$ p' /var/snap/lxd/common/lxd/storage-pools/default/containers/c1/backup.yaml
snapshots:
- architecture: x86_64
  config:
    image.architecture: amd64
    image.description: Alpine edge amd64 (20240823_0018)
    image.os: Alpine
    image.release: edge
    image.requirements.secureboot: "false"
    image.serial: "20240823_0018"
    image.type: squashfs
    image.variant: default
    snapshots.expiry: 1d
    volatile.base_image: 3aab2d4b12a5bf88b798fe02cf361349cb9cd5648c89789bfac96f1cdce1d32c
    volatile.cloud-init.instance-id: 5dd2a543-b181-4dce-8e0c-4202173421b7
    volatile.eth0.host_name: vethdc246f41
    volatile.eth0.hwaddr: 00:16:3e:9f:79:4b
    volatile.idmap.base: "0"
    volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
    volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
    volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
    volatile.last_state.power: RUNNING
    volatile.uuid: ad4f73ac-50e7-4e92-87a1-82a05e928157
    volatile.uuid.generation: ad4f73ac-50e7-4e92-87a1-82a05e928157
  created_at: 2024-08-23T22:01:05.356755362Z
  expires_at: 2024-08-24T22:01:05.353609739Z
  devices: {}
  ephemeral: false
  expanded_config:
    image.architecture: amd64
    image.description: Alpine edge amd64 (20240823_0018)
    image.os: Alpine
    image.release: edge
    image.requirements.secureboot: "false"
    image.serial: "20240823_0018"
    image.type: squashfs
    image.variant: default
    snapshots.expiry: 1d
    volatile.base_image: 3aab2d4b12a5bf88b798fe02cf361349cb9cd5648c89789bfac96f1cdce1d32c
    volatile.cloud-init.instance-id: 5dd2a543-b181-4dce-8e0c-4202173421b7
    volatile.eth0.host_name: vethdc246f41
    volatile.eth0.hwaddr: 00:16:3e:9f:79:4b
    volatile.idmap.base: "0"
    volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
    volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
    volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
    volatile.last_state.power: RUNNING
    volatile.uuid: ad4f73ac-50e7-4e92-87a1-82a05e928157
    volatile.uuid.generation: ad4f73ac-50e7-4e92-87a1-82a05e928157
  expanded_devices:
    eth0:
      name: eth0
      network: lxdbr1
      type: nic
    root:
      path: /
      pool: default
      type: disk
  last_used_at: 0001-01-01T00:00:00Z
  name: snap0
  profiles:
  - default
  stateful: false
  size: -1
pool:
  name: default
  description: ""
  driver: zfs
  status: Created
  config:
    size: 9GiB
    source: /var/snap/lxd/common/lxd/disks/default.img
    zfs.pool_name: default
  used_by: []
  locations:
  - none
profiles:
- name: default
  description: Default LXD profile
  config: {}
  devices:
    eth0:
      name: eth0
      network: lxdbr1
      type: nic
    root:
      path: /
      pool: default
      type: disk
  used_by: []
volume:
  name: c1
  description: ""
  type: container
  pool: default
  content_type: filesystem
  project: default
  location: none
  created_at: 2024-08-23T22:01:04.354122974Z
  config:
    volatile.uuid: 50aa66e6-dca7-4676-94e5-ee292e098d7f
  used_by: []
volume_snapshots:
- name: snap0
  description: ""
  content_type: filesystem
  created_at: 2024-08-23T22:01:05.356755362Z
  expires_at: 0001-01-01T00:00:00Z
  config:
    volatile.uuid: 69e6db4c-f32c-49ff-b79b-08f388a8fe9c
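To test the volume-level route mentioned above, one could set snapshots.expiry on the volume itself and take a fresh snapshot, then check whether the new volume snapshot picks up a non-zero expires_at (a sketch, not verified here; snap1 assumes the next auto-generated snapshot name):

$ lxc storage volume set default container/c1 snapshots.expiry=1d
$ lxc snapshot c1
$ lxc storage volume get default container/c1/snap1 --property expires_at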
As found by @kadinsayani, the volume snapshot associated with snapshotting an instance has no expiry set:
$ lxc launch images:alpine/edge c1 -c snapshots.expiry=1d
$ lxc snapshot c1
$ lxc storage volume show default container/c1/snap0
name: snap0
description: ""
content_type: filesystem
created_at: 2024-08-26T13:36:28.408175831Z
expires_at: 0001-01-01T00:00:00Z
config:
  volatile.uuid: 705af216-a8cb-4494-b8ca-dda67a8d1dd2
# or more simply
$ lxc storage volume get default container/c1/snap0 --property expires_at
0001-01-01 00:00:00 +0000 UTC
However, the instance's snapshot has one:
$ lxc info c1 | sed -n '/^Snapshots:/,$ p'
Snapshots:
+-------+----------------------+----------------------+----------+
| NAME | TAKEN AT | EXPIRES AT | STATEFUL |
+-------+----------------------+----------------------+----------+
| snap0 | 2024/08/26 09:36 EDT | 2024/08/27 09:36 EDT | NO |
+-------+----------------------+----------------------+----------+
Could it be due to how snapshots are cleaned up? Maybe instance snapshots are cleaned in a different pass than volume ones?
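One way to probe that would be to read both API objects for the same snapshot and then watch which one the pruning task acts on once the expiry passes (a sketch; assumes jq is installed and the c1/snap0 reproducer above):

$ lxc query /1.0/instances/c1/snapshots/snap0 | jq .expires_at
$ lxc query /1.0/storage-pools/default/volumes/container/c1/snapshots/snap0 | jq .expires_at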
I just went through a successful lxd recover which reimported my container and its snapshots from the intact zpool. Snapshots are taken on a schedule:

However, after lxd recover brought those snapshots back, they lost their expires at field:

In the above, snap225 was taken after the lxd recover.

However, the instance's backup.yaml should have had this information, which is where it learned about the taken at field. That said, it seems the recovery has now overwritten the backup.yaml with the bogus values:

Additional information: