paprikkafox opened 1 year ago
Try to put metadata on a separate block device: https://github.com/LINBIT/linstor-server/issues/128
I use something like this:
linstor controller set-property DrbdOptions/auto-quorum disabled
linstor storage-pool create zfs px1 zfs_12 zpool1/proxmox/drbd
linstor storage-pool create zfs px2 zfs_12 zpool1/proxmox/drbd
linstor storage-pool create diskless px3 zfs_12
linstor resource-group create --storage-pool=zfs_12 --place-count=2 zfs_12
linstor volume-group create zfs_12
linstor sp c lvm px1 zfs_12_meta VG1
linstor sp c lvm px2 zfs_12_meta VG1
linstor sp sp px1 zfs_12_meta StorDriver/LvcreateOptions "-m 1 /dev/disk/by-partlabel/LVM_NVME01 /dev/disk/by-partlabel/LVM_NVME02"
linstor sp sp px2 zfs_12_meta StorDriver/LvcreateOptions "-m 1 /dev/disk/by-partlabel/LVM_NVME01 /dev/disk/by-partlabel/LVM_NVME02"
linstor rg sp zfs_12 StorPoolNameDrbdMeta zfs_12_meta
linstor rg sp zfs_12 DrbdMetaType external
linstor rg sp zfs_12 StorDriver/ZfscreateOptions "-o volblocksize=16k"
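To verify the properties actually landed where intended, something like this should do (a quick check, using the same group and pool names as above):
linstor resource-group list-properties zfs_12   # expect StorPoolNameDrbdMeta, DrbdMetaType, ZfscreateOptions
linstor storage-pool list                       # zfs_12_meta should show up on px1 and px2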
I tried, but it doesn't help with either volblocksize=16k or 32k. I should also mention that the pool ssd_zpool1 is a raidz-1 pool (3 disks) with the default ashift=12, created from the Proxmox web GUI.
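For reference, this is how the block sizes in play can be checked directly (the zvol name below is a placeholder for one of the actual volumes):
zpool get ashift ssd_zpool1                  # 12 means 4K sectors
zfs get volblocksize ssd_zpool1/<some-zvol>  # per-zvol, fixed at creation time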
I had some ideas about why this happened in the first place, but when checking whether that was the case, I was not able to reproduce it. My best guess is that PVE is no longer that strict about sizes as long as things fit? I saw these at the end of the restore:
VM 103 (scsi0): size of disk 'ontwodistinct:pm-634e6c6c_103' updated from 1G to 1052408K
VM 103 (efidisk0): size of disk 'ontwodistinct:pm-6fbd0488_103' updated from 8152K to 5M
VM 103 (tpmstate0): size of disk 'ontwodistinct:pm-5bf812b9_103' updated from 4M to 8152K
As this issue is already pretty old and I was no longer able to reproduce it, I'm closing this. If this is still an issue with the latest LINSTOR and the latest linstor-proxmox plugin, feel free to re-open.
I am getting the same issue. This is on my reference infrastructure build, so nothing exotic has been done with it; it's pretty vanilla. After reading the last comment I upgraded everything in the cluster to today's latest packages and tried again; same result.
Here are the versions of everything:
drbd-dkms 9.2.9-1
drbd-reactor 1.4.1-1
drbd-utils 9.28.0-1
linstor-client 1.22.1-1
linstor-common 1.27.1-1
linstor-controller 1.27.1-1
linstor-proxmox 8.0.2-1
linstor-satellite 1.27.1-1
proxmox-archive-keyring 3.0
proxmox-backup-client 3.2.2-1
proxmox-backup-file-restore 3.2.2-1
proxmox-backup-restore-image 0.6.1
proxmox-default-headers 1.0.1
proxmox-default-kernel 1.0.1
proxmox-headers-6.5 6.5.13-5
proxmox-headers-6.5.13-5-pve 6.5.13-5
proxmox-kernel-6.5 6.5.13-5
proxmox-kernel-6.5.13-5-pve-signed 6.5.13-5
proxmox-kernel-helper 8.1.0
proxmox-mail-forward 0.2.3
proxmox-mini-journalreader 1.4.0
proxmox-offline-mirror-docs 0.6.6
proxmox-offline-mirror-helper 0.6.6
proxmox-termproxy 1.0.1
proxmox-ve 8.2.0
proxmox-websocket-tunnel 0.2.0-1
proxmox-widget-toolkit 4.2.3
@rck Are you restoring to a Linstor storage target when you get these "size ... updated" messages? I see those only when I restore (successfully) to an LVM target. A Linstor target always fails.
Here is the output of a failed restore:
restore vma archive: vma extract -v -r /var/tmp/vzdumptmp31039.fifo /var/lib/vz/dump/vzdump-qemu-104-2024_05_20-18_11_48.vma /var/tmp/vzdumptmp31039
CFG: size: 1180 name: qemu-server.conf
DEV: dev_id=1 size: 540672 devname: drive-efidisk0
DEV: dev_id=2 size: 21474844672 devname: drive-scsi0
CTIME: Mon May 20 18:11:49 2024
NOTICE
Trying to create diskful resource (pm-46e8e863) on (pve2).
new volume ID is 'essd1-r2:pm-46e8e863_108'
NOTICE
Trying to create diskful resource (pm-671cd0a6) on (pve2).
new volume ID is 'essd1-r2:pm-671cd0a6_108'
map 'drive-efidisk0' to '/dev/drbd/by-res/pm-46e8e863/0' (write zeros = 1)
map 'drive-scsi0' to '/dev/drbd/by-res/pm-671cd0a6/0' (write zeros = 1)
vma: vma_reader_register_bs for stream drive-efidisk0 failed - unexpected size 5251072 != 540672
temporary volume 'essd1-r2:pm-671cd0a6_108' sucessfuly removed
temporary volume 'essd1-r2:pm-46e8e863_108' sucessfuly removed
no lock found trying to remove 'create' lock
error before or during data restore, some or all disks were not completely restored. VM 108 state is NOT cleaned up.
TASK ERROR: command 'set -o pipefail && vma extract -v -r /var/tmp/vzdumptmp31039.fifo /var/lib/vz/dump/vzdump-qemu-104-2024_05_20-18_11_48.vma /var/tmp/vzdumptmp31039' failed: got signal 5
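While the temporary resource still exists, the mismatch can be confirmed directly on the device (resource name taken from the log above):
blockdev --getsize64 /dev/drbd/by-res/pm-46e8e863/0   # actual device size in bytes; the archive header above expects 540672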
And a successful restore of the same backup to an LVM target:
restore vma archive: vma extract -v -r /var/tmp/vzdumptmp35053.fifo /var/lib/vz/dump/vzdump-qemu-104-2024_05_20-18_11_48.vma /var/tmp/vzdumptmp35053
CFG: size: 1180 name: qemu-server.conf
DEV: dev_id=1 size: 540672 devname: drive-efidisk0
DEV: dev_id=2 size: 21474844672 devname: drive-scsi0
CTIME: Mon May 20 18:11:49 2024
Rounding up size to full physical extent 4.00 MiB
Logical volume "vm-109-disk-0" created.
new volume ID is 'local-lvm:vm-109-disk-0'
Rounding up size to full physical extent 20.00 GiB
Logical volume "vm-109-disk-1" created.
new volume ID is 'local-lvm:vm-109-disk-1'
map 'drive-efidisk0' to '/dev/pve/vm-109-disk-0' (write zeros = 0)
map 'drive-scsi0' to '/dev/pve/vm-109-disk-1' (write zeros = 0)
progress 1% (read 214761472 bytes, duration 0 sec)
progress 2% (read 429522944 bytes, duration 1 sec)
.......
progress 99% (read 21260664832 bytes, duration 5 sec)
progress 100% (read 21475360768 bytes, duration 5 sec)
total bytes read 21475491840, sparse bytes 15185952768 (70.7%)
space reduction due to 4K zero blocks 3.16%
rescan volumes...
VM 109 (scsi0): size of disk 'local-lvm:vm-109-disk-1' updated from 20971528K to 20484M
VM 109 (efidisk0): size of disk 'local-lvm:vm-109-disk-0' updated from 528K to 4M
TASK OK
Update - I can reproduce this issue when restoring from a backup stored on a node's local storage, but it works fine when restoring backups stored on Proxmox Backup Server to a Linstor target.
When restoring from PBS I do see the messages @rck noted:
VM 108 (efidisk0): size of disk 'essd1-r2:pm-2a7cd496_108' updated from 528K to 5128K
@arcandspark
Are you restoring to a Linstor storage target when you get these "size ... updated" messages?
Yes, in my case that was a DRBD/LINSTOR disk where the backing storage was LVM, backed up to local LVM, restored to DRBD/LINSTOR with LVM as backing disks.
What type of storage (pool) do you use for a) the VM (ZFS or LVM?) and b) the backup (LVM from what I saw)? Is there some LVM vs. ZFS interaction going on?
In my case it is a DRBD/LINSTOR disk backed by ZFS, backed up to a local directory, restored to DRBD/LINSTOR with ZFS backing disks.
Given the storage config below, the VM is backed up from essd1-r2 to local, then the restore attempt is from local to essd1-r2.
root@pve1:~# zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
cssd1 928G 14.3G 914G - - 0% 1% 1.00x ONLINE -
essd1 744G 18.1G 726G - - 3% 2% 1.00x ONLINE -
root@pve1:~# zfs list
NAME USED AVAIL REFER MOUNTPOINT
cssd1 14.3G 885G 96K /cssd1
cssd1/pm-2b85e5e6_00000 1.76G 885G 1.76G -
cssd1/pm-516349e2_00000 12.5G 885G 12.5G -
essd1 18.1G 703G 24K /essd1
essd1/pm-2bd8d383_00000 2.41G 703G 2.41G -
essd1/pm-444ddb3a_00000 2.41G 703G 2.41G -
essd1/pm-4cd625c1_00000 33.5K 703G 33.5K -
essd1/pm-72c37a62_00000 2.38G 703G 2.38G -
essd1/pm-7f2ec680_00000 3.02G 703G 3.02G -
essd1/pm-91d8ee35_00000 2.40G 703G 2.40G -
essd1/pm-a0a0ed58_00000 1.82G 703G 1.82G -
essd1/pm-a75b98dc_00000 31.5K 703G 31.5K -
essd1/pm-b9f1a565_00000 33K 703G 33K -
essd1/pm-c29564a5_00000 33.5K 703G 33.5K -
essd1/pm-c4666574_00000 32.5K 703G 32.5K -
essd1/pm-c7f922a5_00000 33.5K 703G 33.5K -
essd1/pm-e6b468ee_00000 30K 703G 30K -
essd1/pm-f0b4a3b2_00000 3.58G 703G 3.58G -
essd1/vm-100-cloudinit_00000 17K 703G 17K -
essd1/vm-101-cloudinit_00000 17K 703G 17K -
essd1/vm-103-cloudinit_00000 17K 703G 17K -
root@pve1:~# cat /etc/pve/storage.cfg
dir: local
path /var/lib/vz
content backup,vztmpl,iso
lvmthin: local-lvm
thinpool data
vgname pve
content images,rootdir
drbd: essd1-r2
resourcegroup essd1-r2
apica /etc/linstor/ssl/controller-api.pem
apicrt /etc/linstor/ssl/client-cert.pem
apikey /etc/linstor/ssl/client-key.pem
content rootdir,images
controller pve1,pve2,pve3
drbd: cssd1-r2
resourcegroup cssd1-r2
apica /etc/linstor/ssl/controller-api.pem
apicrt /etc/linstor/ssl/client-cert.pem
apikey /etc/linstor/ssl/client-key.pem
content images,rootdir
controller pve1,pve2,pve3
pbs: backup1
datastore backup1
server pbs.dev
content backup
fingerprint EC:7C:89:DC:2D:B1:02:D2:93:64:12:DB:F3:7B:DA:90:17:BD:9A:37:47:5B:22:B2:CE:2A:B2:50:89:68:2A:8E
username root@pam!pve
root@pve1:~# linstor sp list
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool ┊ Node ┊ Driver ┊ PoolName ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ pve1 ┊ DISKLESS ┊ ┊ ┊ ┊ False ┊ Ok ┊ pve1;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ pve2 ┊ DISKLESS ┊ ┊ ┊ ┊ False ┊ Ok ┊ pve2;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ pve3 ┊ DISKLESS ┊ ┊ ┊ ┊ False ┊ Ok ┊ pve3;DfltDisklessStorPool ┊
┊ cssd1 ┊ pve1 ┊ ZFS_THIN ┊ cssd1 ┊ 885.00 GiB ┊ 928 GiB ┊ True ┊ Ok ┊ pve1;cssd1 ┊
┊ cssd1 ┊ pve2 ┊ ZFS_THIN ┊ cssd1 ┊ 897.49 GiB ┊ 928 GiB ┊ True ┊ Ok ┊ pve2;cssd1 ┊
┊ cssd1 ┊ pve3 ┊ ZFS_THIN ┊ cssd1 ┊ 886.75 GiB ┊ 928 GiB ┊ True ┊ Ok ┊ pve3;cssd1 ┊
┊ data ┊ pve1 ┊ LVM_THIN ┊ pve/data ┊ 611.67 GiB ┊ 611.73 GiB ┊ True ┊ Ok ┊ pve1;data ┊
┊ data ┊ pve2 ┊ LVM_THIN ┊ pve/data ┊ 141.17 GiB ┊ 141.23 GiB ┊ True ┊ Ok ┊ pve2;data ┊
┊ data ┊ pve3 ┊ LVM_THIN ┊ pve/data ┊ 141.23 GiB ┊ 141.23 GiB ┊ True ┊ Ok ┊ pve3;data ┊
┊ essd1 ┊ pve1 ┊ ZFS_THIN ┊ essd1 ┊ 702.78 GiB ┊ 744 GiB ┊ True ┊ Ok ┊ pve1;essd1 ┊
┊ essd1 ┊ pve2 ┊ ZFS_THIN ┊ essd1 ┊ 710.32 GiB ┊ 744 GiB ┊ True ┊ Ok ┊ pve2;essd1 ┊
┊ essd1 ┊ pve3 ┊ ZFS_THIN ┊ essd1 ┊ 710.36 GiB ┊ 744 GiB ┊ True ┊ Ok ┊ pve3;essd1 ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Thank you for the very detailed and helpful logs, and sorry it took a bit longer to respond... I have an idea and will try to reproduce it in my dev env.
"unfortunately" I still can not reproduce this. First I thought it might be a (block)size issue between zfs and lvm, but backup+restore worked as expected. then I thought it might be the EFI disk, but still:
progress 99% (read 2126053376 bytes, duration 15 sec)
progress 100% (read 2147483648 bytes, duration 15 sec)
total bytes read 2147549184, sparse bytes 2060103680 (95.9%)
space reduction due to 4K zero blocks 2.22%
rescan volumes...
VM 105 (efidisk0): size of disk 'tank:pm-906af5bd_105' updated from 5128K to 5M
TASK OK
@arcandspark can you reproduce it with a fresh dummy VM, no EFI disk, no "funny things" like snapshots or resizing? Create, backup, restore. Still failing?
I prepared two identically installed Debian 12 VMs, except one is a Q35/SeaBIOS VM and the other is a Q35/OVMF EFI VM. Each VM has a single disk on the Linstor pool (essd1-r2, backed by ZFS). The EFI VM also has its EFI disk on the Linstor pool (essd1-r2). I created a backup of each to the local LVM storage of the same node. I then restored each on that node from the backup.
The BIOS VM succeeded, the EFI VM failed when creating the EFI disk:
BIOS VM Restore Task:
restore vma archive: vma extract -v -r /var/tmp/vzdumptmp1726470.fifo /var/lib/vz/dump/vzdump-qemu-106-2024_06_06-14_02_16.vma /var/tmp/vzdumptmp1726470
CFG: size: 524 name: qemu-server.conf
DEV: dev_id=1 size: 21474844672 devname: drive-scsi0
CTIME: Thu Jun 6 14:02:21 2024
NOTICE
Trying to create diskful resource (pm-fa737626) on (pve1).
new volume ID is 'essd1-r2:pm-fa737626_106'
map 'drive-scsi0' to '/dev/drbd/by-res/pm-fa737626/0' (write zeros = 1)
progress 1% (read 214761472 bytes, duration 1 sec)
... ... ...
progress 100% (read 21474836480 bytes, duration 104 sec)
total bytes read 21474902016, sparse bytes 17488576512 (81.4%)
space reduction due to 4K zero blocks 3.94%
rescan volumes...
VM 106 (scsi0): size of disk 'essd1-r2:pm-fa737626_106' updated from 20G to 20971528K
TASK OK
EFI VM Restore Task:
restore vma archive: vma extract -v -r /var/tmp/vzdumptmp1729754.fifo /var/lib/vz/dump/vzdump-qemu-108-2024_06_11-14_16_32.vma /var/tmp/vzdumptmp1729754
CFG: size: 625 name: qemu-server.conf
DEV: dev_id=1 size: 540672 devname: drive-efidisk0
DEV: dev_id=2 size: 21474844672 devname: drive-scsi0
CTIME: Tue Jun 11 14:16:35 2024
NOTICE
Trying to create diskful resource (pm-e5762572) on (pve1).
new volume ID is 'essd1-r2:pm-e5762572_109'
NOTICE
Trying to create diskful resource (pm-2aaf4174) on (pve1).
new volume ID is 'essd1-r2:pm-2aaf4174_109'
map 'drive-efidisk0' to '/dev/drbd/by-res/pm-e5762572/0' (write zeros = 1)
map 'drive-scsi0' to '/dev/drbd/by-res/pm-2aaf4174/0' (write zeros = 1)
vma: vma_reader_register_bs for stream drive-efidisk0 failed - unexpected size 5251072 != 540672
temporary volume 'essd1-r2:pm-e5762572_109' sucessfuly removed
temporary volume 'essd1-r2:pm-2aaf4174_109' sucessfuly removed
no lock found trying to remove 'create' lock
error before or during data restore, some or all disks were not completely restored. VM 109 state is NOT cleaned up.
TASK ERROR: command 'set -o pipefail && vma extract -v -r /var/tmp/vzdumptmp1729754.fifo /var/lib/vz/dump/vzdump-qemu-108-2024_06_11-14_16_32.vma /var/tmp/vzdumptmp1729754' failed: got signal 5
Once more, thanks for the detailed info. I always tested with Alpine images, and I did check the EFI disk box... whatever did the trick, Debian or Q35, I can now reproduce it. The problem is:
DEV: dev_id=1 size: 540672 devname: drive-efidisk0
These are bytes, so that makes 528K.
vma: vma_reader_register_bs for stream drive-efidisk0 failed - unexpected size 5251072 != 540672
5251072 bytes is just over 5M (5128K). DRBD devices have a lower limit, and 5M looked like a good lower limit to me. Then add the usual rounding from LINSTOR, different block sizes, and obscure vma behavior, and you are there. It looks like this strict size check only triggers when the disk has a certain size: if I use a lower minimum size (3M instead of 5M), things work. I will have to think about a proper fix, but the problem is now fully understood. Thanks.
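Spelling out the arithmetic above (my own restating of the numbers, not plugin code):
# 540672 B  = 528K   -> EFI disk size recorded in the VMA archive
# 5242880 B = 5120K  -> the 5M lower limit for DRBD devices
# 5251072 B = 5128K  -> 5M plus 8K of rounding, the size vma actually sees
echo $((540672 / 1024))K $((5251072 / 1024))K   # 528K 5128K -> strict check fails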
This should be fixed in https://github.com/LINBIT/linstor-proxmox/commit/2dfcc496de4316ee89248edc91d7b0e18faec3ec. @arcandspark, can you confirm this fixes the issue for you? Just replace the file/the line and maybe systemctl restart pvedaemon.
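For anyone wanting to test the fix before a package release, one possible way (path assumed from a standard linstor-proxmox install; keep a backup of the file):
cd /usr/share/perl5/PVE/Storage/Custom
cp LINSTORPlugin.pm LINSTORPlugin.pm.bak
# apply the one-line change from the commit above, then:
systemctl restart pvedaemon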
@arcandspark did you have a chance to test the proposed fix?
Environment:
Software Versions:
Proxmox Plugin config:
Problem:
When I try to create and then restore a backup of a VM with TPM 2.0 and EFI storage enabled, I get an error about different disk sizes for the EFI disk (used to store EFI vars):
vma: vma_reader_register_bs for stream drive-efidisk0 failed - unexpected size 5242880 != 540672
Full restore log:
I think the problem is somehow tied to how ZFS thin provisioning and the related functionality in LINSTOR work together. Please tell me if I'm doing something wrong.