LINBIT / linstor-proxmox

Integration plugin bridging LINSTOR to Proxmox VE

Cannot create VM with EFI disk #43

Closed: Foxi352 closed this issue 3 years ago

Foxi352 commented 3 years ago

I am new to LINSTOR and playing with it in my home lab. I was moving all VMs from the old storage to the new LINSTOR storage. While doing this I noticed that every VM with a UEFI BIOS failed to move, always while moving the EFI disk; BIOS VMs were fine. So I deleted the EFI disk and moved the VM. That worked, but I was then unable to add a new EFI disk. Moving the VM to local storage made it possible again to add a new EFI disk and boot the machine.

For testing I then simply tried to create a new UEFI Linux VM on DRBD storage. This also failed as soon as it tried to create the EFI disk. So it seems the linstor-proxmox plugin cannot create or move an EFI disk to DRBD storage.
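For reference, this is roughly the CLI equivalent of what I did in the GUI (VM 106 is my test VM, hdd_group the LINSTOR storage; the size after the colon is effectively ignored for efidisk0, since Proxmox sizes the EFI vars disk itself):

# Add an EFI disk on the DRBD-backed storage; this is the step that fails
qm set 106 --efidisk0 hdd_group:1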

Error: Verification of resource file failed
Details: The error reported by the runtime environment or operating system is: The external command 'drbdadm' exited with error code 10

Proxmox: 6.4-9
DRBD: version: 9.1.2 (api:2/proto:110-120)

LINSTOR packages:

ii  linstor-client                       1.8.0-1                       all          Linstor client command line tool
ii  linstor-common                       1.13.0-1                      all          DRBD distributed resource management utility
ii  linstor-controller                   1.13.0-1                      all          DRBD distributed resource management utility
ii  linstor-proxmox                      5.1.6-1                       all          DRBD distributed resource management utility
ii  linstor-satellite                    1.13.0-1                      all          DRBD distributed resource management utility
ii  python-linstor                       1.8.0-1                       all          Linstor python api library

Full error message:

TASK ERROR: unable to create VM 106 - API Return-Code: 500. Message: Could not create resource definition vm-106-disk-1 from resource group hdd_group, because: [{"ret_code":19922945,"message":"Volume definition with number '0' successfully  created in resource definition 'vm-106-disk-1'.","obj_refs":{"RscGrp":"hdd_group","RscDfn":"vm-106-disk-1","VlmNr":"0"}},{"ret_code":20447233,"message":"New resource definition 'vm-106-disk-1' created.","details":"Resource definition 'vm-106-disk-1' UUID is: 39d65374-8bf8-49fa-b790-7a2ed795da06","obj_refs":{"RscGrp":"hdd_group","UUID":"39d65374-8bf8-49fa-b790-7a2ed795da06","RscDfn":"vm-106-disk-1"}},{"ret_code":20185089,"message":"Successfully set property key(s): StorPoolName","obj_refs":{"RscDfn":"vm-106-disk-1","RscGrp":"hdd_group"}},{"ret_code":20185089,"message":"Successfully set property key(s): StorPoolName","obj_refs":{"RscDfn":"vm-106-disk-1","RscGrp":"hdd_group"}},{"ret_code":21233665,"message":"Resource 'vm-106-disk-1' successfully autoplaced on 1 nodes","details":"Used nodes (storage pool name): 'pve2 (hdd_pool)'","obj_refs":{"RscDfn":"vm-106-disk-1","RscGrp":"hdd_group"}},{"ret_code":4611686018448621568,"message":"Updated DRBD auto verify algorithm to 'crct10dif-pclmul'","obj_refs":{"RscDfn":"vm-106-disk-1","RscGrp":"hdd_group"}},{"ret_code":21233667,"message":"Created resource 'vm-106-disk-1' on 'pve'","obj_refs":{"RscDfn":"vm-106-disk-1","RscGrp":"hdd_group"}},{"ret_code":-4611686018406153242,"message":"(Node: 'pve2') Generated resource file for resource 'vm-106-disk-1' is invalid.","cause":"Verification of resource file failed","details":"The error reported by the runtime environment or operating system is:\nThe external command 'drbdadm' exited with error code 10\n","error_report_ids":["60D3128F-F21CA-000074"],"obj_refs":{"RscDfn":"vm-106-disk-1","RscGrp":"hdd_group"}}]  at /usr/share/perl5/PVE/Storage/Custom/LINSTORPlugin.pm line 300.   
PVE::Storage::Custom::LINSTORPlugin::alloc_image("PVE::Storage::Custom::LINSTORPlugin", "hdd_group", HASH(0x5565a110e458), 106, "raw", undef, 33554432) called at /usr/share/perl5/PVE/Storage.pm line 896  eval {...} called at /usr/share/perl5/PVE/Storage.pm line 896   PVE::Storage::__ANON__() called at /usr/share/perl5/PVE/Cluster.pm line 621     eval {...} called at /usr/share/perl5/PVE/Cluster.pm line 587   PVE::Cluster::__ANON__("storage-hdd_group", undef, CODE(0x5565a110e970)) called at /usr/share/perl5/PVE/Cluster.pm line 666     PVE::Cluster::cfs_lock_storage("hdd_group", undef, CODE(0x5565a110e970)) called at /usr/share/perl5/PVE/Storage/Plugin.pm line 478  PVE::Storage::Plugin::cluster_lock_storage("PVE::Storage::Custom::LINSTORPlugin", "hdd_group", 1, undef, CODE(0x5565a110e970)) called at /usr/share/perl5/PVE/Storage.pm line 901   PVE::Storage::vdisk_alloc(HASH(0x5565a11574c0), "hdd_group", 106, "raw", undef, 33554432) called at /usr/share/perl5/PVE/API2/Qemu.pm line 188  PVE::API2::Qemu::__ANON__("scsi0", HASH(0x5565a116bb20)) called at /usr/share/perl5/PVE/AbstractConfig.pm line 475  PVE::AbstractConfig::foreach_volume_full("PVE::QemuConfig", HASH(0x5565a1197c50), undef, CODE(0x5565a0890fd8)) called at /usr/share/perl5/PVE/AbstractConfig.pm line 484    PVE::AbstractConfig::foreach_volume("PVE::QemuConfig", HASH(0x5565a1197c50), CODE(0x5565a0890fd8)) called at /usr/share/perl5/PVE/API2/Qemu.pm line 221     eval {...} called at /usr/share/perl5/PVE/API2/Qemu.pm line 221     PVE::API2::Qemu::__ANON__(PVE::RPCEnvironment=HASH(0x556599e84710), "root\@pam", HASH(0x5565a1197c50), "x86_64", HASH(0x5565a11574c0), 106, undef, HASH(0x5565a1197c50), ...) called at /usr/share/perl5/PVE/API2/Qemu.pm line 707  eval {...} called at /usr/share/perl5/PVE/API2/Qemu.pm line 706     PVE::API2::Qemu::__ANON__() called at /usr/share/perl5/PVE/AbstractConfig.pm line 299   PVE::AbstractConfig::__ANON__() called at /usr/share/perl5/PVE/Tools.pm line 220    eval {...} called at /usr/share/perl5/PVE/Tools.pm line 220     PVE::Tools::lock_file_full("/var/lock/qemu-server/lock-106.conf", 1, 0, CODE(0x556599e84470)) called at /usr/share/perl5/PVE/AbstractConfig.pm line 302     PVE::AbstractConfig::__ANON__("PVE::QemuConfig", 106, 1, 0, CODE(0x55659a8ad5f0)) called at /usr/share/perl5/PVE/AbstractConfig.pm line 322     PVE::AbstractConfig::lock_config_full("PVE::QemuConfig", 106, 1, CODE(0x55659a8ad5f0)) called at /usr/share/perl5/PVE/API2/Qemu.pm line 747     PVE::API2::Qemu::__ANON__() called at /usr/share/perl5/PVE/API2/Qemu.pm line 777    eval {...} called at /usr/share/perl5/PVE/API2/Qemu.pm line 777     PVE::API2::Qemu::__ANON__("UPID:pve2:0000996A:05114240:60E00B74:qmcreate:106:root\@pam:") called at /usr/share/perl5/PVE/RESTEnvironment.pm line 615    eval {...} called at /usr/share/perl5/PVE/RESTEnvironment.pm line 606   PVE::RESTEnvironment::fork_worker(PVE::RPCEnvironment=HASH(0x556599e84710), "qmcreate", 106, "root\@pam", CODE(0x5565a119e778)) called at /usr/share/perl5/PVE/API2/Qemu.pm line 789    PVE::API2::Qemu::__ANON__(HASH(0x5565a1197c50)) called at /usr/share/perl5/PVE/RESTHandler.pm line 452  PVE::RESTHandler::handle("PVE::API2::Qemu", HASH(0x55659ece9e80), HASH(0x5565a1197c50)) called at /usr/share/perl5/PVE/HTTPServer.pm line 178   eval {...} called at /usr/share/perl5/PVE/HTTPServer.pm line 139    PVE::HTTPServer::rest_handler(PVE::HTTPServer=HASH(0x556599e847d0), "::ffff:192.168.9.1", "POST", "/nodes/pve2/qemu", HASH(0x5565a110f3c0), HASH(0x5565a11a1ec0), "extjs") called at 
/usr/share/perl5/PVE/APIServer/AnyEvent.pm line 877    eval {...} called at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 851    PVE::APIServer::AnyEvent::handle_api2_request(PVE::HTTPServer=HASH(0x556599e847d0), HASH(0x5565a110f0a8), HASH(0x5565a110f3c0), "POST", "/api2/extjs/nodes/pve2/qemu") called at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1101   eval {...} called at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1093   PVE::APIServer::AnyEvent::handle_request(PVE::HTTPServer=HASH(0x556599e847d0), HASH(0x5565a110f0a8), HASH(0x5565a110f3c0), "POST", "/api2/extjs/nodes/pve2/qemu") called at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1500    PVE::APIServer::AnyEvent::__ANON__(AnyEvent::Handle=HASH(0x5565a0891560), "memory=2048&ostype=l26&scsihw=virtio-scsi-pci&scsi0=hdd_group"...) called at /usr/lib/x86_64-linux-gnu/perl5/5.28/AnyEvent/Handle.pm line 1505   AnyEvent::Handle::__ANON__(AnyEvent::Handle=HASH(0x5565a0891560)) called at /usr/lib/x86_64-linux-gnu/perl5/5.28/AnyEvent/Handle.pm line 1315   AnyEvent::Handle::_drain_rbuf(AnyEvent::Handle=HASH(0x5565a0891560)) called at /usr/lib/x86_64-linux-gnu/perl5/5.28/AnyEvent/Handle.pm line 2015    AnyEvent::Handle::__ANON__() called at /usr/lib/x86_64-linux-gnu/perl5/5.28/AnyEvent/Loop.pm line 248   AnyEvent::Loop::one_event() called at /usr/lib/x86_64-linux-gnu/perl5/5.28/AnyEvent/Impl/Perl.pm line 46    AnyEvent::CondVar::Base::_wait(AnyEvent::CondVar=HASH(0x5565a08c0db8)) called at /usr/lib/x86_64-linux-gnu/perl5/5.28/AnyEvent.pm line 2026     AnyEvent::CondVar::Base::recv(AnyEvent::CondVar=HASH(0x5565a08c0db8)) called at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1814    PVE::APIServer::AnyEvent::run(PVE::HTTPServer=HASH(0x556599e847d0)) called at /usr/share/perl5/PVE/Service/pvedaemon.pm line 52     PVE::Service::pvedaemon::run(PVE::Service::pvedaemon=HASH(0x5565a10f61d0)) called at /usr/share/perl5/PVE/Daemon.pm line 171    eval {...} called at /usr/share/perl5/PVE/Daemon.pm line 171    PVE::Daemon::__ANON__(PVE::Service::pvedaemon=HASH(0x5565a10f61d0)) called at /usr/share/perl5/PVE/Daemon.pm line 391   eval {...} called at /usr/share/perl5/PVE/Daemon.pm line 380    PVE::Daemon::__ANON__(PVE::Service::pvedaemon=HASH(0x5565a10f61d0), undef) called at /usr/share/perl5/PVE/Daemon.pm line 552    eval {...} called at /usr/share/perl5/PVE/Daemon.pm line 550    PVE::Daemon::start(PVE::Service::pvedaemon=HASH(0x5565a10f61d0), undef) called at /usr/share/perl5/PVE/Daemon.pm line 661   PVE::Daemon::__ANON__(HASH(0x556599e7a030)) called at /usr/share/perl5/PVE/RESTHandler.pm line 452  PVE::RESTHandler::handle("PVE::Service::pvedaemon", HASH(0x5565a10f6518), HASH(0x556599e7a030), 1) called at /usr/share/perl5/PVE/RESTHandler.pm line 864   eval {...} called at /usr/share/perl5/PVE/RESTHandler.pm line 847   PVE::RESTHandler::cli_handler("PVE::Service::pvedaemon", "pvedaemon start", "start", ARRAY(0x55659a1af110), ARRAY(0x5565a10f6bc0), undef, undef, undef) called at /usr/share/perl5/PVE/CLIHandler.pm line 591   PVE::CLIHandler::__ANON__(ARRAY(0x556599e7a258), CODE(0x55659a1f7ad0), undef) called at /usr/share/perl5/PVE/CLIHandler.pm line 668     PVE::CLIHandler::run_cli_handler("PVE::Service::pvedaemon", "prepare", CODE(0x55659a1f7ad0)) called at /usr/bin/pvedaemon line 27
acidrop commented 3 years ago

@Foxi352 What type of storage backend are you using for LINSTOR, LVM or ZFS? I managed to reproduce the same issue, but only when using ZFS as a backend. When using LVM (thin), the EFI disk is created correctly.

Can you also post the corresponding LINSTOR error report (linstor err s <report_id>)?
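The report IDs show up in the task error output (error_report_ids); the full form of the commands would be something like this, using the ID from your task error as an example:

# List recent error reports known to the controller
linstor error-reports list

# Show a specific report; short form is 'linstor err s <id>'
linstor error-reports show 60D3128F-F21CA-000074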

Here's what I get on mine...

Caused by:
==========

Description:
    Execution of the external command 'drbdadm' failed.
Cause:
    The external command exited with error code 1.
Correction:
    - Check whether the external program is operating properly.
    - Check whether the command line is correct.
      Contact a system administrator or a developer if the command line is no longer valid
      for the installed version of the external program.
Additional information:
    The full command line executed was:
    drbdadm -vvv adjust vm-107-disk-1

    The external command sent the following output data:
    drbdmeta 1005 v09 /dev/zvol/drbdpool/vm-107-disk-1_00000 internal apply-al
    drbdsetup attach 1005 /dev/zvol/drbdpool/vm-107-disk-1_00000 /dev/zvol/drbdpool/vm-107-disk-1_00000 internal --al-extents=1024 --discard-zeroes-if-aligned=yes --read-balancing=round-robin --rs-discard-granularity=8192

    The external command sent the following error information:
     [ne] minor 1005 (vol:0) /dev/zvol/drbdpool/vm-107-disk-1_00000 missing from kernel
    1005: Failure: (112) Meta device too small.
    Command 'drbdsetup attach 1005 /dev/zvol/drbdpool/vm-107-disk-1_00000 /dev/zvol/drbdpool/vm-107-disk-1_00000 internal --al-extents=1024 --discard-zeroes-if-aligned=yes --read-balancing=round-robin --rs-discard-granularity=8192' terminated with exit code 10
Foxi352 commented 3 years ago

It is indeed a ZFS thin pool on a RAID-Z1.

hdd_pool ┊ pve2 ┊ ZFS_THIN ┊ hdd ┊ 1.91 TiB ┊ 3.62 TiB ┊ True ┊ Ok

ERROR REPORT 60D3128F-F21CA-000082

============================================================

Application:                        LINBIT® LINSTOR
Module:                             Satellite
Version:                            1.13.0
Build ID:                           37c02e20aa52f26ef28ce4464925d9e53327171c
Build time:                         2021-06-21T06:45:49+00:00
Error time:                         2021-07-03 10:51:26
Node:                               pve2

============================================================

Reported error:
===============

Description:
    Operations on resource 'vm-106-disk-1' were aborted
Cause:
    Verification of resource file failed
Additional information:
    The error reported by the runtime environment or operating system is:
    The external command 'drbdadm' exited with error code 10

Category:                           LinStorException
Class name:                         StorageException
Class canonical name:               com.linbit.linstor.storage.StorageException
Generated at:                       Method 'regenerateResFile', Source file 'DrbdLayer.java', Line #1458

Error message:                      Generated resource file for resource 'vm-106-disk-1' is invalid.

Error context:
    An error occurred while processing resource 'Node: 'pve2', Rsc: 'vm-106-disk-1''

Call backtrace:

    Method                                   Native Class:Line number
    regenerateResFile                        N      com.linbit.linstor.layer.drbd.DrbdLayer:1458
    adjustDrbd                               N      com.linbit.linstor.layer.drbd.DrbdLayer:630
    process                                  N      com.linbit.linstor.layer.drbd.DrbdLayer:389
    process                                  N      com.linbit.linstor.core.devmgr.DeviceHandlerImpl:815
    processResourcesAndSnapshots             N      com.linbit.linstor.core.devmgr.DeviceHandlerImpl:355
    dispatchResources                        N      com.linbit.linstor.core.devmgr.DeviceHandlerImpl:165
    dispatchResources                        N      com.linbit.linstor.core.devmgr.DeviceManagerImpl:297
    phaseDispatchDeviceHandlers              N      com.linbit.linstor.core.devmgr.DeviceManagerImpl:1035
    devMgrLoop                               N      com.linbit.linstor.core.devmgr.DeviceManagerImpl:702
    run                                      N      com.linbit.linstor.core.devmgr.DeviceManagerImpl:599
    run                                      N      java.lang.Thread:829

Caused by:
==========

Description:
    Execution of the external command 'drbdadm' failed.
Cause:
    The external command exited with error code 10.
Correction:
    - Check whether the external program is operating properly.
    - Check whether the command line is correct.
      Contact a system administrator or a developer if the command line is no longer valid
      for the installed version of the external program.
Additional information:
    The full command line executed was:
    drbdadm --config-to-test /var/lib/linstor.d/vm-106-disk-1.res_tmp --config-to-exclude /var/lib/linstor.d/vm-106-disk-1.res sh-nop

    The external command sent the following output data:

    The external command sent the following error information:
    /var/lib/linstor.d/vm-106-disk-1.res_tmp:0: conflicting use of device-minor 'device-minor:pve2:1038' ...
    /var/lib/linstor.d/snap_vm-201-disk-1_vzdump.res:0: device-minor 'device-minor:pve2:1038' first used here.

Category:                           LinStorException
Class name:                         ExtCmdFailedException
Class canonical name:               com.linbit.extproc.ExtCmdFailedException
Generated at:                       Method 'execute', Source file 'DrbdAdm.java', Line #556

Error message:                      The external command 'drbdadm' exited with error code 10

Call backtrace:

    Method                                   Native Class:Line number
    execute                                  N      com.linbit.linstor.layer.drbd.utils.DrbdAdm:556
    execute                                  N      com.linbit.linstor.layer.drbd.utils.DrbdAdm:542
    checkResFile                             N      com.linbit.linstor.layer.drbd.utils.DrbdAdm:419
    regenerateResFile                        N      com.linbit.linstor.layer.drbd.DrbdLayer:1451
    adjustDrbd                               N      com.linbit.linstor.layer.drbd.DrbdLayer:630
    process                                  N      com.linbit.linstor.layer.drbd.DrbdLayer:389
    process                                  N      com.linbit.linstor.core.devmgr.DeviceHandlerImpl:815
    processResourcesAndSnapshots             N      com.linbit.linstor.core.devmgr.DeviceHandlerImpl:355
    dispatchResources                        N      com.linbit.linstor.core.devmgr.DeviceHandlerImpl:165
    dispatchResources                        N      com.linbit.linstor.core.devmgr.DeviceManagerImpl:297
    phaseDispatchDeviceHandlers              N      com.linbit.linstor.core.devmgr.DeviceManagerImpl:1035
    devMgrLoop                               N      com.linbit.linstor.core.devmgr.DeviceManagerImpl:702
    run                                      N      com.linbit.linstor.core.devmgr.DeviceManagerImpl:599
    run                                      N      java.lang.Thread:829

END OF ERROR REPORT.
ghernadi commented 3 years ago

Can you please show us a linstor volume-definition list?

Foxi352 commented 3 years ago

Of course. Note that I have not yet cleaned up the mess this created. Some of the disks are orphaned, like vm-106, whose creation failed even though the volume exists, or vm-107, which was a Windows 2019 Server with UEFI. I tried multiple times and that VM now lives on a local volume, but disks 1-6 from my attempts are still here.

I will have to take some time to clean that up properly and carefully, without destroying a working VM ...
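When I get to it, I assume the cleanup per orphaned resource will be something along these lines (using vm-107-disk-2 as an example), after double-checking that nothing references it any more:

# Make sure no VM config on any node still references the disk
grep -r vm-107-disk-2 /etc/pve/nodes/*/qemu-server/

# Then drop the orphaned LINSTOR resource definition (this removes the DRBD
# resource and its backing volume on all nodes)
linstor resource-definition delete vm-107-disk-2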

╭──────────────────────────────────────────────────────────────────╮
┊ ResourceName  ┊ VolumeNr ┊ VolumeMinor ┊ Size    ┊ Gross ┊ State ┊
╞══════════════════════════════════════════════════════════════════╡
┊ vm-104-disk-1 ┊ 0        ┊ 1018        ┊ 100 GiB ┊       ┊ ok    ┊
┊ vm-104-disk-2 ┊ 0        ┊ 1025        ┊ 100 GiB ┊       ┊ ok    ┊
┊ vm-105-disk-1 ┊ 0        ┊ 1007        ┊ 100 GiB ┊       ┊ ok    ┊
┊ vm-106-disk-1 ┊ 0        ┊ 1038        ┊ 32 GiB  ┊       ┊ ok    ┊
┊ vm-107-disk-2 ┊ 0        ┊ 1002        ┊ 128 KiB ┊       ┊ ok    ┊
┊ vm-107-disk-3 ┊ 0        ┊ 1003        ┊ 128 KiB ┊       ┊ ok    ┊
┊ vm-107-disk-4 ┊ 0        ┊ 1008        ┊ 128 KiB ┊       ┊ ok    ┊
┊ vm-107-disk-5 ┊ 0        ┊ 1009        ┊ 128 KiB ┊       ┊ ok    ┊
┊ vm-107-disk-6 ┊ 0        ┊ 1010        ┊ 128 KiB ┊       ┊ ok    ┊
┊ vm-108-disk-1 ┊ 0        ┊ 1006        ┊ 75 GiB  ┊       ┊ ok    ┊
┊ vm-108-disk-2 ┊ 0        ┊ 1012        ┊ 75 GiB  ┊       ┊ ok    ┊
┊ vm-108-disk-3 ┊ 0        ┊ 1026        ┊ 75 GiB  ┊       ┊ ok    ┊
┊ vm-109-disk-1 ┊ 0        ┊ 1028        ┊ 32 GiB  ┊       ┊ ok    ┊
┊ vm-110-disk-1 ┊ 0        ┊ 1019        ┊ 110 GiB ┊       ┊ ok    ┊
┊ vm-111-disk-1 ┊ 0        ┊ 1000        ┊ 50 GiB  ┊       ┊ ok    ┊
┊ vm-111-disk-2 ┊ 0        ┊ 1001        ┊ 50 GiB  ┊       ┊ ok    ┊
┊ vm-111-disk-3 ┊ 0        ┊ 1004        ┊ 50 GiB  ┊       ┊ ok    ┊
┊ vm-200-disk-1 ┊ 0        ┊ 1033        ┊ 15 GiB  ┊       ┊ ok    ┊
┊ vm-201-disk-1 ┊ 0        ┊ 1030        ┊ 10 GiB  ┊       ┊ ok    ┊
┊ vm-202-disk-2 ┊ 0        ┊ 1011        ┊ 8 GiB   ┊       ┊ ok    ┊
┊ vm-203-disk-1 ┊ 0        ┊ 1022        ┊ 30 GiB  ┊       ┊ ok    ┊
┊ vm-204-disk-1 ┊ 0        ┊ 1021        ┊ 8 GiB   ┊       ┊ ok    ┊
┊ vm-205-disk-1 ┊ 0        ┊ 1031        ┊ 15 GiB  ┊       ┊ ok    ┊
┊ vm-205-disk-2 ┊ 0        ┊ 1029        ┊ 15 GiB  ┊       ┊ ok    ┊
┊ vm-205-disk-3 ┊ 0        ┊ 1032        ┊ 15 GiB  ┊       ┊ ok    ┊
┊ vm-206-disk-1 ┊ 0        ┊ 1036        ┊ 300 GiB ┊       ┊ ok    ┊
┊ vm-208-disk-1 ┊ 0        ┊ 1034        ┊ 8 GiB   ┊       ┊ ok    ┊
┊ vm-208-disk-2 ┊ 0        ┊ 1035        ┊ 8 GiB   ┊       ┊ ok    ┊
┊ vm-209-disk-1 ┊ 0        ┊ 1027        ┊ 15 GiB  ┊       ┊ ok    ┊
┊ vm-210-disk-1 ┊ 0        ┊ 1020        ┊ 20 GiB  ┊       ┊ ok    ┊
┊ vm-212-disk-1 ┊ 0        ┊ 1017        ┊ 50 GiB  ┊       ┊ ok    ┊
┊ vm-214-disk-1 ┊ 0        ┊ 1015        ┊ 20 GiB  ┊       ┊ ok    ┊
┊ vm-214-disk-2 ┊ 0        ┊ 1016        ┊ 20 GiB  ┊       ┊ ok    ┊
┊ vm-215-disk-1 ┊ 0        ┊ 1037        ┊ 8 GiB   ┊       ┊ ok    ┊
┊ vm-216-disk-1 ┊ 0        ┊ 1013        ┊ 15 GiB  ┊       ┊ ok    ┊
┊ vm-218-disk-1 ┊ 0        ┊ 1023        ┊ 20 GiB  ┊       ┊ ok    ┊
┊ vm-218-disk-2 ┊ 0        ┊ 1024        ┊ 20 GiB  ┊       ┊ ok    ┊
┊ vm-220-disk-1 ┊ 0        ┊ 1014        ┊ 15 GiB  ┊       ┊ ok    ┊
┊ vm-230-disk-1 ┊ 0        ┊ 1005        ┊ 100 GiB ┊       ┊ ok    ┊
╰──────────────────────────────────────────────────────────────────╯

I am also adding the linstor r l output for vm-106 from my original post, the one whose creation failed.

┊ vm-106-disk-1 ┊ pve  ┊ 7038 ┊ Unused ┊ Connecting(pve2) ┊     Diskless ┊ 2021-07-03 09:02:13 ┊
┊ vm-106-disk-1 ┊ pve2 ┊ 7038 ┊        ┊                  ┊      Unknown ┊                     ┊
ghernadi commented 3 years ago

Did you manually create some DRBD resources?

    The external command sent the following error information:
    /var/lib/linstor.d/vm-106-disk-1.res_tmp:0: conflicting use of device-minor 'device-minor:pve2:1038' ...
    /var/lib/linstor.d/snap_vm-201-disk-1_vzdump.res:0: device-minor 'device-minor:pve2:1038' first used here.

I see that vm-106 has the minor number 1038 reserved:

┊ vm-106-disk-1 ┊ 0        ┊ 1038        ┊ 32 GiB  ┊       ┊ ok    ┊

But I cannot see snap_vm-201 in your linstor vd l. Additionally, snap_* sounds like a snapshot, which by definition should not be an active DRBD resource. If that was a snapshot that got restored into a new resource, the new resource should be showing up in vd l...
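A quick way to see which generated res files on the satellite still claim that minor would be something like this, run on pve2:

# Show every LINSTOR-generated DRBD res file that uses minor 1038
grep -l 'minor 1038' /var/lib/linstor.d/*.res

# And whether LINSTOR still knows about a snapshot for vm-201
linstor snapshot list | grep vm-201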

@acidrop Does /dev/zvol/drbdpool/vm-107-disk-1_00000 exist on your setup (in case you are still having this issue)? What does zfs list show for this volume? If the device exists, how large is it, and how large should it be according to LINSTOR?
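Something along these lines would show both views (paths taken from your error report):

# Size the kernel reports for the zvol, in KiB
echo $(( $(blockdev --getsize64 /dev/zvol/drbdpool/vm-107-disk-1_00000) / 1024 ))

# Size ZFS thinks the volume has
zfs get -H volsize drbdpool/vm-107-disk-1_00000

# Size LINSTOR allocated for the volume
linstor volume list -p | grep vm-107-disk-1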

Foxi352 commented 3 years ago

No, I did not manually create any DRBD resources. I created the ZFS pool and the LINSTOR storage pool; from there on I did everything through the Proxmox plugin only, as far as I remember. I do recall trying to move some of the VMs where Proxmox immediately complained that they had existing snapshots. I then deleted the snapshots and the VMs moved without problems. I don't think that had any influence on DRBD?

Here is the zfs list:

hdd                       665G  1.90T      140K  /hdd
hdd/vm-104-disk-1_00000   122K  1.90T      122K  -
hdd/vm-104-disk-2_00000  46.0G  1.90T     46.0G  -
hdd/vm-105-disk-1_00000  3.19G  1.90T     3.19G  -
hdd/vm-106-disk-1_00000  81.4K  1.90T     81.4K  -
hdd/vm-108-disk-1_00000   122K  1.90T      122K  -
hdd/vm-108-disk-2_00000   122K  1.90T      122K  -
hdd/vm-108-disk-3_00000  84.3G  1.90T     84.3G  -
hdd/vm-109-disk-1_00000  13.9G  1.90T     13.9G  -
hdd/vm-110-disk-1_00000   121G  1.90T      121G  -
hdd/vm-111-disk-1_00000   122K  1.90T      122K  -
hdd/vm-111-disk-2_00000   122K  1.90T      122K  -
hdd/vm-111-disk-3_00000  44.0G  1.90T     44.0G  -
hdd/vm-200-disk-1_00000  5.66G  1.90T     5.66G  -
hdd/vm-201-disk-1_00000  3.62G  1.90T     3.57G  -
hdd/vm-202-disk-1_00000  2.77G  1.90T     2.77G  -
hdd/vm-202-disk-2_00000  3.75G  1.90T     3.75G  -
hdd/vm-203-disk-1_00000  30.7G  1.90T     30.7G  -
hdd/vm-204-disk-1_00000  5.14G  1.90T     5.14G  -
hdd/vm-205-disk-1_00000   110K  1.90T      110K  -
hdd/vm-205-disk-2_00000  81.4K  1.90T     81.4K  -
hdd/vm-205-disk-3_00000  10.6G  1.90T     10.6G  -
hdd/vm-206-disk-1_00000   167G  1.90T      167G  -
hdd/vm-208-disk-1_00000   110K  1.90T      110K  -
hdd/vm-208-disk-2_00000  2.09G  1.90T     2.09G  -
hdd/vm-209-disk-1_00000  16.4G  1.90T     16.4G  -
hdd/vm-210-disk-1_00000  2.76G  1.90T     2.76G  -
hdd/vm-212-disk-1_00000  58.9G  1.90T     58.9G  -
hdd/vm-214-disk-1_00000   110K  1.90T      110K  -
hdd/vm-214-disk-2_00000   962M  1.90T      962M  -
hdd/vm-215-disk-1_00000  2.14G  1.90T     2.14G  -
hdd/vm-216-disk-1_00000   882M  1.90T      882M  -
hdd/vm-217-disk-1_00000  2.67G  1.90T     2.67G  -
hdd/vm-218-disk-1_00000   110K  1.90T      110K  -
hdd/vm-218-disk-2_00000  1.39G  1.90T     1.39G  -
hdd/vm-220-disk-1_00000  1.91G  1.90T     1.91G  -
hdd/vm-230-disk-1_00000  33.0G  1.90T     33.0G  -

Here is a list of what exists in my /dev/zvol/hdd, in case that is useful:

root@pve2:/dev/zvol/hdd# ls -la
total 0
drwxr-xr-x 2 root root 780 Jul  3 09:02 .
drwxr-xr-x 3 root root  60 Jun 23 12:52 ..
lrwxrwxrwx 1 root root  11 Jun 29 19:17 vm-104-disk-1_00000 -> ../../zd240
lrwxrwxrwx 1 root root  11 Jun 30 16:04 vm-104-disk-2_00000 -> ../../zd352
lrwxrwxrwx 1 root root  11 Jun 23 12:52 vm-105-disk-1_00000 -> ../../zd144
lrwxrwxrwx 1 root root  13 Jun 23 12:52 vm-105-disk-1_00000-part1 -> ../../zd144p1
lrwxrwxrwx 1 root root  11 Jul  3 09:02 vm-106-disk-1_00000 -> ../../zd560
lrwxrwxrwx 1 root root  10 Jun 29 17:22 vm-108-disk-1_00000 -> ../../zd64
lrwxrwxrwx 1 root root  10 Jun 29 17:32 vm-108-disk-2_00000 -> ../../zd80
lrwxrwxrwx 1 root root  11 Jun 30 18:19 vm-108-disk-3_00000 -> ../../zd368
lrwxrwxrwx 1 root root  11 Jul  1 14:55 vm-109-disk-1_00000 -> ../../zd400
lrwxrwxrwx 1 root root  11 Jun 29 20:39 vm-110-disk-1_00000 -> ../../zd256
lrwxrwxrwx 1 root root   9 Jun 29 17:19 vm-111-disk-1_00000 -> ../../zd0
lrwxrwxrwx 1 root root  10 Jun 29 17:20 vm-111-disk-2_00000 -> ../../zd16
lrwxrwxrwx 1 root root  10 Jun 29 17:21 vm-111-disk-3_00000 -> ../../zd32
lrwxrwxrwx 1 root root  11 Jul  2 14:18 vm-200-disk-1_00000 -> ../../zd480
lrwxrwxrwx 1 root root  11 Jul  1 15:48 vm-201-disk-1_00000 -> ../../zd432
lrwxrwxrwx 1 root root  10 Jun 23 12:52 vm-202-disk-1_00000 -> ../../zd48
lrwxrwxrwx 1 root root  11 Jun 23 13:54 vm-202-disk-2_00000 -> ../../zd176
lrwxrwxrwx 1 root root  11 Jun 30 14:24 vm-203-disk-1_00000 -> ../../zd304
lrwxrwxrwx 1 root root  11 Jun 30 13:01 vm-204-disk-1_00000 -> ../../zd288
lrwxrwxrwx 1 root root  11 Jul  1 16:02 vm-205-disk-1_00000 -> ../../zd448
lrwxrwxrwx 1 root root  11 Jul  2 11:40 vm-205-disk-2_00000 -> ../../zd416
lrwxrwxrwx 1 root root  11 Jul  2 11:41 vm-205-disk-3_00000 -> ../../zd464
lrwxrwxrwx 1 root root  11 Jul  2 14:37 vm-206-disk-1_00000 -> ../../zd528
lrwxrwxrwx 1 root root  11 Jul  2 14:19 vm-208-disk-1_00000 -> ../../zd496
lrwxrwxrwx 1 root root  11 Jul  2 14:30 vm-208-disk-2_00000 -> ../../zd512
lrwxrwxrwx 1 root root  11 Jul  1 07:30 vm-209-disk-1_00000 -> ../../zd384
lrwxrwxrwx 1 root root  11 Jun 30 11:19 vm-210-disk-1_00000 -> ../../zd272
lrwxrwxrwx 1 root root  11 Jun 29 19:10 vm-212-disk-1_00000 -> ../../zd224
lrwxrwxrwx 1 root root  11 Jun 29 19:06 vm-214-disk-1_00000 -> ../../zd192
lrwxrwxrwx 1 root root  11 Jun 29 19:07 vm-214-disk-2_00000 -> ../../zd208
lrwxrwxrwx 1 root root  11 Jul  2 22:45 vm-215-disk-1_00000 -> ../../zd544
lrwxrwxrwx 1 root root  10 Jun 29 18:06 vm-216-disk-1_00000 -> ../../zd96
lrwxrwxrwx 1 root root  11 Jun 23 12:52 vm-217-disk-1_00000 -> ../../zd112
lrwxrwxrwx 1 root root  11 Jun 30 14:30 vm-218-disk-1_00000 -> ../../zd320
lrwxrwxrwx 1 root root  11 Jun 30 14:45 vm-218-disk-2_00000 -> ../../zd336
lrwxrwxrwx 1 root root  11 Jun 29 19:06 vm-220-disk-1_00000 -> ../../zd160
lrwxrwxrwx 1 root root  11 Jun 23 12:52 vm-230-disk-1_00000 -> ../../zd128
root@pve2:/dev/zvol/hdd# 
acidrop commented 3 years ago

@ghernadi Here's what I get when trying to create the EFI disk on a ZFS storage pool...

root@pve1:~# linstor r l -p|grep 107

| vm-107-disk-1 | pve1 | 7006 | Unused | StandAlone(pve2) | Diskless |                     |
| vm-107-disk-1 | pve2 | 7006 | Unused | StandAlone(pve1) | Diskless |                     |
root@pve1:~# linstor v l -p|grep 107

| pve1 | vm-107-disk-1 | drbdpool             |     0 |    1005 | None          |    168 KiB | Unused | Diskless |
| pve2 | vm-107-disk-1 | drbdpool             |     0 |    1005 | None          |    168 KiB | Unused | Diskless |
root@pve1:~# zfs list|grep 107

drbdpool/vm-107-disk-1_00000   104K  60.8G      104K  -

...and this is what I get when creating the same EFI disk on a thin LVM storage pool (successful)...

root@pve1:~# linstor r l -p|grep 107

| vm-107-disk-2 | pve1 | 7009 | Unused | Ok               | UpToDate | 2021-07-06 17:32:23 |
| vm-107-disk-2 | pve3 | 7009 | Unused | Ok               | UpToDate | 2021-07-06 17:32:25 |
root@pve1:~# linstor v l -p|grep 107

| pve1 | vm-107-disk-2 | thinpool01           |     0 |    1008 | /dev/drbd1008 |      4 MiB | Unused | UpToDate |
| pve3 | vm-107-disk-2 | thinpool01           |     0 |    1008 | /dev/drbd1008 |      4 MiB | Unused | UpToDate |
root@pve1:~# lvs |grep 107
  vm-107-disk-2_00000                 ThinVG01 Vwi-aotz--  4.00m ThinPool01                     100.00 
rck commented 3 years ago

Looks like ZFS and LVM result in a different granularity for the block device that gets created. If I do what you did, it also fails for me on ZFS, because

root@pve:~# echo $(( $(blockdev --getsize64 /dev/zd16) / 1024 ))
168 # K

and that was the only ZFS volume on my system. Looking at the PVE code, it seems they create a tiny disk for the EFI vars, in this case roughly this size:

du -hs /usr/share/pve-edk2-firmware/OVMF_VARS.fd
128K    /usr/share/pve-edk2-firmware/OVMF_VARS.fd

and that is too small for a DRBD device. Introducing a minimum size of 5 MiB fixed it for me:

diff --git a/LINSTORPlugin.pm b/LINSTORPlugin.pm
index 73a416e..bdcd0df 100644
--- a/LINSTORPlugin.pm
+++ b/LINSTORPlugin.pm
@@ -257,6 +257,9 @@ sub clone_image {
 sub alloc_image {
     my ( $class, $storeid, $scfg, $vmid, $fmt, $name, $size ) = @_;

+    my $min_kib = 5*1024;
+    $size = $min_kib if $size < $min_kib;
+
     # check if it is the controller, which always has exactly "disk-1"
     my $retname = $name;
     if ( !defined($name) ) {

I will think about it a bit more, but I'd guess it will look something like this.
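If you want to test it on an installed system, the plugin is the file from the backtrace above, /usr/share/perl5/PVE/Storage/Custom/LINSTORPlugin.pm, so applying the diff from that directory should work roughly like this (the diff filename is just a placeholder):

cd /usr/share/perl5/PVE/Storage/Custom
patch -p1 --dry-run < /tmp/alloc_image-min-size.diff   # verify it applies cleanly first
patch -p1 < /tmp/alloc_image-min-size.diff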

Foxi352 commented 3 years ago

@rck I modified the .pm file by hand on the two nodes, but still no luck. So my question: do I need to reboot the hosts to reload the plugin for the modifications to take effect?

Thanks

acidrop commented 3 years ago

@Foxi352 It seems to have worked for me. You must run 'systemctl reload pvedaemon' on each node for changes to take effect.
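For example, in my case (Proxmox cluster nodes can normally reach each other as root over SSH, so it can be pushed from one node; node names are from my setup):

# On the local node
systemctl reload pvedaemon

# On the other nodes
ssh pve2 systemctl reload pvedaemon
ssh pve3 systemctl reload pvedaemon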

root@pve1:~# linstor r l|grep 107
| vm-107-disk-1 | pve1 | 7006 | Unused | Ok    | UpToDate | 2021-07-07 19:26:38 |
| vm-107-disk-1 | pve2 | 7006 | Unused | Ok    | UpToDate | 2021-07-07 19:26:41 |
root@pve1:~# linstor v l|grep 107
| pve1 | vm-107-disk-1 | drbdpool             |     0 |    1005 | /dev/drbd1005 |   5.04 MiB | Unused | UpToDate |
| pve2 | vm-107-disk-1 | drbdpool             |     0 |    1005 | /dev/drbd1005 |   5.04 MiB | Unused | UpToDate |
root@pve1:~# zfs list|grep 107
drbdpool/vm-107-disk-1_00000  5.10M  60.8G     5.10M  -
Foxi352 commented 3 years ago

@acidrop Thanks for the daemon reload tip! @rck I confirm your patch is working, at least for a quick test of adding an EFI disk to an existing VM.

I'll try moving a VM with an existing EFI disk to DRBD tomorrow morning.

EDIT: I confirm that disk migration from local ZFS to DRBD storage on a different server is now working.

rck commented 3 years ago

Great, thanks for testing. There will be a new release today or tomorrow.