LINBIT / linstor-server

High Performance Software-Defined Block Storage for container, cloud and virtualisation. Fully integrated with Docker, Kubernetes, Openstack, Proxmox etc.
https://docs.linbit.com/docs/linstor-guide/
GNU General Public License v3.0

Linstor DRBD files don't seem to work after an upgrade #199

Closed halkeye closed 2 years ago

halkeye commented 3 years ago

I'm in the process of doing some much-needed upgrades to my server, and somehow managed to break LINSTOR/DRBD. It's been a while since I rebooted, so it could have broken a long time ago.

linstor err list gives me a lot of:

┊ 5FC16543-48273-000166 ┊ 2020-11-27 20:53:32 ┊ S|nickfury ┊ StorageException: Need initial DRBD state ┊
┊ 5FC16543-48273-000167 ┊ 2020-11-27 20:53:32 ┊ S|nickfury ┊ StorageException: Need initial DRBD state ┊

And when I check the status of DRBD on my nodes, I get:

root@mariahill:~# systemctl status drbd.service
● drbd.service - DRBD -- please disable. Unless you are NOT using a cluster manager.
   Loaded: loaded (/lib/systemd/system/drbd.service; disabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Fri 2020-11-27 20:53:49 UTC; 8min ago
  Process: 17906 ExecStart=/lib/drbd/drbd start (code=exited, status=6)
 Main PID: 17906 (code=exited, status=6)

Nov 27 20:53:49 mariahill systemd[1]: Starting DRBD -- please disable. Unless you are NOT using a cluster manager....
Nov 27 20:53:49 mariahill drbd[17906]:  * Starting DRBD resources
Nov 27 20:53:49 mariahill drbd[17906]: /var/lib/linstor.d/pvc-0a066cfd-6d2e-4f80-b501-d48468439235.res:5: Parse error: 'protocol | on | disk | net | syncer | startup | handlers | ignore-on | stacked-on-top-of' expected,
Nov 27 20:53:49 mariahill drbd[17906]:         but got 'template-file' (TK 282)
Nov 27 20:53:49 mariahill drbd[17906]:    ...fail!
Nov 27 20:53:49 mariahill systemd[1]: drbd.service: Main process exited, code=exited, status=6/NOTCONFIGURED
Nov 27 20:53:49 mariahill systemd[1]: drbd.service: Failed with result 'exit-code'.
Nov 27 20:53:49 mariahill systemd[1]: Failed to start DRBD -- please disable. Unless you are NOT using a cluster manager..

# This file was generated by linstor(1.10.0), do not edit manually.

resource "pvc-0a066cfd-6d2e-4f80-b501-d48468439235"
{
    template-file "linstor_common.conf";

    options
    {
        quorum off;
    }

    net
    {
        cram-hmac-alg     sha1;
        shared-secret     [SNIP];
    }

    on mariahill
    {
        volume 0
        {
            disk        /dev/vg/pvc-0a066cfd-6d2e-4f80-b501-d48468439235_00000;
            disk
            {
                discard-zeroes-if-aligned yes;
                rs-discard-granularity 65536;
            }
            meta-disk   internal;
            device      minor 1018;
        }
        node-id    0;
    }

    on nickfury
    {
        volume 0
        {
            disk        /dev/drbd/this/is/not/used;
            disk
            {
                discard-zeroes-if-aligned yes;
                rs-discard-granularity 65536;
            }
            meta-disk   internal;
            device      minor 1018;
        }
        node-id    1;
    }

    connection
    {
        host mariahill address ipv4 172.16.10.5:7018;
        host nickfury address ipv4 172.16.10.3:7018;
    }
}

Is there a way to regenerate all the pvc configs?
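
The parse error in the log above points at the `template-file` line of the generated resource file. To see which other `.res` files would trip the same parser, a minimal sketch might look like the following. The function name `res_files_with_template` is made up for illustration and is not part of LINSTOR or drbd-utils; it only greps for the keyword the log complained about.

```shell
#!/bin/sh
# Hypothetical helper (not part of LINSTOR or drbd-utils): list the .res files
# in a directory that contain the `template-file` keyword which the parser in
# the log above rejected.
res_files_with_template() {
    dir=$1
    # -l prints only the names of matching files; -E enables extended regex.
    # Errors (for example, no .res files at all) are silenced.
    grep -lE '^[[:space:]]*template-file[[:space:]]' "$dir"/*.res 2>/dev/null
}
```

Assuming the directory from the log, it could be called as `res_files_with_template /var/lib/linstor.d`. It only reports which files use the keyword; it does not regenerate or modify anything.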

halkeye commented 3 years ago

Ohhh, maybe my DRBD is outdated:

root@nickfury:~# apt-cache madison drbd-dkms
 drbd-dkms | 9.0.25-1ppa1~bionic1 | http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu bionic/main amd64 Packages
root@nickfury:~# apt-cache madison drbd-utils
drbd-utils | 9.15.1-1ppa1~bionic1 | http://ppa.launchpad.net/linbit/linbit-drbd9-stack/ubuntu bionic/main amd64 Packages
drbd-utils | 8.9.10-2ubuntu0.1 | http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
drbd-utils |   8.9.10-2 | http://archive.ubuntu.com/ubuntu bionic/main amd64 Packages

halkeye commented 3 years ago

root@nickfury:~# dpkg -l drbd-utils drbd-dkms
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                                                      Version                                   Architecture                              Description
+++-=========================================================================-=========================================-=========================================-========================================================================================================================================================
ii  drbd-dkms                                                                 9.0.25-1ppa1~bionic1                      all                                       RAID 1 over TCP/IP for Linux module source
ii  drbd-utils                                                                9.15.1-1ppa1~bionic1                      amd64                                     RAID 1 over TCP/IP for Linux (user utilities)

Those look like the ones in https://launchpad.net/~linbit/+archive/ubuntu/linbit-drbd9-stack

halkeye commented 3 years ago

So I yolo'd it:

dpkg --purge drbd-utils drbd-dkms
apt install -y drbd-utils drbd-dkms

Then I rebooted, and now things are back up. I suspect the dkms script broke at some point in the past?

Feel free to close, but I'll leave it open in case anything useful can still be retrieved.

ghernadi commented 3 years ago

Looks like you simply had a utils version that was too old to understand the DRBD v9 .res file format.

Since we already have an internal issue open for adding a check not just for DRBD v9 but also for DRBD-utils >v9, I will leave this ticket open until that internal ticket is done.
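
Until such a check exists, the same idea can be approximated by hand. The sketch below is not LINSTOR's check. `utils_major_ok` is a hypothetical helper that only looks at the major number of a package version string such as the ones shown by `dpkg -l` earlier in this thread, and treating "major version 9 or newer" as sufficient is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not LINSTOR's actual check): succeed if a drbd-utils
# package version string has a major version of 9 or newer.
# Returns 0 if new enough, 1 if too old, 2 if the string cannot be parsed.
utils_major_ok() {
    # Drop an optional Debian epoch prefix such as "1:".
    ver=${1#*:}
    # Keep everything before the first dot, i.e. the major version.
    major=${ver%%.*}
    case "$major" in
        ''|*[!0-9]*) return 2 ;;   # empty or non-numeric major version
    esac
    [ "$major" -ge 9 ]
}
```

On a Debian or Ubuntu node it could be fed the installed version with `utils_major_ok "$(dpkg-query -W -f='${Version}' drbd-utils)"`. With the versions listed above, `9.15.1-1ppa1~bionic1` passes and `8.9.10-2ubuntu0.1` does not.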