LINBIT / drbd

LINBIT DRBD kernel module
https://docs.linbit.com/docs/users-guide-9.0/
GNU General Public License v2.0

before-resync-target and after-resync-target called with seemingly invalid values for DRBD_VOLUME and DRBD_MINOR #94

Open ajschorr opened 4 months ago

ajschorr commented 4 months ago

I set up a test environment on RHEL 9 to check compatibility between DRBD 9.2.5 and 9.2.10 before upgrading some servers. In my resource file, I have these handlers:

                before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
                after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;

And I've got 2 volumes set up:

        volume 0 {
                device minor 0;
                disk "/dev/vg_sys/drbd_main";
                meta-disk internal;
        }
        volume 1 {
                device minor 1;
                disk "/dev/vg_sys/drbd_archive";
                meta-disk internal;
        }

After initializing the setup, I ran "mkfs.xfs" on each of the partitions in question (which are LVM thin pool partitions). I think the primary and secondary hosts lost sync at this point, and the secondary host then logged these errors:

Jun 28 15:02:29 ti128 kernel: drbd test2/0 drbd0 ti138: helper command: /sbin/drbdadm before-resync-target
Jun 28 15:02:29 ti128 snapshot-resync-target-lvm.sh[220521]: invoked for test2/0 1 (drbd0 1)
Jun 28 15:02:29 ti128 snapshot-resync-target-lvm.sh[220521]: 0 1 is not a valid number
Jun 28 15:02:29 ti128 snapshot-resync-target-lvm.sh[220521]: sh-lower-resource only available in stacked mode
Jun 28 15:02:29 ti128 snapshot-resync-target-lvm.sh[220521]: 0 1 is not a valid number
Jun 28 15:02:29 ti128 snapshot-resync-target-lvm.sh[220521]: Cannot determine lower level device of resource test2/0 1, sorry.
Jun 28 15:02:29 ti128 kernel: drbd test2/0 drbd0 ti138: helper command: /sbin/drbdadm before-resync-target exit code 0
Jun 28 15:02:29 ti128 kernel: drbd test2/0 drbd0: disk( Outdated -> Inconsistent ) [receive-bitmap]
Jun 28 15:02:29 ti128 kernel: drbd test2/0 drbd0 ti138: repl( WFBitMapT -> SyncTarget ) [receive-bitmap]
Jun 28 15:02:29 ti128 kernel: drbd test2/1 drbd1 ti138: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
Jun 28 15:02:29 ti128 kernel: drbd test2/1 drbd1 ti138: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
Jun 28 15:02:29 ti128 kernel: drbd test2/1 drbd1 ti138: helper command: /sbin/drbdadm before-resync-target
Jun 28 15:02:29 ti128 kernel: drbd test2/0 drbd0 ti138: Began resync as SyncTarget (will sync 65920 KB [16480 bits set]).
Jun 28 15:02:29 ti128 snapshot-resync-target-lvm.sh[220529]: invoked for test2/0 1 (drbd0 1)
Jun 28 15:02:29 ti128 snapshot-resync-target-lvm.sh[220529]: 0 1 is not a valid number
Jun 28 15:02:29 ti128 snapshot-resync-target-lvm.sh[220529]: sh-lower-resource only available in stacked mode
Jun 28 15:02:29 ti128 kernel: drbd test2/1 drbd1 ti138: helper command: /sbin/drbdadm before-resync-target exit code 0
Jun 28 15:02:29 ti128 kernel: drbd test2/1 drbd1: disk( Outdated -> Inconsistent ) [receive-bitmap]
Jun 28 15:02:29 ti128 kernel: drbd test2/1 drbd1 ti138: repl( WFBitMapT -> SyncTarget ) [receive-bitmap]
Jun 28 15:02:29 ti128 snapshot-resync-target-lvm.sh[220529]: 0 1 is not a valid number
Jun 28 15:02:29 ti128 snapshot-resync-target-lvm.sh[220529]: Cannot determine lower level device of resource test2/0 1, sorry.
Jun 28 15:02:29 ti128 kernel: drbd test2/1 drbd1 ti138: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
Jun 28 15:02:29 ti128 kernel: drbd test2/1 drbd1 ti138: Resync done (total 1 sec; paused 0 sec; 0 K/sec)

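And so on. To me, it looks like DRBD_VOLUME and DRBD_MINOR are being populated incorrectly. The script reports "invoked for test2/0 1 (drbd0 1)" for both volumes, so each variable appears to contain "0 1" (possibly a list of all volumes or minors) instead of the single number for the volume being resynced, and the later lookups then fail with "0 1 is not a valid number".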
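
To make that reading of the log concrete, here is a minimal Python sketch that pulls the resource, volume, and minor out of one "invoked for" line and checks whether each value is a single number. It assumes the script prints that line as "invoked for <resource>/<volume> (drbd<minor>)", which is inferred from the log above and not confirmed against the script source. The names `parse_invoked` and `is_single_number` are made up for this illustration.

```python
import re

# Assumed layout of the helper's log line (inferred from the log above):
#   invoked for <resource>/<volume> (drbd<minor>)
INVOKED_RE = re.compile(
    r"invoked for (?P<resource>[^/]+)/(?P<volume>.*) \(drbd(?P<minor>.*)\)$"
)


def parse_invoked(line):
    """Return (resource, volume, minor) as raw strings, or None if no match."""
    match = INVOKED_RE.search(line)
    if match is None:
        return None
    return match.group("resource"), match.group("volume"), match.group("minor")


def is_single_number(value):
    """True only if the value is one non-negative integer, e.g. "0" or "1"."""
    return re.fullmatch(r"[0-9]+", value) is not None


if __name__ == "__main__":
    line = (
        "Jun 28 15:02:29 ti128 snapshot-resync-target-lvm.sh[220521]: "
        "invoked for test2/0 1 (drbd0 1)"
    )
    resource, volume, minor = parse_invoked(line)
    # Show the raw values and whether each one is a single number.
    print(resource, repr(volume), repr(minor))
    print("volume ok:", is_single_number(volume))
    print("minor ok:", is_single_number(minor))
```

Under that assumption, both the volume and the minor from the logged line come out as the string "0 1", which is not a single number.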

Regards, Andy