stratis-storage / project

A holding place for issues that affect more than one repository in the project

Problems with stratis over drbd #272

Open jvinolas opened 3 years ago

jvinolas commented 3 years ago

We are doing tests with stratis over drbd master/slave. We faced some problems:

Using Debian 10, drbd 9.0.28-1, kernel 5.10, stratisd 2.4.0, stratis-cli 2.3.0.

This is the procedure we followed. We have two drbd volumes in the same resource, over two disks, so they are shown as block devices:

nvme0n1
└─drbd1001
nvme1n1
└─drbd1002
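For context, with DRBD 9 the two volumes of the resource are brought up and promoted roughly like this on the primary node (the resource name is from our setup; exact invocations depend on the local configuration):

```
drbdadm up storage        # attach and connect both volumes of the resource
drbdadm primary storage   # promote this node; /dev/drbd1001 and /dev/drbd1002 become writable
lsblk /dev/drbd1001 /dev/drbd1002
```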

We build stratisd and stratis-cli:

# /usr/libexec/stratisd --version
stratis 2.4.0
# stratis --version
2.3.0

We were able to create the pool and filesystem:

stratis pool create isard /dev/drbd1001 /dev/drbd1002
stratis filesystem create isard storage

At this point we were unable to find the device under /stratis (it just does not exist) or under /dev/..., so we used blkid and were then able to mount it:

mount UUID="441c0549-976c-4293-b1a0-477be1d2b816" /mnt
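The blkid step looked roughly like this (output trimmed; the stratis thin filesystem shows up as an xfs-formatted device-mapper node, and its UUID is the one we then passed to mount):

```
# List filesystem signatures; note the UUID on the stratis-1-...-thin-fs-... device
blkid | grep 'TYPE="xfs"'
mount UUID="441c0549-976c-4293-b1a0-477be1d2b816" /mnt
```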

After that we unmounted /mnt and tried to move the drbd master to the other node, without success, as stratisd is holding the devices open:

# drbdsetup secondary storage
storage: State change failed: (-12) Device is held open by someone
additional info from kernel:
/dev/drbd1001 opened by stratisd (pid 2479) at 2021-03-03 12:45:24.272

So we tried stopping the stratisd service, also without success. Now, after starting stratisd again, no pools/filesystems are shown by stratis pool list, but we are still able to mount the filesystem by UUID as before.
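Presumably the device-mapper stack that stratisd built stays active even after the service stops, and it is those dm devices that keep the drbd volumes open. A sketch of what would apparently have to be torn down before demotion (untested assumption on our part, device names abbreviated; the dmsetup remove lines are commented out because the order and exact names matter):

```
systemctl stop stratisd
dmsetup ls | grep stratis       # the stack survives the service stop
# remove the layers top-down before drbd can demote, e.g.:
# dmsetup remove stratis-1-...-thin-fs-...
# dmsetup remove stratis-1-private-...-thinpool-pool
# ...
drbdsetup secondary storage
```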

For reference, attached the commands used to build stratis under Debian 10:

build_stratis_debian10.txt

mulkieran commented 3 years ago

@jvinolas You should see links in "/dev/stratis". I see you used our Makefile install target, which should have properly installed the necessary udev-related bits to add the link, including 11-stratisd.rules and stratis_uuids_to_names. stratis_uuids_to_names logs on both success and failure. Can you check your system log for messages? An absence of any messages from the script would indicate that it had not run at all.
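To check whether those bits actually landed, something like the following could be run on the affected host (the paths are the usual install locations for a make-install on Linux, but they may differ on Debian; adjust as needed):

```shell
#!/bin/sh
# Report whether the stratisd udev rule and helper script are installed.
# check_install RULES_DIR BIN_DIR prints one line per expected file.
check_install() {
    rules_dir="$1"; bin_dir="$2"
    for f in "$rules_dir/11-stratisd.rules" "$bin_dir/stratis_uuids_to_names"; do
        if [ -e "$f" ]; then
            echo "present: $f"
        else
            echo "MISSING: $f"
        fi
    done
}

# Usual locations (assumption; adjust for your distribution)
check_install /usr/lib/udev/rules.d /usr/lib/udev
```

If either file is reported missing, the udev rule can never fire, which would explain both the absent /dev/stratis links and the silence in the system log.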

jvinolas commented 3 years ago

We can't see any /dev/stratis, nor /stratis. There are no udev rules in /etc/udev/rules.d. The only messages are:

[   31.009528] drbd storage: role( Secondary -> Primary )
[  224.185969] systemd-sysv-generator[2476]: SysV service '/etc/init.d/exim4' lacks a native systemd unit file. Automatically generating a unit file for compatibility. Please update package to include a native systemd unit file, in order to make it more safe and robust.
[  224.188505] systemd-sysv-generator[2476]: SysV service '/etc/init.d/openipmi' lacks a native systemd unit file. Automatically generating a unit file for compatibility. Please update package to include a native systemd unit file, in order to make it more safe and robust.
[  233.771369] SGI XFS with ACLs, security attributes, realtime, quota, no debug enabled
[  233.772505] XFS (dm-3): Mounting V5 Filesystem
[  233.775623] XFS (dm-3): Ending clean mount
[  233.775741] xfs filesystem being mounted at /run/stratisd/.mdv-15127785b61d41aaa01263e9bf30198a supports timestamps until 2038 (0x7fffffff)
[  233.776393] XFS (dm-3): Unmounting Filesystem
[  233.862612] device-mapper: thin: 253:4: growing the metadata device from 4096 to 933888 blocks
[  233.914623] device-mapper: thin: 253:4: growing the data device from 768 to 3659040 blocks
[  301.712084] XFS (dm-3): Mounting V5 Filesystem
[  301.717123] XFS (dm-3): Ending clean mount
[  301.717533] xfs filesystem being mounted at /run/stratisd/.mdv-15127785b61d41aaa01263e9bf30198a supports timestamps until 2038 (0x7fffffff)
[  301.719541] XFS (dm-3): Unmounting Filesystem
[  657.950625] XFS (dm-5): Mounting V5 Filesystem
[  657.956091] XFS (dm-5): Ending clean mount
[  657.959387] xfs filesystem being mounted at /mnt supports timestamps until 2038 (0x7fffffff)
[  685.826815] XFS (dm-5): Unmounting Filesystem
[  696.094713] drbd storage: State change failed: Device is held open by someone
[  696.094769] drbd storage: Failed: role( Primary -> Secondary )
[  712.445416] drbd storage: State change failed: Device is held open by someone
[  712.445469] drbd storage: Failed: role( Primary -> Secondary )
[  921.329198] XFS (dm-5): Mounting V5 Filesystem
[  921.333600] XFS (dm-5): Ending clean mount
[  921.336870] xfs filesystem being mounted at /mnt supports timestamps until 2038 (0x7fffffff)
[  927.987435] XFS (dm-5): Unmounting Filesystem
[ 1506.409563] drbd storage: State change failed: Device is held open by someone
[ 1506.409629] drbd storage: Failed: role( Primary -> Secondary )

And lsblk:

nvme0n1                                                 259:0     0   1,8T  0 disk    
└─drbd1001                                              147:1001  0   1,8T  0 disk    
  └─stratis-1-private-15127785b61d41aaa01263e9bf30198a-physical-originsub
                                                        253:0     0   3,5T  0 stratis 
    ├─stratis-1-private-15127785b61d41aaa01263e9bf30198a-flex-thinmeta
    │                                                   253:1     0   3,6G  0 stratis 
    │ └─stratis-1-private-15127785b61d41aaa01263e9bf30198a-thinpool-pool
    │                                                   253:4     0   3,5T  0 stratis 
    │   └─stratis-1-15127785b61d41aaa01263e9bf30198a-thin-fs-441c0549976c4293b1a0477be1d2b816
    │                                                   253:5     0     1T  0 stratis 
    ├─stratis-1-private-15127785b61d41aaa01263e9bf30198a-flex-thindata
    │                                                   253:2     0   3,5T  0 stratis 
    │ └─stratis-1-private-15127785b61d41aaa01263e9bf30198a-thinpool-pool
    │                                                   253:4     0   3,5T  0 stratis 
    │   └─stratis-1-15127785b61d41aaa01263e9bf30198a-thin-fs-441c0549976c4293b1a0477be1d2b816
    │                                                   253:5     0     1T  0 stratis 
    └─stratis-1-private-15127785b61d41aaa01263e9bf30198a-flex-mdv
                                                        253:3     0    16M  0 stratis 
nvme1n1                                                 259:1     0   1,8T  0 disk    
└─drbd1002                                              147:1002  0   1,8T  0 disk    
  └─stratis-1-private-15127785b61d41aaa01263e9bf30198a-physical-originsub
                                                        253:0     0   3,5T  0 stratis 
    ├─stratis-1-private-15127785b61d41aaa01263e9bf30198a-flex-thinmeta
    │                                                   253:1     0   3,6G  0 stratis 
    │ └─stratis-1-private-15127785b61d41aaa01263e9bf30198a-thinpool-pool
    │                                                   253:4     0   3,5T  0 stratis 
    │   └─stratis-1-15127785b61d41aaa01263e9bf30198a-thin-fs-441c0549976c4293b1a0477be1d2b816
    │                                                   253:5     0     1T  0 stratis 
    ├─stratis-1-private-15127785b61d41aaa01263e9bf30198a-flex-thindata
    │                                                   253:2     0   3,5T  0 stratis 
    │ └─stratis-1-private-15127785b61d41aaa01263e9bf30198a-thinpool-pool
    │                                                   253:4     0   3,5T  0 stratis 
    │   └─stratis-1-15127785b61d41aaa01263e9bf30198a-thin-fs-441c0549976c4293b1a0477be1d2b816
    │                                                   253:5     0     1T  0 stratis 
    └─stratis-1-private-15127785b61d41aaa01263e9bf30198a-flex-mdv
                                                        253:3     0    16M  0 stratis 
mulkieran commented 3 years ago

@jvinolas You definitely need to get your udev rules and script into the correct place. I don't know exactly why the install target isn't working for you, but it's worthwhile to verify that each of the actions in the target completed properly, and, if not, to carry them out manually. If the udev rule isn't triggered, the links will not be made.
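Once the rule file and helper script are in place, udev can be asked to re-process the existing block devices without recreating the pool (assuming standard udev tooling is available):

```
udevadm control --reload
udevadm trigger --subsystem-match=block --action=change
udevadm settle                 # wait for the events to be processed
ls -l /dev/stratis
```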

mulkieran commented 3 years ago

@jvinolas Another hint: after you create a pool or a filesystem, try listing to verify success. stratis pool list, stratis filesystem list, and stratis blockdev list will give you some insight into the state of the pool, etc., after your operations.
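For example, interleaved with the creation commands from the original report, the checks would look something like this (the exact output columns vary by stratis-cli version):

```
stratis pool create isard /dev/drbd1001 /dev/drbd1002
stratis pool list                # the pool "isard" should appear
stratis filesystem create isard storage
stratis filesystem list isard    # the filesystem "storage" should appear
stratis blockdev list isard      # both drbd devices should be listed as data devices
```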