Open kvaps opened 4 years ago
can you attach the dmesg output of all 4 satellites of the given timeframe?
Sure, I'm going to repeat this today
Okay I just tried to repeat the test:
Wed Jun 17 09:09:47 UTC 2020 - replicas created in linstor
Wed Jun 17 09:10:53 UTC 2020 - small drive mounted on linstor-dev-3, starting slow writer
Wed Jun 17 09:11:49 UTC 2020 - starting dd on big drive on linstor-dev-1
Wed Jun 17 09:14:14 UTC 2020 - slow writer paused writing
Wed Jun 17 09:14:47 UTC 2020 - dd finished with error, slow writer continued working
Wed Jun 17 09:16:47 UTC 2020 - stopped the slow writer
Wed Jun 17 09:18:45 UTC 2020 - unmounted small drive from linstor-dev-3
Wed Jun 17 09:19:41 UTC 2020 - deleted diskless replica of small drive from linstor-dev-3
Wed Jun 17 09:20:09 UTC 2020 - created diskless replica for small drive on linstor-dev-4
Wed Jun 17 09:21:34 UTC 2020 - tried to mount small drive on linstor-dev-4
Wed Jun 17 09:22:58 UTC 2020 - tried to mount small drive on linstor-dev-2
linstor-dev-1-dmesg.log linstor-dev-2-dmesg.log linstor-dev-3-dmesg.log linstor-dev-4-dmesg.log
This time there was a small difference: when the storage pool on linstor-dev-1 was overfilled, the slow writer stopped working for drive2 until the fast writer (dd on drive1) finished its work and returned the error. After that both drives became Diskless on linstor-dev-1, and the slow writer continued working:
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ Resource ┊ StoragePool ┊ VolNr ┊ MinorNr ┊ DeviceName ┊ Allocated ┊ InUse ┊ State ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ linstor-dev-1 ┊ drive1 ┊ thindata ┊ 0 ┊ 1000 ┊ /dev/drbd1000 ┊ 4.89 GiB ┊ Unused ┊ Diskless ┊
┊ linstor-dev-1 ┊ drive2 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 98.50 MiB ┊ Unused ┊ Diskless ┊
┊ linstor-dev-2 ┊ drive2 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 98.70 MiB ┊ Unused ┊ UpToDate ┊
┊ linstor-dev-3 ┊ drive2 ┊ DfltDisklessStorPool ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ ┊ InUse ┊ Diskless ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
After that I also stopped it and unmounted the drive, then successfully removed it:
# linstor r d linstor-dev-3 drive2
INFO:
The given resource will not be deleted but will be taken over as a linstor managed tiebreaker resource.
SUCCESS:
Resource 'drive2' updated on node 'linstor-dev-3'
SUCCESS:
Resource 'drive2' updated on node 'linstor-dev-2'
SUCCESS:
Resource 'drive2' updated on node 'linstor-dev-1'
But creating a new replica on linstor-dev-4 still failed:
# linstor r c linstor-dev-4 drive2 -s DfltDisklessStorPool
WARNING:
Description:
Resource will be automatically flagged as drbd diskless
Cause:
Used storage pool 'DfltDisklessStorPool' is diskless, but resource was not flagged drbd diskless
SUCCESS:
Successfully set property key(s): StorPoolName
INFO:
Tie breaker marked for deletion
SUCCESS:
Description:
New resource 'drive2' on node 'linstor-dev-4' registered.
Details:
Resource 'drive2' on node 'linstor-dev-4' UUID is: cfaa3820-654b-42bb-bf7b-8e0340c60b84
SUCCESS:
Description:
Volume with number '0' on resource 'drive2' on node 'linstor-dev-4' successfully registered
Details:
Volume UUID is: c930e77e-18a3-4fda-97a5-b73c4ffdaff6
SUCCESS:
Created resource 'drive2' on 'linstor-dev-4'
ERROR:
Description:
(Node: 'linstor-dev-3') Shutdown of the DRBD resource 'drive2 failed
Cause:
The external command for stopping the DRBD resource failed
Correction:
- Check whether the required software is installed
- Check whether the application's search path includes the location
of the external software
- Check whether the application has execute permission for the external command
Show reports:
linstor error-reports show 5EE9DCF0-D519A-000000
ERROR:
(Node: 'linstor-dev-2') Failed to adjust DRBD resource drive2
Show reports:
linstor error-reports show 5EE9DCF2-DC80A-000000
SUCCESS:
Added peer(s) 'linstor-dev-4' to resource 'drive2' on 'linstor-dev-1'
5EE9DCF0-D519A-000000.log 5EE9DCF2-DC80A-000000.log
# linstor v l
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ Resource ┊ StoragePool ┊ VolNr ┊ MinorNr ┊ DeviceName ┊ Allocated ┊ InUse ┊ State ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ linstor-dev-1 ┊ drive1 ┊ thindata ┊ 0 ┊ 1000 ┊ /dev/drbd1000 ┊ 4.89 GiB ┊ Unused ┊ Diskless ┊
┊ linstor-dev-1 ┊ drive2 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 98.50 MiB ┊ Unused ┊ Inconsistent ┊
┊ linstor-dev-2 ┊ drive2 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 98.91 MiB ┊ Unused ┊ UpToDate ┊
┊ linstor-dev-4 ┊ drive2 ┊ DfltDisklessStorPool ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ ┊ Unused ┊ Diskless ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Mounting it on linstor-dev-4 also failed:
root@linstor-dev-4:~# mount /dev/drbd1001 /mnt/
mount: /mnt: mount(2) system call failed: No data available.
root@linstor-dev-4:~# drbdadm status
drive2 role:Secondary
disk:Diskless quorum:no
linstor-dev-1 role:Secondary
peer-disk:Diskless
linstor-dev-2 connection:Connecting
but on linstor-dev-2 it was stuck:
root@linstor-dev-2:~# mount /dev/drbd1001 /mnt/
<stuck>
root@linstor-dev-2:~# drbdadm status
drive2 role:Secondary
disk:UpToDate
linstor-dev-1 role:Secondary
peer-disk:Diskless
linstor-dev-3 role:Secondary
peer-disk:Diskless
After that I left for lunch; when I came back I found that it was mounted in read-only mode:
root@linstor-dev-2:~# mount /dev/drbd1001 /mnt/
mount: /mnt: WARNING: device write-protected, mounted read-only.
but drbdadm status shows that it is Secondary:
root@linstor-dev-2:~# drbdadm status
drive2 role:Secondary
disk:UpToDate
linstor-dev-1 role:Secondary
peer-disk:Diskless
linstor-dev-3 role:Secondary
peer-disk:Diskless
The most interesting part is that I tried to remount it in read-write mode:
root@linstor-dev-2:~# mount -o remount,rw /mnt
and it succeeded, so I can write something to it, but it is still shown as Secondary:
root@linstor-dev-2:~# echo asdasd > /mnt/fffff
root@linstor-dev-2:~# drbdadm status
drive2 role:Secondary
disk:UpToDate
linstor-dev-1 role:Secondary
peer-disk:Diskless
linstor-dev-3 role:Secondary
peer-disk:Diskless
root@linstor-dev-2:~# cat /proc/mounts | grep /mnt
/dev/drbd1001 /mnt ext4 rw,relatime 0 0
Thus right now I have a DRBD device mounted as Secondary, and I can write to it.
root@linstor-dev-1:~# linstor r l
╭─────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊
╞═════════════════════════════════════════════════════════════════════════════════════╡
┊ drive1 ┊ linstor-dev-1 ┊ 7000 ┊ Unused ┊ Ok ┊ Diskless ┊
┊ drive2 ┊ linstor-dev-1 ┊ 7001 ┊ Unused ┊ Ok ┊ Diskless ┊
┊ drive2 ┊ linstor-dev-2 ┊ 7001 ┊ Unused ┊ Ok ┊ UpToDate ┊
┊ drive2 ┊ linstor-dev-4 ┊ 7001 ┊ Unused ┊ Connecting(linstor-dev-2) ┊ Diskless ┊
╰─────────────────────────────────────────────────────────────────────────────────────╯
I also tried to remove the big drive, then mount the small one on linstor-dev-2, but it was mounted the same way, in read-only mode. After that I remounted it in read-write mode and created a new file on it, then unmounted it and disconnected on linstor-dev-1:
drbdadm disconnect drive2
drbdsetup resource-options --quorum=off drive2
drbdadm primary drive2 --force
After that I mounted it and checked the data; both files existed there, so I unmounted it and ran adjust for it. Finally it went into the Inconsistent state on linstor-dev-2.
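Roughly, the check would have looked something like this (the mountpoint and the adjust step are my assumptions about how this was done):
mount /dev/drbd1001 /mnt/
ls -l /mnt/          # both fffff and the newly created file were still there
umount /mnt
drbdadm adjust drive2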
Then I tried to bring down the mounted Secondary on linstor-dev-2:
# drbdadm down drive2
drive2: State change failed: (-10) State change was refused by peer node
additional info from kernel:
Declined by peer linstor-dev-4 (id: 3), see the kernel log there
Command 'drbdsetup down drive2' terminated with exit code 11
dmesg:
[ 8834.391680] drbd drive2: Preparing cluster-wide state change 1166730599 (1->3 496/16)
[ 8836.404644] drbd drive2: Declined by peer linstor-dev-4 (id: 3), see the kernel log there
[ 8836.405551] drbd drive2: Aborting cluster-wide state change 1166730599 (2012ms) rv = -10
# drbdadm disconnect drive2 --force
drive2: Failure: (162) Invalid configuration request
additional info from kernel:
unknown connection
Command 'drbdsetup disconnect drive2 0 --force' terminated with exit code 10
OK, I just found another bug. In the end I removed all the resources and rebooted all the nodes, then created them again, but this time only on two nodes:
(11:48:02) linstor-dev-1 # linstor r c linstor-dev-1 drive1 -s thindata
SUCCESS:
Successfully set property key(s): StorPoolName
SUCCESS:
Description:
New resource 'drive1' on node 'linstor-dev-1' registered.
Details:
Resource 'drive1' on node 'linstor-dev-1' UUID is: 9b03e524-718b-4cb1-a01d-bda581728f06
SUCCESS:
Description:
Volume with number '0' on resource 'drive1' on node 'linstor-dev-1' successfully registered
Details:
Volume UUID is: fb372fe1-cc14-42f2-8a4c-ef84df3d65b8
SUCCESS:
Created resource 'drive1' on 'linstor-dev-1'
SUCCESS:
Description:
Resource 'drive1' on 'linstor-dev-1' ready
Details:
Node(s): 'linstor-dev-1', Resource: 'drive1'
(11:48:06) linstor-dev-1 # linstor r c linstor-dev-1 drive2 -s thindata
SUCCESS:
Successfully set property key(s): StorPoolName
SUCCESS:
Description:
New resource 'drive2' on node 'linstor-dev-1' registered.
Details:
Resource 'drive2' on node 'linstor-dev-1' UUID is: df3089f8-dfa1-4627-a370-5e88b58f6998
SUCCESS:
Description:
Volume with number '0' on resource 'drive2' on node 'linstor-dev-1' successfully registered
Details:
Volume UUID is: 6f6ea7d2-3d67-4840-9b9c-f8e449370a0e
SUCCESS:
Created resource 'drive2' on 'linstor-dev-1'
SUCCESS:
Description:
Resource 'drive2' on 'linstor-dev-1' ready
Details:
Node(s): 'linstor-dev-1', Resource: 'drive2'
(11:48:07) linstor-dev-1 # linstor r c linstor-dev-2 drive2 -s thindata
SUCCESS:
Successfully set property key(s): StorPoolName
INFO:
Tie breaker resource 'drive2' created on linstor-dev-3
INFO:
Resource-definition property 'DrbdOptions/Resource/quorum' updated from 'off' to 'majority' by auto-quorum
INFO:
Resource-definition property 'DrbdOptions/Resource/on-no-quorum' updated from 'off' to 'io-error' by auto-quorum
SUCCESS:
Description:
New resource 'drive2' on node 'linstor-dev-2' registered.
Details:
Resource 'drive2' on node 'linstor-dev-2' UUID is: e9262a6a-6ca7-4655-8f2e-2973d9797d1b
SUCCESS:
Description:
Volume with number '0' on resource 'drive2' on node 'linstor-dev-2' successfully registered
Details:
Volume UUID is: 64aa1c8e-804c-4c62-8187-23d9b58631ab
SUCCESS:
Added peer(s) 'linstor-dev-2' to resource 'drive2' on 'linstor-dev-3'
SUCCESS:
Added peer(s) 'linstor-dev-2' to resource 'drive2' on 'linstor-dev-1'
SUCCESS:
Created resource 'drive2' on 'linstor-dev-2'
SUCCESS:
Description:
Resource 'drive2' on 'linstor-dev-2' ready
Details:
Node(s): 'linstor-dev-2', Resource: 'drive2'
SUCCESS:
Created resource 'drive2' on 'linstor-dev-3'
SUCCESS:
Added peer(s) 'linstor-dev-3' to resource 'drive2' on 'linstor-dev-1'
SUCCESS:
Added peer(s) 'linstor-dev-3' to resource 'drive2' on 'linstor-dev-2'
SUCCESS:
Description:
Resource 'drive2' on 'linstor-dev-3' ready
Details:
Node(s): 'linstor-dev-2', Resource: 'drive2'
Then I tried to make a filesystem on linstor-dev-2:
(11:48:52) linstor-dev-2 # mkfs.ext4 /dev/drbd1001
mke2fs 1.45.5 (07-Jan-2020)
^C
(11:50:24) linstor-dev-2 # mkfs.ext4 /dev/drbd1001
mke2fs 1.45.5 (07-Jan-2020)
/dev/drbd1001: Read-only file system while setting up superblock
linstor-dev-1-dmesg.log linstor-dev-2-dmesg.log linstor-dev-3-dmesg.log
same error?
(11:50:19) linstor-dev-1 # linstor r l -a
╭───────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊
╞═══════════════════════════════════════════════════════════════════╡
┊ drive1 ┊ linstor-dev-1 ┊ 7000 ┊ Unused ┊ Ok ┊ UpToDate ┊
┊ drive2 ┊ linstor-dev-1 ┊ 7001 ┊ Unused ┊ Ok ┊ UpToDate ┊
┊ drive2 ┊ linstor-dev-2 ┊ 7001 ┊ Unused ┊ Ok ┊ UpToDate ┊
┊ drive2 ┊ linstor-dev-3 ┊ 7001 ┊ Unused ┊ Ok ┊ TieBreaker ┊
╰───────────────────────────────────────────────────────────────────╯
Also, removing the tiebreaker failed:
(11:59:08) linstor-dev-1 # linstor r d linstor-dev-3 drive2
INFO:
Disabling auto-tiebreaker on resource-definition 'drive2' as tiebreaker resource was manually deleted
INFO:
Resource-definition property 'DrbdOptions/Resource/quorum' was removed as there are not enough resources for quorum
INFO:
Resource-definition property 'DrbdOptions/Resource/on-no-quorum' was removed as there are not enough resources for quorum
SUCCESS:
Description:
Node: linstor-dev-3, Resource: drive2 marked for deletion.
Details:
Node: linstor-dev-3, Resource: drive2 UUID is: 399876b3-6be7-4ac1-b022-7a9a96b7c38a
SUCCESS:
Notified 'linstor-dev-1' that 'drive2' is being deleted on Node(s): [linstor-dev-3]
ERROR:
Description:
(Node: 'linstor-dev-3') Shutdown of the DRBD resource 'drive2 failed
Cause:
The external command for stopping the DRBD resource failed
Correction:
- Check whether the required software is installed
- Check whether the application's search path includes the location
of the external software
- Check whether the application has execute permission for the external command
Show reports:
linstor error-reports show 5EEA01FE-D519A-000000
ERROR:
(Node: 'linstor-dev-2') Failed to adjust DRBD resource drive2
Show reports:
linstor error-reports show 5EEA01FA-DC80A-000000
5EEA01FE-D519A-000000.log 5EEA01FA-DC80A-000000.log
linstor-dev-1-dmesg.log linstor-dev-2-dmesg.log linstor-dev-3-dmesg.log
Also, the removal procedure was somewhat weird:
(12:04:23) linstor-dev-1 # linstor rd d drive2
SUCCESS:
Description:
Resource definition 'drive2' marked for deletion.
Details:
Resource definition 'drive2' UUID is: ceb7c124-daf0-4ab7-9f9d-668997e2369e
SUCCESS:
Notified 'linstor-dev-1' that diskless resources of 'drive2' are being deleted
ERROR:
Description:
(Node: 'linstor-dev-3') Shutdown of the DRBD resource 'drive2 failed
Cause:
The external command for stopping the DRBD resource failed
Correction:
- Check whether the required software is installed
- Check whether the application's search path includes the location
of the external software
- Check whether the application has execute permission for the external command
Show reports:
linstor error-reports show 5EEA01FE-D519A-000005
ERROR:
(Node: 'linstor-dev-2') Failed to adjust DRBD resource drive2
Show reports:
linstor error-reports show 5EEA01FA-DC80A-000005
(12:04:26) linstor-dev-1 # drbdadm status
drive2 role:Secondary
disk:UpToDate
linstor-dev-2 role:Secondary
peer-disk:UpToDate
5EEA01FE-D519A-000005.log 5EEA01FA-DC80A-000005.log
Then I rebooted all five nodes and tried again:
(12:05:00) linstor-dev-1 # drbdadm status
# No currently configured DRBD found.
(12:05:04) linstor-dev-1 # linstor rd l
╭────────────────────────────────────────────────╮
┊ ResourceName ┊ Port ┊ ResourceGroup ┊ State ┊
╞════════════════════════════════════════════════╡
┊ drive2 ┊ 7001 ┊ test ┊ DELETING ┊
╰────────────────────────────────────────────────╯
(12:05:08) linstor-dev-1 # linstor rd d drive2
SUCCESS:
Description:
Resource definition 'drive2' marked for deletion.
Details:
Resource definition 'drive2' UUID is: ceb7c124-daf0-4ab7-9f9d-668997e2369e
WARNING:
Description:
No active connection to satellite 'linstor-dev-3'
Details:
The controller is trying to (re-) establish a connection to the satellite. The controller stored the changes and as soon the satellite is connected, it will receive this update.
SUCCESS:
Notified 'linstor-dev-2' that diskless resources of 'drive2' are being deleted
SUCCESS:
Notified 'linstor-dev-1' that diskless resources of 'drive2' are being deleted
(12:05:12) linstor-dev-1 # linstor r l
╭─────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊
╞═════════════════════════════════════════════════════════════════╡
┊ drive2 ┊ linstor-dev-1 ┊ 7001 ┊ Unused ┊ Ok ┊ UpToDate ┊
┊ drive2 ┊ linstor-dev-2 ┊ 7001 ┊ Unused ┊ Ok ┊ UpToDate ┊
╰─────────────────────────────────────────────────────────────────╯
linstor-dev-1-dmesg.log linstor-dev-2-dmesg.log linstor-dev-3-dmesg.log
But after a short period drive2 was successfully deleted.
The last bug can be simply reproduced. I have a clean LINSTOR with just thinlvm pools created, then I do:
linstor rd c drive3 --resource-group test
linstor rd c drive4 --resource-group test
linstor vd c drive3 10G
linstor vd c drive4 2G
linstor r c linstor-dev-5 drive3 -s thindata # (12:28:28)
linstor r c linstor-dev-5 drive4 -s thindata # (12:28:29)
linstor r c linstor-dev-4 drive4 -s thindata # (12:28:30)
Afterwards you can try to use the drive on linstor-dev-4; it will be unusable:
(12:29:25) linstor-dev-4 # mkfs.ext4 /dev/drbd1001
mke2fs 1.45.5 (07-Jan-2020)
/dev/drbd1001: Read-only file system while setting up superblock
(12:33:43) linstor-dev-4 # drbdadm status
drive4 role:Secondary
disk:UpToDate
linstor-dev-1 role:Secondary
peer-disk:Diskless
linstor-dev-5 role:Secondary
peer-disk:UpToDate
linstor-dev-4-dmesg.log linstor-dev-5-dmesg.log linstor-dev-1-dmesg.log
Should I report it to drbd-user@lists.linbit.com?
OK, another bug:
(12:41:06) linstor-dev-1 # linstor c sp DrbdOptions/auto-add-quorum-tiebreaker False
(12:41:29) linstor-dev-1 # linstor rd c drive3 --resource-group test
(12:41:42) linstor-dev-1 # linstor rd c drive4 --resource-group test
(12:41:42) linstor-dev-1 # linstor vd c drive3 10G
(12:41:43) linstor-dev-1 # linstor vd c drive4 2G
(12:41:48) linstor-dev-1 # linstor r c linstor-dev-5 drive3 -s thindata
(12:41:49) linstor-dev-1 # linstor r c linstor-dev-5 drive4 -s thindata
(12:41:50) linstor-dev-1 # linstor r c linstor-dev-4 drive4 -s thindata
(12:43:30) linstor-dev-4 # mkfs.ext4 /dev/drbd1001
(12:47:05) linstor-dev-4 # mount /dev/drbd1001 /mnt/
(12:47:34) linstor-dev-4 # touch /mnt/test
(12:47:35) linstor-dev-4 # while [ $(du -bs /mnt/test | cut -d$'\t' -f1) -lt 1073741824 ]; do echo $((i++)) >> /mnt/test; done
(12:48:06) linstor-dev-5 # dd if=/dev/zero of=/dev/drbd1000 bs=16k status=progress
# Wed Jun 17 12:50:20 UTC 2020 - io stopped for drive4
# Wed Jun 17 12:51:15 UTC 2020 - dd returned error, io continued for drive4
(12:51:15) linstor-dev-1 # linstor v l
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ Resource ┊ StoragePool ┊ VolNr ┊ MinorNr ┊ DeviceName ┊ Allocated ┊ InUse ┊ State ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ linstor-dev-5 ┊ drive3 ┊ thindata ┊ 0 ┊ 1000 ┊ /dev/drbd1000 ┊ 4.89 GiB ┊ Unused ┊ Diskless ┊
┊ linstor-dev-4 ┊ drive4 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 98.70 MiB ┊ InUse ┊ UpToDate ┊
┊ linstor-dev-5 ┊ drive4 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 98.50 MiB ┊ Unused ┊ Diskless ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────╯
# Wed Jun 17 12:52:58 UTC 2020 - finished test for drive4
(12:53:25) linstor-dev-1 # linstor rd d drive3
(12:54:19) linstor-dev-4 # touch /mnt/test2
(12:54:22) linstor-dev-4 # while [ $(du -bs /mnt/test2 | cut -d$'\t' -f1) -lt 1073741824 ]; do echo $((i++)) >> /mnt/test2; done
(12:53:57) linstor-dev-1 # linstor v l
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ Resource ┊ StoragePool ┊ VolNr ┊ MinorNr ┊ DeviceName ┊ Allocated ┊ InUse ┊ State ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ linstor-dev-4 ┊ drive4 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 98.91 MiB ┊ InUse ┊ UpToDate ┊
┊ linstor-dev-5 ┊ drive4 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 98.50 MiB ┊ Unused ┊ Diskless ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────╯
(12:54:01) linstor-dev-1 # linstor v l
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ Resource ┊ StoragePool ┊ VolNr ┊ MinorNr ┊ DeviceName ┊ Allocated ┊ InUse ┊ State ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ linstor-dev-4 ┊ drive4 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 98.91 MiB ┊ InUse ┊ UpToDate ┊
┊ linstor-dev-5 ┊ drive4 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 1.67 GiB ┊ Unused ┊ UpToDate ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────╯
(12:54:45) linstor-dev-5 # drbdadm adjust drive4
Marked additional 2055 MB as out-of-sync based on AL.
(12:55:00) linstor-dev-5 # drbdadm status
drive4 role:Secondary
disk:Inconsistent
linstor-dev-4 role:Primary
replication:SyncTarget peer-disk:UpToDate done:29.25
# Wed Jun 17 12:55:45 UTC 2020 - second test stopped
(12:57:16) linstor-dev-4 # umount /mnt
(12:57:21) linstor-dev-4 # fsck.ext4 /dev/drbd1001
e2fsck 1.45.5 (07-Jan-2020)
/dev/drbd1001: clean, 13/131376 files, 26353/525190 blocks
(13:03:16) linstor-dev-1 # linstor v l
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ Resource ┊ StoragePool ┊ VolNr ┊ MinorNr ┊ DeviceName ┊ Allocated ┊ InUse ┊ State ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ linstor-dev-4 ┊ drive4 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 98.91 MiB ┊ Unused ┊ UpToDate ┊
┊ linstor-dev-5 ┊ drive4 ┊ thindata ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 1.67 GiB ┊ Unused ┊ UpToDate ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Do you see that drive4 on linstor-dev-5 has 1.67 GiB allocated but on linstor-dev-4 just 98.91 MiB, while both are UpToDate?
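To cross-check the actual thin-LV usage one could run lvs directly on both nodes (the VG name vg below is just a placeholder):
lvs -o lv_name,pool_lv,lv_size,data_percent vg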
Hi, to reproduce this issue I created 5 VMs and deployed LINSTOR on them; the versions are:
I've created 5 nodes and 5G thinlvm storage pools on them:
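Roughly, the setup commands would look like this (the IP address and the vg/thindata thin-pool path are placeholders, and the same two commands were repeated for each of the five nodes):
linstor node create linstor-dev-1 10.0.0.1
linstor storage-pool create lvmthin linstor-dev-1 thindata vg/thindata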
Then I created a new resource group:
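Presumably something like this (the --storage-pool option is my guess; the group name test matches the listings below):
linstor resource-group create test --storage-pool thindata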
And two drives:
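By analogy with the reproduction commands further below, approximately:
linstor rd c drive1 --resource-group test
linstor rd c drive2 --resource-group test
linstor vd c drive1 10G
linstor vd c drive2 2G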
Then I placed them like this:
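Judging from the volume list shown earlier in this thread, the placement was roughly:
linstor r c linstor-dev-1 drive1 -s thindata
linstor r c linstor-dev-1 drive2 -s thindata
linstor r c linstor-dev-2 drive2 -s thindata
linstor r c linstor-dev-3 drive2 -s DfltDisklessStorPool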
On node3 I started writing some data to it (slowly):
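Presumably the same slow writer as in the later test, e.g.:
mount /dev/drbd1001 /mnt/    # small drive, if not mounted yet
touch /mnt/test
while [ $(du -bs /mnt/test | cut -d$'\t' -f1) -lt 1073741824 ]; do echo $((i++)) >> /mnt/test; done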
On node1 I started filling the big drive with zeroes:
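Presumably the same dd as in the later test:
dd if=/dev/zero of=/dev/drbd1000 bs=16k status=progress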
After dd sent all 10 gigs, it expectedly threw an error:
because the 5G pool was completely overfilled:
But the slow writer kept working; only the first replica of its drive became Diskless:
I let it work for a while, then stopped it, unmounted it and checked drbdadm status:
Then I tried to remove the diskless replica:
And attach it on another node, but something went wrong:
5EE8CBBA-D519A-000000.log 5EE8CBEB-DC80A-000000.log drive2-on-linstor-dev-2.txt drive2-on-linstor-dev-4.txt
After that I also tried to mount the small drive on a live diskful node:
but the command was stuck, even:
was stuck