immanuelfodor closed this issue 3 years ago
Just a minor addition: the StorageClass was set up with the ext4 filesystem and the nodes are using xfs, but I think that shouldn't be a problem for mount.
You want to use the top-most device in the device stack configured by Piraeus. By default, these are the /dev/drbd* devices. You can find out which PVC maps to which DRBD device by running:
$ kubectl exec -it deployment/piraeus-op-cs-controller -- linstor volume list
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ Resource ┊ StoragePool ┊ VolNr ┊ MinorNr ┊ DeviceName ┊ Allocated ┊ InUse ┊ State ┊
╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ centos-7-k8s-100.test ┊ pvc-3b4a1dac-404c-49fd-9d6c-9738e4834afb ┊ autopool-sda ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 214.30 MiB ┊ Unused ┊ UpToDate ┊
┊ centos-7-k8s-101.test ┊ pvc-3b4a1dac-404c-49fd-9d6c-9738e4834afb ┊ autopool-sda ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 27.95 MiB ┊ Unused ┊ UpToDate ┊
┊ centos-7-k8s-102.test ┊ pvc-3b4a1dac-404c-49fd-9d6c-9738e4834afb ┊ DfltDisklessStorPool ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ ┊ InUse ┊ Diskless ┊
┊ centos-7-k8s-100.test ┊ pvc-5628b44d-7b33-41d2-be23-fd84d494120f ┊ autopool-sda ┊ 0 ┊ 1000 ┊ /dev/drbd1000 ┊ 3.99 MiB ┊ Unused ┊ UpToDate ┊
┊ centos-7-k8s-101.test ┊ pvc-5628b44d-7b33-41d2-be23-fd84d494120f ┊ autopool-sda ┊ 0 ┊ 1000 ┊ /dev/drbd1000 ┊ 188.51 MiB ┊ Unused ┊ UpToDate ┊
┊ centos-7-k8s-102.test ┊ pvc-5628b44d-7b33-41d2-be23-fd84d494120f ┊ DfltDisklessStorPool ┊ 0 ┊ 1000 ┊ /dev/drbd1000 ┊ ┊ InUse ┊ Diskless ┊
┊ centos-7-k8s-100.test ┊ pvc-66ecc826-e4ec-49e6-875b-bd07574db872 ┊ autopool-sda ┊ 0 ┊ 1003 ┊ /dev/drbd1003 ┊ 3.99 MiB ┊ Unused ┊ UpToDate ┊
┊ centos-7-k8s-101.test ┊ pvc-66ecc826-e4ec-49e6-875b-bd07574db872 ┊ autopool-sda ┊ 0 ┊ 1003 ┊ /dev/drbd1003 ┊ 173.17 MiB ┊ Unused ┊ UpToDate ┊
┊ centos-7-k8s-102.test ┊ pvc-66ecc826-e4ec-49e6-875b-bd07574db872 ┊ DfltDisklessStorPool ┊ 0 ┊ 1003 ┊ /dev/drbd1003 ┊ ┊ Unused ┊ Diskless ┊
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
You can mount the devices on the host:
$ mount /dev/drbd1000 /mnt/
$ ll /mnt/
total 4
-rw-rw-r-- 1 centos centos 4 Dec 22 13:57 bla
If you encounter an error like:
$ mount /dev/drbd1002 /mnt/
mount: /dev/drbd1002 is write-protected, mounting read-only
mount: unknown filesystem type '(null)'
that means no file system was created on the PV. File systems are created on first use, i.e. when the first pod attaches to the PV. So you might need to create a dummy pod (or run mkfs.ext4 manually).
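If you go the dummy-pod route, a minimal sketch could look like this (the PVC name and image are placeholders, not taken from your setup):
$ cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: fs-init-dummy
spec:
  containers:
    - name: sleep
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-pvc        # placeholder: the PVC that should get a file system
EOF
$ kubectl delete pod fs-init-dummy   # once the pod has started, the file system exists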
Thank you for the explanation, it really came in time as I have a problem now that might involve "rescuing" the data from the LVs.
I needed to restart the physical host yesterday that is running the cluster. After the reboot, all pods that weren't using storage at all or were using NFS storage came online, but all Piraeus-backed pods failed. They were stuck in the ContainerInit or ContainerCreating phase, and their k8s event log showed CSINode XXYY does not contain driver linstor.csi.linbit.com upon describe. The CSI node containers were running fine for about a minute, then restarted and repeated this cycle in tandem with their respective satellites on each node. After finding https://github.com/NetApp/trident/issues/473#issuecomment-726646604 I tried deleting the csi-node daemonset, redeploying it and then restarting all the nodes, but it didn't help. I left it in this state for an hour, as I hadn't applied the remove label of the HA controller (https://github.com/piraeusdatastore/piraeus-ha-controller/issues/3), and thought maybe it just needs time to reassign the PVs from the nodes the pods were previously attached to. However, it stayed in this state, and the csi node pods had restarted 30+ times by then, but it was late, so I went to sleep.
This morning, I checked the cluster, and magic happened: all pods with Piraeus storage were running fine except two. The CSI node pods were also running fine with no more restarts. The two pods stuck in ContainerCreating show the following errors (snapshot from this morning, hence the 4-5h timestamps; it started over during the night at some point):
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 50m (x25 over 4h38m) kubelet, node1 Unable to attach or mount volumes: unmounted volumes=[config], unattached volumes=[default-token-962t8 config media]: timed out waiting for the condition
Warning FailedMount 36m (x24 over 4h57m) kubelet, node1 Unable to attach or mount volumes: unmounted volumes=[config], unattached volumes=[media default-token-962t8 config]: timed out waiting for the condition
Warning FailedMount 6m57s (x76 over 4h54m) kubelet, node1 Unable to attach or mount volumes: unmounted volumes=[config], unattached volumes=[config media default-token-962t8]: timed out waiting for the condition
Warning FailedMount 63s (x59 over 4h56m) kubelet, node1 MountVolume.SetUp failed for volume "pvc-3d1a3ebb-7a61-4307-b253-1ec13d5c473e" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
config is a Piraeus volume, media is an NFS volume.
When I SSH into any of the three nodes and try to run lvs, it takes a long time to respond (about 10-15s); it was instantaneous before. linstor volume list within the satellite pods is not responding at all. linstor volume-definition list is all OK, but linstor physical-storage list is never responding either. linstor node list says one node is offline, although all three satellite pods are running fine.
I really don't get what has happened, how it almost solved itself over the night and how it got into this troubled state. Do you have any idea?
From your description it sounds like the satellite pods were running into errors. The csi-node container restarts if it does not find a matching satellite pod on the same node, and it sounds like LINSTOR cannot communicate with (one of) the satellites either.
Perhaps a kubectl rollout restart daemonset/piraeus-ns-node would help?
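To verify that a csi-node pod and a satellite pod are actually paired up on every node, something like this should show the placement (assuming the piraeus namespace and the default pod name prefixes):
$ kubectl get pods -n piraeus -o wide | grep -E 'ns-node|csi-node'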
That didn't help too much, one satellite is still considered offline by LINSTOR while its pod is running fine. The two consumer pods stuck in ContainerCreating are running on the same node as far as I can see, and now they have this event log after the rollout and after killing them to pick up any change:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m37s default-scheduler Successfully assigned media/jellyfin-6fb9d8c47b-tcmgp to node3
Warning FailedAttachVolume 85s (x11 over 7m37s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-3d1a3ebb-7a61-4307-b253-1ec13d5c473e" : rpc error: code = NotFound desc = ControllerPublishVolume failed for pvc-3d1a3ebb-7a61-4307-b253-1ec13d5c473e on node node3: node OFFLINE
Warning FailedMount 66s (x3 over 5m34s) kubelet, node3 Unable to attach or mount volumes: unmounted volumes=[config], unattached volumes=[config media default-token-962t8]: timed out waiting for the condition
The only difference I can see in the satellite logs is that node3 doesn't output any log starting with [MainWorkerPool-1].
The linstor commands are still not responding to volume or physical storage listing on any satellite.
I also tried to mount the DRBD volumes based on the first comment: https://github.com/piraeusdatastore/piraeus-operator/issues/142#issuecomment-753835974
The linstor commands surprisingly work from deployment/piraeus-op-cs-controller.
$ kubectl exec -it deployment/piraeus-op-cs-controller -- linstor volume list
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ Resource ┊ StoragePool ┊ VolNr ┊ MinorNr ┊ DeviceName ┊ Allocated ┊ InUse ┊ State ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ node1 ┊ pvc-07614359-5f50-4953-b699-97f98db11553 ┊ lvm-thin ┊ 0 ┊ 1002 ┊ /dev/drbd1002 ┊ 30.88 MiB ┊ InUse ┊ UpToDate ┊
┊ node2 ┊ pvc-07614359-5f50-4953-b699-97f98db11553 ┊ lvm-thin ┊ 0 ┊ 1002 ┊ /dev/drbd1002 ┊ 52 MiB ┊ Unused ┊ UpToDate ┊
┊ node3 ┊ pvc-07614359-5f50-4953-b699-97f98db11553 ┊ lvm-thin ┊ 0 ┊ 1002 ┊ /dev/drbd1002 ┊ 52 MiB ┊ ┊ Unknown ┊
┊ node1 ┊ pvc-3d1a3ebb-7a61-4307-b253-1ec13d5c473e ┊ lvm-thin ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 299.15 MiB ┊ Unused ┊ Outdated ┊
┊ node2 ┊ pvc-3d1a3ebb-7a61-4307-b253-1ec13d5c473e ┊ lvm-thin ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 1.00 GiB ┊ Unused ┊ Outdated ┊
┊ node3 ┊ pvc-3d1a3ebb-7a61-4307-b253-1ec13d5c473e ┊ lvm-thin ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 1.00 GiB ┊ ┊ Unknown ┊
┊ node1 ┊ pvc-5a9ca974-743d-4593-a9be-f90337505bc0 ┊ lvm-thin ┊ 0 ┊ 1003 ┊ /dev/drbd1003 ┊ 1.94 MiB ┊ InUse ┊ UpToDate ┊
┊ node2 ┊ pvc-5a9ca974-743d-4593-a9be-f90337505bc0 ┊ lvm-thin ┊ 0 ┊ 1003 ┊ /dev/drbd1003 ┊ 52 MiB ┊ Unused ┊ UpToDate ┊
┊ node3 ┊ pvc-5a9ca974-743d-4593-a9be-f90337505bc0 ┊ lvm-thin ┊ 0 ┊ 1003 ┊ /dev/drbd1003 ┊ 52 MiB ┊ ┊ Unknown ┊
┊ node1 ┊ pvc-9e920c35-8a7b-4022-8597-c8b34dfa4dc1 ┊ lvm-thin ┊ 0 ┊ 1000 ┊ /dev/drbd1000 ┊ 47.58 MiB ┊ Unused ┊ Outdated ┊
┊ node2 ┊ pvc-9e920c35-8a7b-4022-8597-c8b34dfa4dc1 ┊ lvm-thin ┊ 0 ┊ 1000 ┊ /dev/drbd1000 ┊ 304 MiB ┊ Unused ┊ Outdated ┊
┊ node3 ┊ pvc-9e920c35-8a7b-4022-8597-c8b34dfa4dc1 ┊ lvm-thin ┊ 0 ┊ 1000 ┊ /dev/drbd1000 ┊ 304 MiB ┊ ┊ Unknown ┊
┊ node1 ┊ pvc-d8709f6e-a078-4091-a445-3c0f7926b3ad ┊ lvm-thin ┊ 0 ┊ 1004 ┊ /dev/drbd1004 ┊ 4.50 MiB ┊ InUse ┊ UpToDate ┊
┊ node2 ┊ pvc-d8709f6e-a078-4091-a445-3c0f7926b3ad ┊ lvm-thin ┊ 0 ┊ 1004 ┊ /dev/drbd1004 ┊ 92 MiB ┊ Unused ┊ UpToDate ┊
┊ node3 ┊ pvc-d8709f6e-a078-4091-a445-3c0f7926b3ad ┊ lvm-thin ┊ 0 ┊ 1004 ┊ /dev/drbd1004 ┊ 92 MiB ┊ ┊ Unknown ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
$ mkdir -p /mnt/drbd
$ ls -la /dev/drbd100*
brw-rw----. 1 root disk 147, 1000 Jan 3 19:06 /dev/drbd1000
brw-rw----. 1 root disk 147, 1001 Jan 3 19:06 /dev/drbd1001
brw-rw----. 1 root disk 147, 1002 Jan 3 19:06 /dev/drbd1002
brw-rw----. 1 root disk 147, 1003 Jan 3 19:06 /dev/drbd1003
brw-rw----. 1 root disk 147, 1004 Jan 3 19:06 /dev/drbd1004
$ mount /dev/drbd1000 /mnt/drbd
mount: /mnt/drbd: mount(2) system call failed: Wrong medium type.
And I get the same error on every node for every DRBD device.
The other two satellites have also died by this morning :(
Both contain the following error logs:
linstor-satellite Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Thread-7"
linstor-satellite Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "DrbdEventService"
linstor-satellite Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "NetComService"
According to the above screenshot, they are not using even half of their memory limit (the recommended values from the extras folder).
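(The screenshot isn't reproduced here; roughly the same numbers can be checked with the following, assuming metrics-server is available in the cluster:)
$ kubectl top pods -n piraeus --containers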
The node3 satellite is still ready although it's not outputting any MainWorkerPool-1 logs.
The CS controller gives Unknown status to all linstor volume list entries and OFFLINE to all node list entries.
Although I'm on the brink of starting the whole Piraeus deployment over (or looking for a more stable alternative that doesn't break upon a host restart :( ), I still can't mount the volumes to back up the data stuck in the LVs via the DRBD devices. Would it be lost forever if I delete the whole Piraeus namespace and remove DRBD from the nodes? If Piraeus and DRBD are removed, could I access the LVs on node1 (as the last healthy node) for a direct mount somehow?
Oh my, this doesn't look good.
node1:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
pvc-07614359-5f50-4953-b699-97f98db11553_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 59.38
pvc-3d1a3ebb-7a61-4307-b253-1ec13d5c473e_00000 linstor_thinpool Vwi-aotz-- 1.00g thinpool 29.10
pvc-5a9ca974-743d-4593-a9be-f90337505bc0_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 3.73
pvc-9e920c35-8a7b-4022-8597-c8b34dfa4dc1_00000 linstor_thinpool Vwi-aotz-- 304.00m thinpool 15.65
pvc-d8709f6e-a078-4091-a445-3c0f7926b3ad_00000 linstor_thinpool Vwi-aotz-- 92.00m thinpool 4.89
thinpool linstor_thinpool twi-aotz-- 7.98g 4.70 13.18
node2:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
pvc-07614359-5f50-4953-b699-97f98db11553_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 89.66
pvc-3d1a3ebb-7a61-4307-b253-1ec13d5c473e_00000 linstor_thinpool Vwi-aotz-- 1.00g thinpool 28.55
pvc-5a9ca974-743d-4593-a9be-f90337505bc0_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 78.49
pvc-9e920c35-8a7b-4022-8597-c8b34dfa4dc1_00000 linstor_thinpool Vwi-aotz-- 304.00m thinpool 11.23
pvc-d8709f6e-a078-4091-a445-3c0f7926b3ad_00000 linstor_thinpool Vwi-aotz-- 92.00m thinpool 84.92
thinpool linstor_thinpool twi-aotz-- 7.98g 6.04 13.82
node3:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
pvc-07614359-5f50-4953-b699-97f98db11553_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 10.70
pvc-3d1a3ebb-7a61-4307-b253-1ec13d5c473e_00000 linstor_thinpool Vwi-aotz-- 1.00g thinpool 29.34
pvc-5a9ca974-743d-4593-a9be-f90337505bc0_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 70.79
pvc-9e920c35-8a7b-4022-8597-c8b34dfa4dc1_00000 linstor_thinpool Vwi-aotz-- 304.00m thinpool 10.86
pvc-d8709f6e-a078-4091-a445-3c0f7926b3ad_00000 linstor_thinpool Vwi-aotz-- 92.00m thinpool 78.46
thinpool linstor_thinpool twi-aotz-- 7.98g 5.50 13.62
The PVC consumer pods on node1 are still accessing the data somehow, so I will try to back it up from there. I'll also try to cordon node2 and node3, then kill the two consumer pods stuck in ContainerCreating on node3 to schedule them to node1; maybe they can access the data for a rescue, too.
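Roughly this is what I mean (the pod name is the one from the event output above):
$ kubectl cordon node2
$ kubectl cordon node3
$ kubectl delete pod -n media jellyfin-6fb9d8c47b-tcmgp   # the deployment recreates it, and only node1 is schedulable now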
Rescheduling to node1 didn't help, I still can't start those two pods, but I could get the data from the three working pods, which had the devices bound:
node1:~# mount | grep drbd
/dev/drbd1002 on /var/lib/kubelet/pods/9cf051dd-c07b-4bbb-abf1-0d4cd78512a5/volumes/kubernetes.io~csi/pvc-07614359-5f50-4953-b699-97f98db11553/mount type ext4 (rw,noatime,seclabel,discard,stripe=64)
/dev/drbd1004 on /var/lib/kubelet/pods/c8006073-a177-4225-b8af-a499aa13d8fa/volumes/kubernetes.io~csi/pvc-d8709f6e-a078-4091-a445-3c0f7926b3ad/mount type ext4 (rw,noatime,seclabel,discard,stripe=64)
/dev/drbd1003 on /var/lib/kubelet/pods/6d487872-de84-475e-9de4-4b8aeaec322d/volumes/kubernetes.io~csi/pvc-5a9ca974-743d-4593-a9be-f90337505bc0/mount type ext4 (rw,noatime,seclabel,discard,stripe=64)
I could easily cp the data out of them. As a temporary solution, I applied a hostPath volume with the backup folder to those deployments, and pinned the deployments to node1, which has the folder, so they continue working.
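The temporary change to those deployments was roughly this (a sketch; the path is just an example from my host, and the volume name is whatever the deployment already used):
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/hostname: node1              # pin to the node that holds the backed-up data
      volumes:
        - name: config
          hostPath:
            path: /srv/hostpath-pv/radarr/config   # folder the data was copied out to
            type: Directory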
Now I just need the data from the 1000 and 1001 devices. Sadly, these are not mounted by the kubelet on the other nodes, and I can't mount them with the same options: mount -t ext4 -o rw,noatime,seclabel,discard,stripe=64 /dev/drbd1000 /mnt/drbd/
Okay, finally I could do it. I deleted all Piraeus deployments, statefulsets and daemonsets to clean the namespace of every running pod while (presumably) keeping the underlying LVs. I also ran dnf remove drbd90-utils kmod-drbd90 on all nodes, then restarted them. After the reboot, I could manually mount the volumes without a problem with a simple mount /dev/linstor_thinpool/pvc-3d1a3ebb-7a61-4307-b253-1ec13d5c473e_00000 /mnt/drbd/. I copied the data from the volume, then mounted the other one, and it's done. Now all 5 consumer pods are running fine with hostPath PVs, and the data is in good shape, no loss whatsoever.
I'm closing this ticket as it was solved for me, and I hope these notes will be useful for others in the future. Now I'm back to using hostPath volumes, which I wanted to replace with Piraeus in the first place, and although it looks good and easy to use when everything is working fine, I don't know if I can trust DRBD at the bottom layer. I'll think about it, maybe settle on Ceph/Rook or Gluster, I don't know. I really wanted to love this project, the support is responsive here on GitHub, and I wish you luck.
Also, my server is relieved of a huge load and IO delay after uninstalling these things :D
Sorry to hear you struggled with Piraeus and sorry to see you go :/
Thank you for providing detailed instructions on how you solved/worked around the issues!
On the off chance you give it another try: since you used dnf remove kmod-drbd90, there is a chance this is related to your issues. In #134 a user struggled because DRBD was installed via the package manager, which used an incorrect parameter when loading the kernel module. This was noticed only after a node reboot, so it seems similar to your problems.
Oh, I'd really love to stay! :)
So you're saying that maybe the install instructions led to this over the days? https://github.com/piraeusdatastore/piraeus-operator#deployment-with-helm-v3-chart
Prepare the hosts for DRBD deployment. There are several options:
- Install DRBD directly on the hosts as documented.
- Install the appropriate kernel headers package for your distribution and choose the appropriate kernel module injector. Then the operator will compile and load the required modules.
Here are my related Ansible steps:
- name: Install Epelrepo Community
  package:
    name:
      - https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm
    state: latest

- name: Install packages
  package:
    name:
      - drbd90-utils
      - kmod-drbd90
      # - drbd-bash-completion  # pulls in system repo drbd-utils which conflicts with drbd90-utils, and so mixes up install sources
      - kernel-devel    # source files
      - kernel-headers  # header files
      - cryptsetup      # luks support - this won't be enabled in Piraeus, don't know why
    state: latest

- name: Exclude kernel packages from update as a temporary fix for DRBD
  blockinfile:
    dest: /etc/yum.conf
    marker: '# {mark} ANSIBLE MANAGED BLOCK - exclude kernel packages from update'
    block: |
      # Temporary fix for Piraeus->Linstor->DRBD kernel support to stay on v240
      # @see: https://github.com/piraeusdatastore/piraeus-operator/issues/137
      exclude=kernel*
    insertafter: '\[main\]'
    state: present
I think it installed the proper versions as I didn't find the ones mentioned in https://www.linbit.com/drbd-user-guide/drbd-guide-9_0-en/#p-build-install-configure
"There are several options"
So I shouldn't install it on the host; only one of the options is good, and both together are wrong? But if DRBD was not running, then how could these things get messed up? The DRBD service was not enabled and not running on the host.
Well, that is embarrassing :sweat_smile:
Looks like the instructions are incomplete for the host set-up. Basically the problem is the missing usermode_helper=disabled parameter. On first install it will most likely work, as the module is only loaded when the injector image runs, and that takes care of setting this option. On restart, I guess the module gets loaded sooner, now without this parameter set.
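For reference, if DRBD keeps being loaded from the host package, one way to make the parameter stick could be something like this (just a sketch, not verified on your distribution):
$ echo 'options drbd usermode_helper=disabled' > /etc/modprobe.d/drbd.conf
# after the module is (re)loaded, check that the parameter took effect - it should print "disabled":
$ cat /sys/module/drbd/parameters/usermode_helper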
Although, now looking back, it does not seem to be your exact issue, as in your case it looks like DRBD (and LINSTOR) refuses to start on node3. Just something to keep in mind; I'll get around to improving the documentation soon.
Okay, so basically I should try it again with the following host setup:
- name: Install Epelrepo Community
  package:
    name:
      - https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm
    state: latest

- name: Install packages
  package:
    name:
      # - drbd90-utils  # removed: DRBD is handled by the kernel module injector now
      # - kmod-drbd90   # removed
      # - drbd-bash-completion  # pulls in system repo drbd-utils which conflicts with drbd90-utils, and so mixes up install sources
      - kernel-devel    # source files
      - kernel-headers  # header files
      - cryptsetup      # luks support - this won't be enabled in Piraeus, don't know why
    state: latest

- name: Exclude kernel packages from update as a temporary fix for DRBD
  blockinfile:
    dest: /etc/yum.conf
    marker: '# {mark} ANSIBLE MANAGED BLOCK - exclude kernel packages from update'
    block: |
      # Temporary fix for Piraeus->Linstor->DRBD kernel support to stay on v240
      # @see: https://github.com/piraeusdatastore/piraeus-operator/issues/137
      exclude=kernel*
    insertafter: '\[main\]'
    state: present
As the host won't have DRBD, the usermode_helper flag wouldn't be a problem. The kernel source and headers are still needed based on the referenced issue (#134).
Deleted everything from the Piraeus namespace, CRDs, etcd hostpath data, everything, then redeployed the Helm charts but now without the two host packages.
Recreated the five Piraeus PVs through the storage class but didn't assign them to containers, so mount gave me this error at first, just as you predicted:
$ mount /dev/drbd1001 /mnt/drbd
mount: /mnt/drbd: wrong fs type, bad option, bad superblock on /dev/drbd1001, missing codepage or helper program, or other error.
Then I formatted the device without any params (I don't know if Piraeus applies any custom options):
$ mkfs.ext4 /dev/drbd1001
mke2fs 1.45.6 (20-Mar-2020)
Discarding device blocks: done
Creating filesystem with 263102 4k blocks and 65808 inodes
Filesystem UUID: 593fb666-fa83-4424-ab9f-0c5a55b14972
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
Then the mount on node1 went without a problem, and I could copy the data into it:
$ mount /dev/drbd1000 /mnt/drbd
$ cp -a /srv/hostpath-pv/radarr/config/* /mnt/drbd/
$ umount /mnt/drbd
I was a bit worried that the LV Data% was not the same at first, but then I also rebooted all nodes to see if anything goes wrong, and they got to the same percentage; the asterisk marks the PV that I copied into:
node1:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
* pvc-051cfd78-022a-42e3-92be-3066b3ab59ce_00000 linstor_thinpool Vwi-aotz-- 304.00m thinpool 6.13
pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321_00000 linstor_thinpool Vwi-aotz-- 92.00m thinpool 0.07
pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6_00000 linstor_thinpool Vwi-aotz-- 1.00g thinpool 4.80
pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
thinpool linstor_thinpool twi-aotz-- 7.98g 0.83 11.43
node2:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
* pvc-051cfd78-022a-42e3-92be-3066b3ab59ce_00000 linstor_thinpool Vwi-aotz-- 304.00m thinpool 11.06
pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321_00000 linstor_thinpool Vwi-aotz-- 92.00m thinpool 0.07
pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6_00000 linstor_thinpool Vwi-aotz-- 1.00g thinpool 4.80
pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
thinpool linstor_thinpool twi-aotz-- 7.98g 1.02 11.52
node3:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
* pvc-051cfd78-022a-42e3-92be-3066b3ab59ce_00000 linstor_thinpool Vwi-aotz-- 304.00m thinpool 11.06
pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321_00000 linstor_thinpool Vwi-aotz-- 92.00m thinpool 0.07
pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6_00000 linstor_thinpool Vwi-aotz-- 1.00g thinpool 4.80
pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
thinpool linstor_thinpool twi-aotz-- 7.98g 1.02 11.52
# after all nodes reboot:
node1:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
* pvc-051cfd78-022a-42e3-92be-3066b3ab59ce_00000 linstor_thinpool Vwi-aotz-- 304.00m thinpool 11.06
pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321_00000 linstor_thinpool Vwi-aotz-- 92.00m thinpool 0.07
pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6_00000 linstor_thinpool Vwi-aotz-- 1.00g thinpool 4.80
pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
thinpool linstor_thinpool twi-aotz-- 7.98g 1.02 11.52
node2:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
* pvc-051cfd78-022a-42e3-92be-3066b3ab59ce_00000 linstor_thinpool Vwi-aotz-- 304.00m thinpool 11.06
pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321_00000 linstor_thinpool Vwi-aotz-- 92.00m thinpool 0.07
pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6_00000 linstor_thinpool Vwi-aotz-- 1.00g thinpool 4.80
pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
thinpool linstor_thinpool twi-aotz-- 7.98g 1.02 11.52
node3:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
* pvc-051cfd78-022a-42e3-92be-3066b3ab59ce_00000 linstor_thinpool Vwi-aotz-- 304.00m thinpool 11.06
pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321_00000 linstor_thinpool Vwi-aotz-- 92.00m thinpool 0.07
pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6_00000 linstor_thinpool Vwi-aotz-- 1.00g thinpool 4.80
pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8_00000 linstor_thinpool Vwi-aotz-- 52.00m thinpool 0.12
thinpool linstor_thinpool twi-aotz-- 7.98g 1.02 11.52
DRBD seems to do the replication fine, and the host IO delay hasn't moved a bit, which is great in light of last time's 10-100x increase.
How can the percentages differ at first if blocks are replicated as 1:1 copies of each other? What's more interesting is that I copied the data into the LV on node1, and it showed a lower data % than the replicas at first. Then it went up just like the replicas. Maybe the mount or the ext4 journal is in play here? When it got unmounted, some data was flushed from memory on node1 that was already stored on disk on the others?
I'm interested in the percentages as I'd assume they are the same at all times and only differ until replicated, but that should happen almost instantaneously. I had quite a large skew in percentages at https://github.com/piraeusdatastore/piraeus-operator/issues/142#issuecomment-755135973 when the faulty operation was ongoing.
I also use the default replication method, I haven't changed that, so it should be protocol C (synchronous) as far as I remember from the docs.
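If it helps, the effective protocol could presumably be checked from inside a satellite pod (which should have drbd-utils), with something along these lines (resource name taken from the listing above; I haven't verified the exact output format):
$ drbdsetup show --show-defaults pvc-051cfd78-022a-42e3-92be-3066b3ab59ce | grep protocol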
Oh damn, it's starting again. Everything seems to be fine with the pods, except the HA controller, which is restarting as always, but it's not in use since the labels aren't applied.
However, the ns-node pod on node2 (this time it's not node3) shows no MainWorkerPool-1 logs:
kernel-module-injector DRBD module is already loaded
kernel-module-injector DRBD version loaded:
kernel-module-injector version: 9.0.26-1 (api:2/proto:86-118)
kernel-module-injector GIT-hash: 8e0c552326815d9d2bfd1cfd93b23f5692d7109c build by @node2, 2021-01-07 07:57:00
kernel-module-injector Transports (api:16): tcp (9.0.26-1)
kernel-module-injector stream closed
linstor-satellite time="2021-01-07T08:22:59Z" level=info msg="running k8s-await-election" version=refs/tags/v0.2.0
linstor-satellite time="2021-01-07T08:22:59Z" level=info msg="not running with leader election"
linstor-satellite time="2021-01-07T08:22:59Z" level=info msg="starting command '/usr/bin/piraeus-entry.sh' with arguments: '[startSatellite]'"
linstor-satellite LINSTOR, Module Satellite
linstor-satellite Version: 1.11.0 (3367e32d0fa92515efe61f6963767700a8701d98)
linstor-satellite Build time: 2020-12-18T08:40:35+00:00
linstor-satellite Java Version: 11
linstor-satellite Java VM: Debian, Version 11.0.9.1+1-post-Debian-1deb10u2
linstor-satellite Operating system: Linux, Version 4.18.0-240.1.1.el8_3.x86_64
linstor-satellite Environment: amd64, 1 processors, 247 MiB memory reserved for allocations
linstor-satellite System components initialization in progress
linstor-satellite 08:23:00.329 [main] INFO LINSTOR/Satellite - SYSTEM - ErrorReporter DB first time init.
linstor-satellite 08:23:00.331 [main] INFO LINSTOR/Satellite - SYSTEM - Log directory set to: '/var/log/linstor-satellite'
linstor-satellite 08:23:00.358 [main] WARN io.sentry.dsn.Dsn - *** Couldn't find a suitable DSN, Sentry operations will do nothing! See documentation: https://docs.sentry.io/clients/java/ ***
linstor-satellite 08:23:00.365 [Main] INFO LINSTOR/Satellite - SYSTEM - Loading API classes started.
linstor-satellite 08:23:00.739 [Main] INFO LINSTOR/Satellite - SYSTEM - API classes loading finished: 372ms
linstor-satellite 08:23:00.739 [Main] INFO LINSTOR/Satellite - SYSTEM - Dependency injection started.
linstor-satellite WARNING: An illegal reflective access operation has occurred
linstor-satellite WARNING: Illegal reflective access by com.google.inject.internal.cglib.core.$ReflectUtils$1 (file:/usr/share/linstor-server/lib/guice-4.2.3.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],in
linstor-satellite WARNING: Please consider reporting this to the maintainers of com.google.inject.internal.cglib.core.$ReflectUtils$1
linstor-satellite WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
linstor-satellite WARNING: All illegal access operations will be denied in a future release
linstor-satellite 08:23:01.859 [Main] INFO LINSTOR/Satellite - SYSTEM - Dependency injection finished: 1120ms
linstor-satellite 08:23:02.647 [Main] INFO LINSTOR/Satellite - SYSTEM - Removing all res files from /var/lib/linstor.d
linstor-satellite 08:23:02.648 [Main] INFO LINSTOR/Satellite - SYSTEM - Initializing main network communications service
linstor-satellite 08:23:02.649 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'TimerEventService' of type TimerEventService
linstor-satellite 08:23:02.649 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'FileEventService' of type FileEventService
linstor-satellite 08:23:02.649 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'DrbdEventService-1' of type DrbdEventService
linstor-satellite 08:23:02.650 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'DrbdEventPublisher-1' of type DrbdEventPublisher
linstor-satellite 08:23:02.650 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'SnapshotShippingService' of type SnapshotShippingService
linstor-satellite 08:23:02.650 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'DeviceManager' of type DeviceManager
linstor-satellite 08:23:02.665 [Main] WARN LINSTOR/Satellite - SYSTEM - NetComService: Connector NetComService: Binding the socket to the IPv6 anylocal address failed, attempting fallback to IPv4
linstor-satellite 08:23:02.666 [Main] INFO LINSTOR/Satellite - SYSTEM - NetComService started on port /0:0:0:0:0:0:0:0:3366
In comparison, here are the logs of node1:
kernel-module-injector DRBD module is already loaded
kernel-module-injector DRBD version loaded:
kernel-module-injector version: 9.0.26-1 (api:2/proto:86-118)
kernel-module-injector GIT-hash: 8e0c552326815d9d2bfd1cfd93b23f5692d7109c build by @node1, 2021-01-07 07:57:03
kernel-module-injector Transports (api:16): tcp (9.0.26-1)
kernel-module-injector stream closed
linstor-satellite time="2021-01-07T08:22:37Z" level=info msg="running k8s-await-election" version=refs/tags/v0.2.0
linstor-satellite time="2021-01-07T08:22:37Z" level=info msg="not running with leader election"
linstor-satellite time="2021-01-07T08:22:37Z" level=info msg="starting command '/usr/bin/piraeus-entry.sh' with arguments: '[startSatellite]'"
linstor-satellite LINSTOR, Module Satellite
linstor-satellite Version: 1.11.0 (3367e32d0fa92515efe61f6963767700a8701d98)
linstor-satellite Build time: 2020-12-18T08:40:35+00:00
linstor-satellite Java Version: 11
linstor-satellite Java VM: Debian, Version 11.0.9.1+1-post-Debian-1deb10u2
linstor-satellite Operating system: Linux, Version 4.18.0-240.1.1.el8_3.x86_64
linstor-satellite Environment: amd64, 1 processors, 247 MiB memory reserved for allocations
linstor-satellite System components initialization in progress
linstor-satellite 08:22:39.097 [main] INFO LINSTOR/Satellite - SYSTEM - ErrorReporter DB first time init.
linstor-satellite 08:22:39.099 [main] INFO LINSTOR/Satellite - SYSTEM - Log directory set to: '/var/log/linstor-satellite'
linstor-satellite 08:22:39.154 [main] WARN io.sentry.dsn.Dsn - *** Couldn't find a suitable DSN, Sentry operations will do nothing! See documentation: https://docs.sentry.io/clients/java/ ***
linstor-satellite 08:22:39.161 [Main] INFO LINSTOR/Satellite - SYSTEM - Loading API classes started.
linstor-satellite 08:22:39.561 [Main] INFO LINSTOR/Satellite - SYSTEM - API classes loading finished: 399ms
linstor-satellite 08:22:39.561 [Main] INFO LINSTOR/Satellite - SYSTEM - Dependency injection started.
linstor-satellite WARNING: An illegal reflective access operation has occurred
linstor-satellite WARNING: Illegal reflective access by com.google.inject.internal.cglib.core.$ReflectUtils$1 (file:/usr/share/linstor-server/lib/guice-4.2.3.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],in
linstor-satellite WARNING: Please consider reporting this to the maintainers of com.google.inject.internal.cglib.core.$ReflectUtils$1
linstor-satellite WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
linstor-satellite WARNING: All illegal access operations will be denied in a future release
linstor-satellite 08:22:40.781 [Main] INFO LINSTOR/Satellite - SYSTEM - Dependency injection finished: 1220ms
linstor-satellite 08:22:41.661 [Main] INFO LINSTOR/Satellite - SYSTEM - Removing all res files from /var/lib/linstor.d
linstor-satellite 08:22:41.662 [Main] INFO LINSTOR/Satellite - SYSTEM - Initializing main network communications service
linstor-satellite 08:22:41.662 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'TimerEventService' of type TimerEventService
linstor-satellite 08:22:41.663 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'FileEventService' of type FileEventService
linstor-satellite 08:22:41.663 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'DrbdEventService-1' of type DrbdEventService
linstor-satellite 08:22:41.665 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'DrbdEventPublisher-1' of type DrbdEventPublisher
linstor-satellite 08:22:41.665 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'SnapshotShippingService' of type SnapshotShippingService
linstor-satellite 08:22:41.665 [Main] INFO LINSTOR/Satellite - SYSTEM - Starting service instance 'DeviceManager' of type DeviceManager
linstor-satellite 08:22:41.672 [Main] WARN LINSTOR/Satellite - SYSTEM - NetComService: Connector NetComService: Binding the socket to the IPv6 anylocal address failed, attempting fallback to IPv4
linstor-satellite 08:22:41.673 [Main] INFO LINSTOR/Satellite - SYSTEM - NetComService started on port /0:0:0:0:0:0:0:0:3366
linstor-satellite 08:22:49.598 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Controller connected and authenticated (192.168.1.101:51991)
linstor-satellite 08:22:50.783 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node1' created.
linstor-satellite 08:22:50.785 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node2' created.
linstor-satellite 08:22:50.785 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node3' created.
linstor-satellite 08:22:50.789 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Storage pool 'DfltDisklessStorPool' created.
linstor-satellite 08:22:50.790 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Storage pool 'lvm-thin' created.
linstor-satellite 08:22:51.026 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node1'.
linstor-satellite 08:22:51.026 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node2'.
linstor-satellite 08:22:51.026 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node3'.
linstor-satellite 08:22:51.029 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node1'.
linstor-satellite 08:22:51.029 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node2'.
linstor-satellite 08:22:51.029 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node3'.
linstor-satellite 08:22:51.032 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node1'.
linstor-satellite 08:22:51.032 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node2'.
linstor-satellite 08:22:51.032 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node3'.
linstor-satellite 08:22:51.034 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node1'.
linstor-satellite 08:22:51.034 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node2'.
linstor-satellite 08:22:51.034 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node3'.
linstor-satellite 08:22:51.036 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node1'.
linstor-satellite 08:22:51.036 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node2'.
linstor-satellite 08:22:51.036 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node3'.
linstor-satellite 08:22:51.736 [DeviceManager] WARN LINSTOR/Satellite - SYSTEM - Not calling 'systemd-notify' as NOTIFY_SOCKET is null
linstor-satellite 08:24:44.649 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - SpaceTracking: Satellite aggregate capacity is 0 kiB, no errors
linstor-satellite 00:24:44.823 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - SpaceTracking: Satellite aggregate capacity is 0 kiB, no errors
I did not do anything to the nodes or Piraeus or k8s since my last comment (22h ago). The last thing was to copy data to one volume and reboot all nodes as above. The volumes are still not attached to any consumer pods. The volume list was all green yesterday before the reboot; now node2 is in an unknown state:
$ k exec -n piraeus deployment.apps/piraeus-op-cs-controller -- linstor volume list
+----------------------------------------------------------------------------------------------------------------------------------+
| Node | Resource | StoragePool | VolNr | MinorNr | DeviceName | Allocated | InUse | State |
|==================================================================================================================================|
| node1 | pvc-051cfd78-022a-42e3-92be-3066b3ab59ce | lvm-thin | 0 | 1000 | /dev/drbd1000 | 33.62 MiB | Unused | UpToDate |
| node2 | pvc-051cfd78-022a-42e3-92be-3066b3ab59ce | lvm-thin | 0 | 1000 | None | | | Unknown |
| node3 | pvc-051cfd78-022a-42e3-92be-3066b3ab59ce | lvm-thin | 0 | 1000 | /dev/drbd1000 | 33.62 MiB | Unused | UpToDate |
| node1 | pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321 | lvm-thin | 0 | 1003 | /dev/drbd1003 | 65 KiB | Unused | UpToDate |
| node2 | pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321 | lvm-thin | 0 | 1003 | None | | | Unknown |
| node3 | pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321 | lvm-thin | 0 | 1003 | /dev/drbd1003 | 65 KiB | Unused | UpToDate |
| node1 | pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6 | lvm-thin | 0 | 1001 | /dev/drbd1001 | 49.34 MiB | Unused | UpToDate |
| node2 | pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6 | lvm-thin | 0 | 1001 | None | | | Unknown |
| node3 | pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6 | lvm-thin | 0 | 1001 | /dev/drbd1001 | 49.34 MiB | Unused | UpToDate |
| node1 | pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c | lvm-thin | 0 | 1004 | /dev/drbd1004 | 63 KiB | Unused | UpToDate |
| node2 | pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c | lvm-thin | 0 | 1004 | None | | | Unknown |
| node3 | pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c | lvm-thin | 0 | 1004 | /dev/drbd1004 | 63 KiB | Unused | UpToDate |
| node1 | pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8 | lvm-thin | 0 | 1002 | /dev/drbd1002 | 63 KiB | Unused | UpToDate |
| node2 | pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8 | lvm-thin | 0 | 1002 | None | | | Unknown |
| node3 | pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8 | lvm-thin | 0 | 1002 | /dev/drbd1002 | 63 KiB | Unused | UpToDate |
+----------------------------------------------------------------------------------------------------------------------------------+
What more could I do to make this stable?
This may be a dumb question, but have you checked that the IP address of node 2/3 stays the same after a reboot?
Yes, it's the same; they have static addresses configured and there is no DHCP available.
To be honest, I'm not sure what's happening here. It looks like the controller fails to connect to the node, which then means that the node does not bring up the expected DRBD resources. Can you first check the error reports (linstor err l + linstor err show) for anything related to connecting to node2?
After that, you could try running linstor node reconnect node2.
There are no errors, sadly :(
$ k exec -n piraeus deployment.apps/piraeus-op-cs-controller -- linstor err l
+----------------------------------+
| Id | Datetime | Node | Exception |
|==================================|
+----------------------------------+
The state of the volumes is still Unknown today, and reconnect gave me the following warning, so I ran the restore command it suggested:
$ k exec -n piraeus deployment.apps/piraeus-op-cs-controller -- linstor node reconnect node2
WARNING:
Nodes [] are evicted and will not be reconnected. Use node restore <node-name> to reconnect.
$ k exec -n piraeus deployment.apps/piraeus-op-cs-controller -- linstor node restore node2
SUCCESS:
Successfully restored node node2
Now the volume list is all green. Node2's MainWorkerPool-1 logs also appeared after yesterday's last line:
linstor-satellite 08:23:02.666 [Main] INFO LINSTOR/Satellite - SYSTEM - NetComService started on port /0:0:0:0:0:0:0:0:3366
linstor-satellite 06:45:36.265 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Controller connected and authenticated (192.168.1.103:55463)
linstor-satellite 06:45:36.620 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node1' created.
linstor-satellite 06:45:36.621 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node2' created.
linstor-satellite 06:45:36.622 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node3' created.
linstor-satellite 06:45:36.630 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Storage pool 'DfltDisklessStorPool' created.
linstor-satellite 06:45:36.632 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Storage pool 'lvm-thin' created.
linstor-satellite 06:45:36.865 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node1'.
linstor-satellite 06:45:36.865 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node2'.
linstor-satellite 06:45:36.865 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node3'.
linstor-satellite 06:45:36.868 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node1'.
linstor-satellite 06:45:36.869 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node2'.
linstor-satellite 06:45:36.869 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node3'.
linstor-satellite 06:45:36.870 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node1'.
linstor-satellite 06:45:36.871 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node2'.
linstor-satellite 06:45:36.871 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node3'.
linstor-satellite 06:45:36.872 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node1'.
linstor-satellite 06:45:36.872 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node2'.
linstor-satellite 06:45:36.872 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node3'.
linstor-satellite 06:45:36.874 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node1'.
linstor-satellite 06:45:36.874 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node2'.
linstor-satellite 06:45:36.874 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node3'.
linstor-satellite 06:45:37.815 [DeviceManager] WARN LINSTOR/Satellite - SYSTEM - Not calling 'systemd-notify' as NOTIFY_SOCKET is null
linstor-satellite 06:46:25.315 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Controller connected and authenticated (192.168.1.103:43685)
linstor-satellite 06:46:25.408 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node1' created.
linstor-satellite 06:46:25.409 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node2' created.
linstor-satellite 06:46:25.409 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node3' created.
linstor-satellite 06:46:25.410 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Storage pool 'DfltDisklessStorPool' created.
linstor-satellite 06:46:25.410 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Storage pool 'lvm-thin' created.
linstor-satellite 06:46:25.524 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node1'.
linstor-satellite 06:46:25.524 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node2'.
linstor-satellite 06:46:25.524 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node3'.
linstor-satellite 06:46:25.525 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node1'.
linstor-satellite 06:46:25.525 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node2'.
linstor-satellite 06:46:25.525 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node3'.
linstor-satellite 06:46:25.526 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node1'.
linstor-satellite 06:46:25.526 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node2'.
linstor-satellite 06:46:25.526 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node3'.
linstor-satellite 06:46:25.528 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node1'.
linstor-satellite 06:46:25.528 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node2'.
linstor-satellite 06:46:25.528 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node3'.
linstor-satellite 06:46:25.529 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node1'.
linstor-satellite 06:46:25.529 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node2'.
linstor-satellite 06:46:25.529 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node3'.
linstor-satellite 06:46:25.951 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Controller connected and authenticated (192.168.1.103:25268)
linstor-satellite 06:46:26.001 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node1' created.
linstor-satellite 06:46:26.001 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node2' created.
linstor-satellite 06:46:26.001 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Node 'node3' created.
linstor-satellite 06:46:26.002 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Storage pool 'DfltDisklessStorPool' created.
linstor-satellite 06:46:26.002 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Storage pool 'lvm-thin' created.
linstor-satellite 06:46:26.122 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node1'.
linstor-satellite 06:46:26.122 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node2'.
linstor-satellite 06:46:26.122 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-051cfd78-022a-42e3-92be-3066b3ab59ce' created for node 'node3'.
linstor-satellite 06:46:26.124 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node1'.
linstor-satellite 06:46:26.124 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node2'.
linstor-satellite 06:46:26.124 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-2d1ab3e7-7ed6-499b-b508-cb5c6ef4f321' created for node 'node3'.
linstor-satellite 06:46:26.125 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node1'.
linstor-satellite 06:46:26.125 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node2'.
linstor-satellite 06:46:26.125 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-a0e4cdc8-b26f-43df-9115-32b50df427a6' created for node 'node3'.
linstor-satellite 06:46:26.126 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node1'.
linstor-satellite 06:46:26.126 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node2'.
linstor-satellite 06:46:26.126 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-d94b1746-ee9a-4b0b-a4bc-01e194f99b1c' created for node 'node3'.
linstor-satellite 06:46:26.127 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node1'.
linstor-satellite 06:46:26.127 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node2'.
linstor-satellite 06:46:26.127 [MainWorkerPool-1] INFO LINSTOR/Satellite - SYSTEM - Resource 'pvc-f2160f11-fd0e-40b2-ae1f-8459ec1cb3d8' created for node 'node3'.
And lvs reports the same data percentages on every node.
Should I periodically run restore from a cron job every hour or something? :grinning:
It's certainly good to know that that solves the issue... If I may ask, how long do you keep the node offline when testing? I want to reproduce this, and most likely I restart the machine too soon.
I observed it go offline in this comment https://github.com/piraeusdatastore/piraeus-operator/issues/142#issuecomment-756577096 after setting it up the previous morning, so around 22 hours passed between when it was last observed perfectly fine and when it was first observed being offline. Then I restored it in this comment https://github.com/piraeusdatastore/piraeus-operator/issues/142#issuecomment-757107428 after another day of it being offline. I'm on mobile now, so I can't see the exact timestamps, but usually I have time to experiment in the morning before work.
Edit: After writing the above, I got the feeling you meant keeping the k8s node offline when doing a restart, not the LINSTOR node's offline state 😃 If this is the case, I have two answers based on the reboot time:
When the original problem first happened, when DRBD got messed up between the kernel module injector and the OS package, I did a host restart with shutdown -h now. Proxmox takes care of shutting down all the VMs and CTs in an orderly fashion, then I also switched off the PSU for about 10s to power down the BMC unit as well. Then I switched the PSU back on and powered on the machine, and Proxmox restored all virtual servers marked for start on boot to the running state. That is when I started to see the IO delay spiking and all the previous problems: stuck PVC consumer pods, an extremely slow lvs command, and all.
Last time, when everything seemed fine with just using the kernel module injector, I just issued a reboot command over SSH. CentOS is pretty quick to come back interactive, so it takes about 15-20s tops. I rebooted all the nodes, not only the misbehaving one, with a quick ssh nodename -- reboot for all three nodes. I didn't bother doing a k8s cordon/drain/uncordon, just hit SSH reboot, as the pods couldn't have moved anywhere else if all nodes rebooted at the same time. All seemed fine, then the next morning I saw the one volume being in an unknown state.
In the meantime, I also did a host reboot this weekend due to a kernel upgrade, and all Piraeus services came online fine, and it has been fine since then. I don't have a clue why I needed to do that linstor node restore last time and not now.
I was wondering if there is a way to safely mount a Piraeus PV on the host for init or backup access. By safely, I mean without breaking any replication logic provided by LINSTOR/DRBD.
When migrating workloads from host path volumes to Piraeus PVs, it would be great to easily copy existing data into the new PVs, or it could also provide backup access. I think backup would be less problematic as it is just reading data; initializing the volumes, on the other hand, involves lots of write operations.
I tried to mount the new logical volumes in multiple ways, without success.
As a workaround, I mounted both the new Piraeus PV and the old host path PV into the pod under migration, and copied the old data to the new PV within the pod's shell. Then I removed the old host path PV and redeployed the pod using only the new Piraeus PV. I think this workaround satisfies the "safe" requirement, as all LV access is made through the CSI; however, it is a bit of a hassle to do it every time one needs to copy something to or from the LV.
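For example, with both volumes mounted in the pod under migration, the actual copy boils down to something like this (the deployment name and mount paths are placeholders):
$ kubectl exec -it deploy/<app> -- sh -c 'cp -a /old-hostpath-data/. /new-piraeus-data/'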
Isn't there an easier but still safe way to achieve the same from one of the k8s nodes or from the machine hosting the nodes?