piraeusdatastore / piraeus-operator

The Piraeus Operator manages LINSTOR clusters in Kubernetes.
https://piraeus.io/
Apache License 2.0

Piraeus won't start on CentOS Stream #137

Closed immanuelfodor closed 1 year ago

immanuelfodor commented 3 years ago

I've upgraded my k8s nodes from CentOS 8 to CentOS Stream using the officially recommended commands: https://www.centos.org/centos-stream/ There seemed to be only minor changes in package versions, nothing serious.

After rebooting the nodes, some of Piraeus' internal services won't start up and are stuck in a constant crash loop. The problem seems to be in the kernel-module-injector container.

I've attached all the logs I thought might be relevant to help diagnose this issue. Please advise if I should enable further debug options for Piraeus (and how).

$ k get all
NAME                                             READY   STATUS                  RESTARTS   AGE
pod/piraeus-op-cs-controller-cfb475c85-cngdm     1/1     Running                 3          47m
pod/piraeus-op-csi-controller-6fb7f7c5d6-hmspq   6/6     Running                 11         51m
pod/piraeus-op-csi-node-c94w2                    3/3     Running                 12         5d23h
pod/piraeus-op-csi-node-mcvw6                    3/3     Running                 10         5d23h
pod/piraeus-op-csi-node-vk9nj                    3/3     Running                 11         5d23h
pod/piraeus-op-etcd-0                            1/1     Running                 3          5d23h
pod/piraeus-op-etcd-1                            1/1     Running                 1          66m
pod/piraeus-op-etcd-2                            1/1     Running                 1          57m
pod/piraeus-op-ns-node-7lqmk                     0/1     Init:CrashLoopBackOff   5          7m7s
pod/piraeus-op-ns-node-djmtm                     0/1     Init:CrashLoopBackOff   5          7m10s
pod/piraeus-op-ns-node-wlnsj                     0/1     Init:CrashLoopBackOff   5          7m8s
pod/piraeus-op-operator-7466ddd49c-bbkgm         1/1     Running                 6          58m

NAME                      TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)             AGE
service/piraeus-op-cs     ClusterIP   10.43.60.86   <none>        3370/TCP            18d
service/piraeus-op-etcd   ClusterIP   None          <none>        2380/TCP,2379/TCP   18d

NAME                                 DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/piraeus-op-csi-node   3         3         3       3            3           <none>          18d
daemonset.apps/piraeus-op-ns-node    3         3         0       3            0           <none>          18d

NAME                                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/piraeus-op-cs-controller    1/1     1            1           18d
deployment.apps/piraeus-op-csi-controller   1/1     1            1           18d
deployment.apps/piraeus-op-operator         1/1     1            1           18d

NAME                                                   DESIRED   CURRENT   READY   AGE
replicaset.apps/piraeus-op-cs-controller-54b4444965    0         0         0       5d23h
replicaset.apps/piraeus-op-cs-controller-7fb6c98656    0         0         0       18d
replicaset.apps/piraeus-op-cs-controller-cfb475c85     1         1         1       5d23h
replicaset.apps/piraeus-op-csi-controller-565c954d87   0         0         0       18d
replicaset.apps/piraeus-op-csi-controller-6fb7f7c5d6   1         1         1       51m
replicaset.apps/piraeus-op-operator-7466ddd49c         1         1         1       5d23h
replicaset.apps/piraeus-op-operator-7bc4759c9d         0         0         0       18d

NAME                               READY   AGE
statefulset.apps/piraeus-op-etcd   3/3     18d

NAME                                     COMPLETIONS   DURATION   AGE
job.batch/linstor-etcd-rke-node1-chown   1/1           2s         18d
job.batch/linstor-etcd-rke-node2-chown   1/1           2s         18d
job.batch/linstor-etcd-rke-node3-chown   1/1           1s         18d
job.batch/piraeus-op-test-cs-svc         1/1           38s        18d

$ k logs daemonset.apps/piraeus-op-ns-node kernel-module-injector --previous
Found 3 pods, using pod/piraeus-op-ns-node-djmtm
Need a git checkout to regenerate drbd/.drbd_git_revision
make[1]: Entering directory '/tmp/pkg/drbd-9.0.25-1/drbd'

    Calling toplevel makefile of kernel source tree, which I believe is in
    KDIR=/lib/modules/4.18.0-257.el8.x86_64/build

make -C /lib/modules/4.18.0-257.el8.x86_64/build   M=/tmp/pkg/drbd-9.0.25-1/drbd  modules
  COMPAT  alloc_workqueue_takes_fmt
  COMPAT  before_4_13_kernel_read
  COMPAT  blkdev_issue_zeroout_discard
  COMPAT  drbd_release_returns_void
  COMPAT  genl_policy_in_ops
  COMPAT  have_SHASH_DESC_ON_STACK
  COMPAT  have_WB_congested_enum
  COMPAT  have_allow_kernel_signal
  COMPAT  have_atomic_dec_if_positive_linux
  COMPAT  have_atomic_in_flight
  COMPAT  have_bd_claim_by_disk
  COMPAT  have_bd_unlink_disk_holder
  COMPAT  have_bio_bi_bdev
  COMPAT  have_bio_bi_error
  COMPAT  have_bio_bi_opf
  COMPAT  have_bio_bi_status
  COMPAT  have_bio_clone_fast
  COMPAT  have_bio_flush
  COMPAT  have_bio_free
  COMPAT  have_bio_op_shift
  COMPAT  have_bio_rw
  COMPAT  have_bio_set_op_attrs
  COMPAT  have_bio_start_io_acct
  COMPAT  have_bioset_create_front_pad
  COMPAT  have_bioset_init
  COMPAT  have_bioset_need_bvecs
  COMPAT  have_blk_check_plugged
  COMPAT  have_blk_qc_t_make_request
  COMPAT  have_blk_queue_flag_set
  COMPAT  have_blk_queue_make_request
  COMPAT  have_blk_queue_merge_bvec
  COMPAT  have_blk_queue_plugged
  COMPAT  have_blk_queue_split_q_bio
  COMPAT  have_blk_queue_split_q_bio_bioset
  COMPAT  have_blk_queue_write_cache
  COMPAT  have_blkdev_get_by_path
  COMPAT  have_d_inode
  COMPAT  have_file_inode
  COMPAT  have_generic_start_io_acct_q_rw_sect_part
  COMPAT  have_generic_start_io_acct_rw_sect_part
  COMPAT  have_genl_family_parallel_ops
  COMPAT  have_ib_cq_init_attr
  COMPAT  have_ib_get_dma_mr
  COMPAT  have_idr_alloc
  COMPAT  have_idr_is_empty
  COMPAT  have_inode_lock
  COMPAT  have_ktime_to_timespec64
  COMPAT  have_kvfree
  COMPAT  have_max_send_recv_sge
  COMPAT  have_netlink_cb_portid
  COMPAT  have_nla_nest_start_noflag
  COMPAT  have_nla_parse_deprecated
  COMPAT  have_nla_put_64bit
  COMPAT  have_part_stat_h
  COMPAT  have_pointer_backing_dev_info
  COMPAT  have_prandom_u32
  COMPAT  have_proc_create_single
  COMPAT  have_ratelimit_state_init
  COMPAT  have_rb_augment_functions
  COMPAT  have_refcount_inc
  COMPAT  have_req_hardbarrier
  COMPAT  have_req_noidle
  COMPAT  have_req_nounmap
  COMPAT  have_req_op_write
  COMPAT  have_req_op_write_same
  COMPAT  have_req_op_write_zeroes
  COMPAT  have_req_prio
  COMPAT  have_req_write
  COMPAT  have_req_write_same
  COMPAT  have_security_netlink_recv
  COMPAT  have_shash_desc_zero
  COMPAT  have_signed_nla_put
  COMPAT  have_simple_positive
  COMPAT  have_struct_bvec_iter
  COMPAT  have_struct_kernel_param_ops
  COMPAT  have_struct_size
  COMPAT  have_time64_to_tm
  COMPAT  have_timer_setup
  COMPAT  have_void_make_request
  COMPAT  hlist_for_each_entry_has_three_parameters
  COMPAT  ib_alloc_pd_has_2_params
  COMPAT  ib_device_has_ops
  COMPAT  ib_post_send_const_params
  COMPAT  ib_query_device_has_3_params
  COMPAT  kmap_atomic_page_only
  COMPAT  need_make_request_recursion
  COMPAT  queue_limits_has_discard_zeroes_data
  COMPAT  rdma_create_id_has_net_ns
  COMPAT  sock_create_kern_has_five_parameters
  COMPAT  sock_ops_returns_addr_len
  UPD     /tmp/pkg/drbd-9.0.25-1/drbd/compat.4.18.0-257.el8.x86_64.h
  UPD     /tmp/pkg/drbd-9.0.25-1/drbd/compat.h
./drbd-kernel-compat/gen_compat_patch.sh: line 12: spatch: command not found
./drbd-kernel-compat/gen_compat_patch.sh: line 45: hash: spatch: not found
  INFO: no suitable spatch found; trying spatch-as-a-service;
  be patient, may take up to 10 minutes
  if it is in the server side cache it might only take a second
  SPAAS    1c20515525cffc698b58b76a5d936660
Successfully connected to SPAAS ('d35a4b17210dab1336de2725b997f300e9acd297')
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  5591  100   772    0  4819   8674  54146 --:--:-- --:--:-- --:--:-- 62820
  You can create a new .tgz including this pre-computed compat patch
  by calling "make unpatch ; echo drbd-9.0.25-1/drbd/drbd-kernel-compat/cocci_cache/1c20515525cffc698b58b76a5d936660/compat.patch >>.filelist ; make tgz"
  PATCH
patching file drbd_sender.c
patching file drbd_debugfs.c
patching file drbd_receiver.c
  CC [M]  /tmp/pkg/drbd-9.0.25-1/drbd/drbd_dax_pmem.o
  CC [M]  /tmp/pkg/drbd-9.0.25-1/drbd/drbd_debugfs.o
  CC [M]  /tmp/pkg/drbd-9.0.25-1/drbd/drbd_bitmap.o
  CC [M]  /tmp/pkg/drbd-9.0.25-1/drbd/drbd_proc.o
  CC [M]  /tmp/pkg/drbd-9.0.25-1/drbd/drbd_sender.o
  CC [M]  /tmp/pkg/drbd-9.0.25-1/drbd/drbd_receiver.o
  CC [M]  /tmp/pkg/drbd-9.0.25-1/drbd/drbd_req.o
  CC [M]  /tmp/pkg/drbd-9.0.25-1/drbd/drbd_actlog.o
  CC [M]  /tmp/pkg/drbd-9.0.25-1/drbd/lru_cache.o
  CC [M]  /tmp/pkg/drbd-9.0.25-1/drbd/drbd_main.o
/tmp/pkg/drbd-9.0.25-1/drbd/drbd_main.c: In function 'drbd_create_device':
/tmp/pkg/drbd-9.0.25-1/drbd/drbd_main.c:3713:6: error: implicit declaration of function 'blk_alloc_queue'; did you mean 'blk_alloc_queue_rh'? [-Werror=implicit-function-declaration]
  q = blk_alloc_queue(drbd_make_request, NUMA_NO_NODE);
      ^~~~~~~~~~~~~~~
      blk_alloc_queue_rh
/tmp/pkg/drbd-9.0.25-1/drbd/drbd_main.c:3713:4: warning: assignment to 'struct request_queue *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
  q = blk_alloc_queue(drbd_make_request, NUMA_NO_NODE);
    ^
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:316: /tmp/pkg/drbd-9.0.25-1/drbd/drbd_main.o] Error 1
make[2]: *** [Makefile:1545: _module_/tmp/pkg/drbd-9.0.25-1/drbd] Error 2
make[1]: Leaving directory '/tmp/pkg/drbd-9.0.25-1/drbd'
make[1]: *** [Makefile:132: kbuild] Error 2
make: *** [Makefile:135: module] Error 2

Could not find the expexted *.ko, see stderr for more details

$ k version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.1", GitCommit:"c4d752765b3bbac2237bf87cf0b1c2e307844666", GitTreeState:"clean", BuildDate:"2020-12-18T12:09:25Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-11T13:09:17Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}

$ docker version
Client: Docker Engine - Community
 Version:           20.10.1
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        831ebea
 Built:             Tue Dec 15 04:34:30 2020
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.1
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       f001486
  Built:            Tue Dec 15 04:32:21 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.3
  GitCommit:        269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc:
  Version:          1.0.0-rc92
  GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

$ cat /etc/centos-release        
CentOS Stream release 8

$ cat /etc/os-release 
NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Stream 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"

$ dnf install -y kmod-drbd90 drbd90-utils
Last metadata expiration check: 1:14:10 ago on Sat 19 Dec 2020 07:45:54 AM CET.
Package kmod-drbd90-9.0.25-2.el8_3.elrepo.x86_64 is already installed.
Package drbd90-utils-9.13.1-1.el8.elrepo.x86_64 is already installed.
Dependencies resolved.
Nothing to do.
Complete!

$ dnf info kmod-drbd90 drbd90-utils
Last metadata expiration check: 1:15:29 ago on Sat 19 Dec 2020 07:45:54 AM CET.
Installed Packages
Name         : drbd90-utils
Version      : 9.13.1
Release      : 1.el8.elrepo
Architecture : x86_64
Size         : 5.8 M
Source       : drbd90-utils-9.13.1-1.el8.elrepo.src.rpm
Repository   : @System
From repo    : elrepo
Summary      : Management utilities for DRBD
URL          : http://www.drbd.org/
License      : GPLv2+
Description  : DRBD mirrors a block device over the network to another machine.
             : Think of it as networked raid 1. It is a building block for
             : setting up high availability (HA) clusters.
             : 
             : This packages includes the DRBD administration tools and integration
             : scripts for heartbeat, pacemaker, rgmanager and xen.

Name         : kmod-drbd90
Version      : 9.0.25
Release      : 2.el8_3.elrepo
Architecture : x86_64
Size         : 1.3 M
Source       : kmod-drbd90-9.0.25-2.el8_3.elrepo.src.rpm
Repository   : @System
From repo    : elrepo
Summary      : drbd90 kernel module(s)
URL          : http://www.drbd.org/
License      : GPLv2
Description  : DRBD is a distributed replicated block device. It mirrors a
             : block device over the network to another machine. Think of it
             : as networked raid 1. It is a building block for setting up
             : high availability (HA) clusters.
             : This package provides the drbd90 kernel module(s).
             : It is built to depend upon the specific ABI provided by a range of releases
             : of the same variant of the Linux kernel and not on any one specific build.

$ git log
commit 4ee8b6e6a556cb64877a966bd857050b00834caa (HEAD -> master, origin/master, origin/HEAD)
Author: Moritz "WanzenBug" Wanzenböck <...>
Date:   Wed Nov 18 13:59:08 2020 +0100

    Prepare next dev cycle

commit 5068780fda8ce603a6ea32ee70b57b4e6b4e1f23 (tag: v1.2.0)
Author: Moritz "WanzenBug" Wanzenböck <...>
Date:   Wed Nov 18 13:58:51 2020 +0100

    Release v1.2.0

Would upgrading the Piraeus git checkout and redeploying solve it, maybe? There have been no releases in the meantime, so I'm running the latest version, 1.2.0.

I've also seen https://github.com/piraeusdatastore/piraeus-operator/issues/134, but the files mentioned there do not exist on the nodes, and the make error seems to be different.

$ cat /sys/module/drbd/parameters/usermode_helper
cat: /sys/module/drbd/parameters/usermode_helper: No such file or directory

$ cat /etc/modprobe.d/drbd.conf
cat: /etc/modprobe.d/drbd.conf: No such file or directory

JoelColledge commented 3 years ago

It appears that Red Hat have made some changes in kernel version 4.18.0-257 which are not yet covered by the DRBD compatibility layer. In particular, they have removed the function blk_alloc_queue. I suggest you downgrade to a kernel version supported by DRBD, such as 4.18.0-240.
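A quick way to check a node against this advice is to compare the build number in the kernel release string with the last build reported as working. The following is a minimal sketch, not an official compatibility test: the cutoff of 240 is taken from this comment only, and the helper names `kernel_build` and `drbd_build_likely_ok` are hypothetical.

```shell
#!/bin/sh
# Extract the RHEL build number (the digits after "4.18.0-") from a kernel
# release string. Prints nothing for releases that are not 4.18.0-based.
kernel_build() {
    printf '%s\n' "$1" | sed -n 's/^4\.18\.0-\([0-9][0-9]*\).*/\1/p'
}

# Assumption: 240 is the newest build mentioned as supported in this thread.
MAX_BUILD=240

# Succeeds only if the release is 4.18.0-based and its build is <= MAX_BUILD.
drbd_build_likely_ok() {
    build="$(kernel_build "$1")"
    [ -n "$build" ] && [ "$build" -le "$MAX_BUILD" ]
}

release="$(uname -r)"
if drbd_build_likely_ok "$release"; then
    echo "$release: at or below build $MAX_BUILD"
else
    echo "$release: newer than build $MAX_BUILD or not a 4.18.0 kernel; the DRBD build may fail"
fi
```

The check only reads `uname -r`, so it is safe to run on every node before deciding whether a downgrade is needed.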

immanuelfodor commented 3 years ago

CentOS Stream is a rolling-release distro like Arch, so based on my experience with Arch it might not be easy 😃 but I'll look into it, thanks. I've never tried that under Stream. I do indeed have the 4.18.0-257.el8.x86_64 kernel.

Do you have any info about when we can expect DRBD catching up with the changes? Or is there any related issue I could follow?

JoelColledge commented 3 years ago

I don't have any experience with CentOS Stream yet, so I can't give you any pointers there.

> Do you have any info about when we can expect DRBD catching up with the changes? Or is there any related issue I could follow?

I'm not aware of anyone working on this yet. You could ask on the drbd-user mailing list. Since DRBD is open source, you could fix it yourself 😃 Otherwise it will probably wait until a LINBIT customer asks for it.

immanuelfodor commented 3 years ago

I've found this message in the archive (https://lists.linbit.com/pipermail/drbd-user/2020-December/025770.html) that lists the supported kernels and replies to a user running a different one:

So, in conclusion, you have 2 options:

a) Use one of the kernels we already support
b) Figure out how to have DRBD build for your kernel yourself (it's not fun, take my word for it)
c) Become a LINBIT customer and we will gladly do it for you :)

It seems I really need to figure out the kernel downgrade, otherwise DRBD (and with it Piraeus) is dead for me and anyone else using an up-to-date CentOS Stream 😞 I really hope a customer with a fleet of Stream servers steps in as our savior 😀

immanuelfodor commented 3 years ago

It turns out downgrading itself isn't that hard, but it's a dead end, because no previous kernel is available on the new distro:

$ dnf downgrade kernel
Last metadata expiration check: 1:49:50 ago on Mon 21 Dec 2020 12:37:35 PM CET.
Package kernel of lowest version already installed, cannot downgrade it.
Dependencies resolved.
Nothing to do.
Complete!

JoelColledge commented 3 years ago

That's interesting, because there are older kernel versions mentioned here: https://wiki.centos.org/Manuals/ReleaseNotes/CentOSStream

As a temporary fix, you could just manually install the packages from CentOS 8.

immanuelfodor commented 3 years ago

Yesss, that's it, now it compiles fine just as before. Thank you! The command I used:

dnf install \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-core-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-modules-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-tools-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-headers-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-devel-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-tools-libs-4.18.0-240.1.1.el8_3.x86_64.rpm

Then reboot. Now I just need to make sure I don't update the kernel packages.
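Since all seven URLs differ only in the package name, the list can be generated instead of typed out. This is a small sketch that reuses the mirror URL, kernel version, and package names from the command in this comment; the function name `kernel_rpm_urls` is hypothetical, and whether the mirror still serves these files is not guaranteed.

```shell
#!/bin/sh
# Values copied from the install command in this comment.
BASE_URL="http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages"
KVER="4.18.0-240.1.1.el8_3"
PACKAGES="kernel kernel-core kernel-modules kernel-tools kernel-headers kernel-devel kernel-tools-libs"

# Print one RPM URL per kernel package.
kernel_rpm_urls() {
    for pkg in $PACKAGES; do
        printf '%s/%s-%s.x86_64.rpm\n' "$BASE_URL" "$pkg" "$KVER"
    done
}

# Show the URLs; nothing is installed by this script.
kernel_rpm_urls

# To install, pass the list to dnf as root:
#   dnf install $(kernel_rpm_urls)
```

Keeping the version in a single `KVER` variable makes it easier to switch to another supported build later.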

If you agree, I'd keep this issue open for visibility of the problem and the workaround; it could also serve as a notification channel when DRBD is fixed upstream.

JoelColledge commented 3 years ago

DRBD will probably work with this particular 4.18.0-257 kernel at some point, since the breaking changes will end up in a RHEL kernel, which LINBIT supports. However, this problem will happen regularly with Stream, whenever a breaking change is introduced. It is unlikely that Stream will ever be supported by LINBIT, because it is too much work to keep up with the rolling release kernels. I doubt anyone else will put in this work either.

Hence I would say we should leave this ticket open until there's some documentation explaining what to do about CentOS Stream. The choices are to either explain how to install a stable kernel from elsewhere on Stream, or just say that Piraeus doesn't support Stream and recommend one of the other stable RHEL-downstream CentOS clones. Rocky Linux and Project Lenix are two projects to provide such a distribution. We'll have to see how this develops.

immanuelfodor commented 3 years ago

For anyone bumping into this later, these notes might help until DRBD supports kernels newer than v240.

I accidentally updated the nodes with an Ansible role, then tried to reinstall kernel v240 with the above command (https://github.com/piraeusdatastore/piraeus-operator/issues/137#issuecomment-749001555), but dnf said it was already installed, and uname showed the node still running v259 after a reboot.

$ dnf list --installed | grep kernel
kernel.x86_64                                 4.18.0-240.1.1.el8_3                    @@commandline    
kernel.x86_64                                 4.18.0-257.el8                          @baseos          
kernel.x86_64                                 4.18.0-259.el8                          @baseos          
kernel-core.x86_64                            4.18.0-240.1.1.el8_3                    @@commandline    
kernel-core.x86_64                            4.18.0-257.el8                          @baseos          
kernel-core.x86_64                            4.18.0-259.el8                          @baseos          
kernel-devel.x86_64                           4.18.0-240.1.1.el8_3                    @@commandline    
kernel-devel.x86_64                           4.18.0-257.el8                          @baseos          
kernel-devel.x86_64                           4.18.0-259.el8                          @baseos          
kernel-headers.x86_64                         4.18.0-240.1.1.el8_3                    @@commandline    
kernel-modules.x86_64                         4.18.0-240.1.1.el8_3                    @@commandline    
kernel-modules.x86_64                         4.18.0-257.el8                          @baseos          
kernel-modules.x86_64                         4.18.0-259.el8                          @baseos          
kernel-tools.x86_64                           4.18.0-240.1.1.el8_3                    @@commandline    
kernel-tools-libs.x86_64                      4.18.0-240.1.1.el8_3                    @@commandline

Removing the old versions manually results in an error:

$ dnf remove kernel-4.18.0-259.el8 kernel-core-4.18.0-259.el8 kernel-devel-4.18.0-259.el8 kernel-modules-4.18.0-259.el8 kernel-4.18.0-257.el8 kernel-core-4.18.0-257.el8 kernel-devel-4.18.0-257.el8 kernel-modules-4.18.0-257.el8
Error: 
 Problem: The operation would result in removing the following protected packages: kernel-core
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)

I tried removing the kernel metapackages, but the system still boots the newest kernel:

$ dnf remove kernel-4.18.0-259.el8 kernel-4.18.0-257.el8
...

$ dnf list --installed | grep kernel
kernel.x86_64                                 4.18.0-240.1.1.el8_3                    @@commandline    
kernel-core.x86_64                            4.18.0-240.1.1.el8_3                    @@commandline    
kernel-core.x86_64                            4.18.0-257.el8                          @baseos          
kernel-core.x86_64                            4.18.0-259.el8                          @baseos          
kernel-devel.x86_64                           4.18.0-240.1.1.el8_3                    @@commandline    
kernel-devel.x86_64                           4.18.0-257.el8                          @baseos          
kernel-devel.x86_64                           4.18.0-259.el8                          @baseos          
kernel-headers.x86_64                         4.18.0-240.1.1.el8_3                    @@commandline    
kernel-modules.x86_64                         4.18.0-240.1.1.el8_3                    @@commandline    
kernel-modules.x86_64                         4.18.0-257.el8                          @baseos          
kernel-modules.x86_64                         4.18.0-259.el8                          @baseos          
kernel-tools.x86_64                           4.18.0-240.1.1.el8_3                    @@commandline    
kernel-tools-libs.x86_64                      4.18.0-240.1.1.el8_3                    @@commandline

$ reboot

$ uname -r
4.18.0-259.el8.x86_64
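One likely reason, stated here as an assumption rather than something confirmed in this thread, is that the boot loader default still points at the newest installed kernel. On EL8 the `grubby` tool can select the default entry without removing any packages. The sketch below only prints the target unless `APPLY=1` is set, which is a convention made up for this example, as is the helper name `vmlinuz_path`.

```shell
#!/bin/sh
# Hypothetical helper: build the kernel image path for a release string.
vmlinuz_path() {
    printf '/boot/vmlinuz-%s\n' "$1"
}

target="$(vmlinuz_path 4.18.0-240.1.1.el8_3.x86_64)"
echo "Default boot kernel would be set to: $target"

# Only change the boot loader when explicitly requested, when grubby is
# available, and when the kernel image actually exists. Run as root.
if [ "${APPLY:-0}" = "1" ] && command -v grubby >/dev/null 2>&1 && [ -e "$target" ]; then
    grubby --set-default "$target"
    # Print the resulting default so it can be verified before rebooting.
    grubby --default-kernel
fi
```

This leaves the newer kernels installed, so a later kernel update may move the default again.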

Adding --nobest --skip-broken to the dnf remove still results in the protected package error, so I tried to remove them with rpm:

# the two kernel metapackages were already uninstalled, so run the uncommented command instead of the following: 
# rpm -e kernel-4.18.0-259.el8 kernel-core-4.18.0-259.el8 kernel-devel-4.18.0-259.el8 kernel-modules-4.18.0-259.el8 kernel-4.18.0-257.el8 kernel-core-4.18.0-257.el8 kernel-devel-4.18.0-257.el8 kernel-modules-4.18.0-257.el8
$ rpm -e kernel-core-4.18.0-259.el8 kernel-devel-4.18.0-259.el8 kernel-modules-4.18.0-259.el8 kernel-core-4.18.0-257.el8 kernel-devel-4.18.0-257.el8 kernel-modules-4.18.0-257.el8

$ dnf list --installed | grep kernel
kernel.x86_64                                 4.18.0-240.1.1.el8_3                    @@commandline    
kernel-core.x86_64                            4.18.0-240.1.1.el8_3                    @@commandline    
kernel-devel.x86_64                           4.18.0-240.1.1.el8_3                    @@commandline    
kernel-headers.x86_64                         4.18.0-240.1.1.el8_3                    @@commandline    
kernel-modules.x86_64                         4.18.0-240.1.1.el8_3                    @@commandline    
kernel-tools.x86_64                           4.18.0-240.1.1.el8_3                    @@commandline    
kernel-tools-libs.x86_64                      4.18.0-240.1.1.el8_3                    @@commandline

$ reboot

$ uname -r
4.18.0-240.1.1.el8_3.x86_64

And so it was fixed again. I emailed the drbd-user mailing list a few days ago asking for support for kernels newer than v240, but nobody has replied so far. I really hope this gets a permanent fix soon.

Until that happens, I've excluded all kernel packages from yum/dnf updates as a more stable temporary solution:

$ dnf check-update
Last metadata expiration check: 1:52:32 ago on Sat 26 Dec 2020 09:36:12 AM CET.

kernel.x86_64                                                                                            4.18.0-259.el8                                                                                               baseos       
kernel-core.x86_64                                                                                       4.18.0-259.el8                                                                                               baseos       
kernel-devel.x86_64                                                                                      4.18.0-259.el8                                                                                               baseos       
kernel-headers.x86_64                                                                                    4.18.0-259.el8                                                                                               baseos       
kernel-modules.x86_64                                                                                    4.18.0-259.el8                                                                                               baseos       
kernel-tools.x86_64                                                                                      4.18.0-259.el8                                                                                               baseos       
kernel-tools-libs.x86_64                                                                                 4.18.0-259.el8                                                                                               baseos       
Obsoleting Packages
kernel-headers.x86_64                                                                                    4.18.0-259.el8                                                                                               baseos       
    kernel-headers.x86_64                                                                                4.18.0-240.1.1.el8_3                                                                                         @@commandline

$ cat /etc/yum.conf 
[main]
# BEGIN ANSIBLE MANAGED BLOCK - exclude kernel packages from update
# Temporary fix for Piraeus->Linstor->DRBD kernel support to stay on v240
# @see: https://github.com/piraeusdatastore/piraeus-operator/issues/137
exclude=kernel*
# END ANSIBLE MANAGED BLOCK - exclude kernel packages from update
...

$ dnf check-update
Last metadata expiration check: 3:18:25 ago on Sat 26 Dec 2020 09:36:12 AM CET.
$ # no kernel updates with the exclude present
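To confirm on each node that the exclude is really in place (for example after a configuration management run), a small check like the following can help. It is a sketch under assumptions: the helper name `has_kernel_exclude` is hypothetical, and it only looks for an `exclude=` or `excludepkgs=` line mentioning `kernel` in a single file, not in repo-specific configuration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given dnf/yum config file contains an
# exclude (or excludepkgs) line that mentions kernel packages.
has_kernel_exclude() {
    grep -Eq '^[[:space:]]*exclude(pkgs)?[[:space:]]*=.*kernel' "$1"
}

# Check the file given as first argument, defaulting to /etc/yum.conf.
conf="${1:-/etc/yum.conf}"
if [ -r "$conf" ] && has_kernel_exclude "$conf"; then
    echo "$conf: kernel packages are excluded from updates"
else
    echo "$conf: no kernel exclude found"
fi
```

The check is read-only, so it can run unprivileged as part of a node health script.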

WanzenBug commented 1 year ago

I guess this issue has long been resolved.