Closed immanuelfodor closed 1 year ago
It appears that Red Hat has made some changes in kernel version 4.18.0-257 which are not yet covered by the DRBD compatibility layer. In particular, they have removed the function blk_alloc_queue. I suggest you downgrade to a kernel version supported by DRBD, such as 4.18.0-240.
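A quick way to check whether a node is affected is to compare the build number of its running kernel with the last build mentioned here as supported. This is only a minimal sketch, assuming release strings of the form 4.18.0-<build>.<rest>; the helper names and the default threshold of 240 are illustrative and not part of DRBD or Piraeus, and builds between 240 and 257 were not checked in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names and the 240 threshold are assumptions.

# Print the build number of a kernel release string,
# e.g. 4.18.0-257.el8.x86_64 -> 257
kernel_build() {
  local release="$1"
  local rest="${release#*-}"   # strip up to the first '-': 257.el8.x86_64
  printf '%s\n' "${rest%%.*}"  # keep the text before the first '.': 257
}

# Succeed if the build number is at or below the given maximum (default 240).
is_supported_kernel() {
  local release="$1" max_build="${2:-240}"
  [ "$(kernel_build "$release")" -le "$max_build" ]
}

# Usage on a node:
#   is_supported_kernel "$(uname -r)" || echo "kernel is newer than the assumed limit"
```

The threshold is a parameter so it can be raised once a newer kernel is known to build.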
Based on my experience with Arch, and since CentOS Stream is also a rolling-release distro, it might not be easy 😃 but I'll look into it, thanks. I've never tried that under Stream. I do indeed have the 4.18.0-257.el8.x86_64 kernel.
Do you have any info about when we can expect DRBD catching up with the changes? Or is there any related issue I could follow?
I don't have any experience with CentOS Stream yet, so I can't give you any pointers there.
> Do you have any info about when we can expect DRBD catching up with the changes? Or is there any related issue I could follow?
I'm not aware of anyone working on this yet. You could ask on the drbd-user mailing list. Since DRBD is open source, you could fix it yourself :smiley: Otherwise it will probably wait until a LINBIT customer asks for it.
I've found this message in the archive (https://lists.linbit.com/pipermail/drbd-user/2020-December/025770.html) that lists the supported kernels and answers a user who runs a different one:
So, in conclusion, you have 2 options:
a) Use one of the kernels we already support
b) Figure out how to have DRBD build for your kernel yourself (it's not fun, take my word for it)
c) Become a LINBIT customer and we will gladly do it for you :)
It seems I really need to figure out the kernel downgrade; otherwise, DRBD->Piraeus is dead for me and for anyone using an up-to-date CentOS Stream :disappointed: I really hope a customer steps in with a fleet of Stream servers as our savior :grinning:
It turns out that running the downgrade is not hard, but it goes nowhere: no previous kernel is available on the new distro:
$ dnf downgrade kernel
Last metadata expiration check: 1:49:50 ago on Mon 21 Dec 2020 12:37:35 PM CET.
Package kernel of lowest version already installed, cannot downgrade it.
Dependencies resolved.
Nothing to do.
Complete!
That's interesting, because there are older kernel versions mentioned here: https://wiki.centos.org/Manuals/ReleaseNotes/CentOSStream
As a temporary fix, you could just manually install the packages from CentOS 8.
Yesss, that's it, now it compiles fine just as before. Thank you! The command I used:
dnf install \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-core-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-modules-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-tools-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-headers-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-devel-4.18.0-240.1.1.el8_3.x86_64.rpm \
  http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/kernel-tools-libs-4.18.0-240.1.1.el8_3.x86_64.rpm
Then reboot. Now I just need to make sure I don't update the kernel packages.
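Since all seven URLs differ only in the package name, the list can be generated from a single version string. The following is a sketch under that assumption; the function name is made up for illustration, and the base URL and version are simply the ones from the command in this comment (the sketch does not check whether the mirror still serves these files).

```shell
#!/usr/bin/env bash
# Sketch: print the seven kernel package URLs for one version string.
# kernel_rpm_urls is a hypothetical helper, not an existing tool.

kernel_rpm_urls() {
  local version="$1"
  local base="http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages"
  local pkg
  for pkg in kernel kernel-core kernel-modules kernel-tools \
             kernel-headers kernel-devel kernel-tools-libs; do
    printf '%s/%s-%s.rpm\n' "$base" "$pkg" "$version"
  done
}

# Usage (as root); word splitting of the URL list is intentional here:
#   dnf install $(kernel_rpm_urls 4.18.0-240.1.1.el8_3.x86_64)
```

Keeping the version in one place makes it harder to end up with mismatched kernel, kernel-devel and kernel-headers packages.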
If you agree, I'd keep this issue open for visibility of the problem and the workaround. It could also serve as a notification channel when DRBD is fixed upstream.
DRBD will probably work with this particular 4.18.0-257 kernel at some point, since the breaking changes will end up in a RHEL kernel, which LINBIT supports. However, this problem will happen regularly with Stream, whenever a breaking change is introduced. It is unlikely that Stream will ever be supported by LINBIT, because it is too much work to keep up with the rolling release kernels. I doubt anyone else will put in this work either.
Hence I would say we should leave this ticket open until there's some documentation explaining what to do about CentOS Stream. The choices are to either explain how to install a stable kernel from elsewhere on Stream, or just say that Piraeus doesn't support Stream and recommend one of the other stable RHEL-downstream CentOS clones. Rocky Linux and Project Lenix are two projects to provide such a distribution. We'll have to see how this develops.
For anyone bumping into this later, these notes might help until DRBD supports v240+ kernels.
I accidentally updated the nodes with an Ansible role, then tried to reinstall kernel v240 with the above command (https://github.com/piraeusdatastore/piraeus-operator/issues/137#issuecomment-749001555), but dnf says it's already installed, and uname says the node is still running v259 after a reboot.
$ dnf list --installed | grep kernel
kernel.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel.x86_64 4.18.0-257.el8 @baseos
kernel.x86_64 4.18.0-259.el8 @baseos
kernel-core.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-core.x86_64 4.18.0-257.el8 @baseos
kernel-core.x86_64 4.18.0-259.el8 @baseos
kernel-devel.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-devel.x86_64 4.18.0-257.el8 @baseos
kernel-devel.x86_64 4.18.0-259.el8 @baseos
kernel-headers.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-modules.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-modules.x86_64 4.18.0-257.el8 @baseos
kernel-modules.x86_64 4.18.0-259.el8 @baseos
kernel-tools.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-tools-libs.x86_64 4.18.0-240.1.1.el8_3 @@commandline
Removing the unwanted v257 and v259 packages manually with dnf results in an error:
$ dnf remove kernel-4.18.0-259.el8 kernel-core-4.18.0-259.el8 kernel-devel-4.18.0-259.el8 kernel-modules-4.18.0-259.el8 kernel-4.18.0-257.el8 kernel-core-4.18.0-257.el8 kernel-devel-4.18.0-257.el8 kernel-modules-4.18.0-257.el8
Error:
Problem: The operation would result in removing the following protected packages: kernel-core
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
I tried removing only the kernel metapackages, but the node still boots the newest kernel:
$ dnf remove kernel-4.18.0-259.el8 kernel-4.18.0-257.el8
...
$ dnf list --installed | grep kernel
kernel.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-core.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-core.x86_64 4.18.0-257.el8 @baseos
kernel-core.x86_64 4.18.0-259.el8 @baseos
kernel-devel.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-devel.x86_64 4.18.0-257.el8 @baseos
kernel-devel.x86_64 4.18.0-259.el8 @baseos
kernel-headers.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-modules.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-modules.x86_64 4.18.0-257.el8 @baseos
kernel-modules.x86_64 4.18.0-259.el8 @baseos
kernel-tools.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-tools-libs.x86_64 4.18.0-240.1.1.el8_3 @@commandline
$ reboot
$ uname -r
4.18.0-259.el8.x86_64
Adding --nobest --skip-broken to the dnf remove still results in the protected package error, so I tried to remove them with rpm:
# the two kernel metapackages were already uninstalled, so run the uncommented command instead of the following:
# rpm -e kernel-4.18.0-259.el8 kernel-core-4.18.0-259.el8 kernel-devel-4.18.0-259.el8 kernel-modules-4.18.0-259.el8 kernel-4.18.0-257.el8 kernel-core-4.18.0-257.el8 kernel-devel-4.18.0-257.el8 kernel-modules-4.18.0-257.el8
$ rpm -e kernel-core-4.18.0-259.el8 kernel-devel-4.18.0-259.el8 kernel-modules-4.18.0-259.el8 kernel-core-4.18.0-257.el8 kernel-devel-4.18.0-257.el8 kernel-modules-4.18.0-257.el8
$ dnf list --installed | grep kernel
kernel.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-core.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-devel.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-headers.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-modules.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-tools.x86_64 4.18.0-240.1.1.el8_3 @@commandline
kernel-tools-libs.x86_64 4.18.0-240.1.1.el8_3 @@commandline
$ reboot
$ uname -r
4.18.0-240.1.1.el8_3.x86_64
And so it was fixed again. I sent an email to the drbd-user mailing list a few days ago asking for support for kernels above v240, but nobody has replied yet. I really hope this will get a permanent fix soon.
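An alternative to removing the newer packages, not tried in this thread, would be to leave them installed and point the default boot entry at v240 with grubby, the boot-entry tool shipped with RHEL/CentOS 8. This is a minimal sketch, assuming the kernel image lives at /boot/vmlinuz-<release> as on a stock EL8 install; the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only grubby itself is an existing tool here.

# Map a kernel release (as printed by uname -r) to its image path,
# assuming the stock /boot/vmlinuz-<release> layout.
kernel_image_path() {
  printf '/boot/vmlinuz-%s\n' "$1"
}

# Make the given release the default boot entry (run as root).
set_default_kernel() {
  local image
  image="$(kernel_image_path "$1")"
  # Refuse to continue if the image is missing, so grubby is never
  # pointed at a kernel that is not installed.
  [ -e "$image" ] || { echo "no such kernel image: $image" >&2; return 1; }
  grubby --set-default "$image"
}

# Usage:
#   set_default_kernel 4.18.0-240.1.1.el8_3.x86_64
#   grubby --default-kernel   # prints the image path of the current default
#   reboot
```

A later kernel update may switch the default entry again, so this does not replace keeping kernel packages out of updates.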
Until that happens, I've excluded all kernel packages from yum/dnf update for a more stable temporary solution:
$ dnf check-update
Last metadata expiration check: 1:52:32 ago on Sat 26 Dec 2020 09:36:12 AM CET.
kernel.x86_64 4.18.0-259.el8 baseos
kernel-core.x86_64 4.18.0-259.el8 baseos
kernel-devel.x86_64 4.18.0-259.el8 baseos
kernel-headers.x86_64 4.18.0-259.el8 baseos
kernel-modules.x86_64 4.18.0-259.el8 baseos
kernel-tools.x86_64 4.18.0-259.el8 baseos
kernel-tools-libs.x86_64 4.18.0-259.el8 baseos
Obsoleting Packages
kernel-headers.x86_64 4.18.0-259.el8 baseos
kernel-headers.x86_64 4.18.0-240.1.1.el8_3 @@commandline
$ cat /etc/yum.conf
[main]
# BEGIN ANSIBLE MANAGED BLOCK - exclude kernel packages from update
# Temporary fix for Piraeus->Linstor->DRBD kernel support to stay on v240
# @see: https://github.com/piraeusdatastore/piraeus-operator/issues/137
exclude=kernel*
# END ANSIBLE MANAGED BLOCK - exclude kernel packages from update
...
$ dnf check-update
Last metadata expiration check: 3:18:25 ago on Sat 26 Dec 2020 09:36:12 AM CET.
$ # no kernel updates with the exclude present
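For nodes that are not managed by Ansible, the same exclude line can be added with a small idempotent script. This is a sketch, assuming GNU sed and a config file with a [main] section and no other exclude= line; the function name is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: add "exclude=kernel*" under [main] in a dnf/yum config file, once.
# add_kernel_exclude is a hypothetical helper; it assumes GNU sed.

add_kernel_exclude() {
  local conf="$1"
  # Already excluded: nothing to do, so repeated runs add no duplicates.
  if grep -q '^exclude=.*kernel\*' "$conf"; then
    return 0
  fi
  # Append the exclude line directly after the [main] header.
  sed -i '/^\[main\]/a exclude=kernel*' "$conf"
}

# Usage (as root):
#   add_kernel_exclude /etc/yum.conf
#   dnf check-update   # kernel packages should no longer be listed
```

When a kernel update is wanted on purpose, dnf's --disableexcludes=all option should bypass the exclude for a single command without editing the file.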
I guess this issue has long been resolved.
I've upgraded my k8s nodes from CentOS 8 to CentOS Stream with the officially recommended commands: https://www.centos.org/centos-stream/. Seemingly, there were only minor changes in package versions, nothing serious.
After rebooting the nodes, some of Piraeus' internal services wouldn't start up and are in a constant crash loop. It seems there is a problem with the kernel-module-injector container. I've attached all logs I could think of as relevant to let you solve this issue. Please advise if I should enable further debug options for Piraeus (and how).
Would upgrading from the Piraeus git repo and redeploying solve it, maybe? As there have been no releases in the meantime, I'm running the latest version, 1.2.0.
I've also seen https://github.com/piraeusdatastore/piraeus-operator/issues/134 and the mentioned files do not exist on the nodes, and also the make error seems to be different.