xetys / hetzner-kube

A CLI tool for provisioning kubernetes clusters on Hetzner Cloud
Apache License 2.0
746 stars 116 forks

Ceph requires a kernel >= 4.7 to be fully functional #63

Open Baughn opened 6 years ago

Baughn commented 6 years ago

See https://github.com/rook/rook/issues/1044

This should have been fixed, so perhaps we're installing a too-old version of Ceph/Rook, but in any case I was unable to mount a filesystem using rook-toolbox without first upgrading the kernel.

Upgrading to linux-image-virtual-hwe-16.04 / linux-headers-virtual-hwe-16.04 fixes it, but putting that (and the necessary reboot) in cloud-init makes cluster create fail. It would be good if that could be handled better.

xetys commented 6 years ago

What do you mean by "cluster create fails" after the kernel update and reboot? Are there any other approaches to handle this properly?

Baughn commented 6 years ago

I need a newer kernel than the image comes with in order to mount Ceph filesystems, which means the machine needs to be rebooted... but putting a reboot command in cloud-init causes cluster creation to fail, reasonably enough.

The workaround is to put just the install command there, then manually reboot the cluster afterwards. That's inconvenient, though.
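The manual reboot step could be scripted along these lines. This is only a sketch: `retry` and `reboot_and_wait` are hypothetical helpers, not part of hetzner-kube, and it assumes root SSH access to each node with key-based login. The attempt counts and delays are arbitrary.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND...
# Run COMMAND until it succeeds or ATTEMPTS tries have been used up.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# reboot_and_wait HOST (hypothetical helper)
# Reboot one node over SSH, then poll until SSH answers again.
reboot_and_wait() {
    host="$1"
    # The connection drops during the reboot, so a non-zero exit is expected here.
    ssh -o BatchMode=yes "root@$host" reboot || true
    sleep 10  # give the node time to go down before polling starts
    retry 30 10 ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true
}

# Example (node IPs passed as script arguments, assumed to be known):
# for node in "$@"; do
#     reboot_and_wait "$node" || echo "node $node did not come back" >&2
# done
```

Rebooting the nodes one at a time, as in the commented loop, keeps the rest of the cluster reachable while each node restarts.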

xetys commented 6 years ago

So, if this is the only way, I think we should consider performing the kernel upgrade when rook is installed for the first time and the detected kernel version is lower than 4.7. This would mean: install the kernel, reboot the nodes, wait until they are reachable again, and then run the current install steps.
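A rough sketch of the version check this would need, written as a plain shell function rather than as code from the tool. The name `kernel_older_than` is made up for illustration, and the minimum is assumed to be given as `major.minor` (for example `4.7`).

```shell
#!/bin/sh
# kernel_older_than RELEASE MINIMUM
# RELEASE is a kernel release string such as "4.4.0-116-generic" (as printed by uname -r).
# MINIMUM must have the form "major.minor", e.g. "4.7".
# Returns 0 when RELEASE is older than MINIMUM, non-zero otherwise.
kernel_older_than() {
    cur_major=${1%%.*}            # "4.4.0-116-generic" -> "4"
    rest=${1#*.}                  # -> "4.0-116-generic"
    cur_minor=${rest%%[!0-9]*}    # -> "4"
    min_major=${2%%.*}
    min_minor=${2#*.}
    [ "$cur_major" -lt "$min_major" ] ||
        { [ "$cur_major" -eq "$min_major" ] && [ "$cur_minor" -lt "$min_minor" ]; }
}

if kernel_older_than "$(uname -r)" "4.7"; then
    echo "kernel $(uname -r) is older than 4.7: upgrade and reboot before installing rook"
else
    echo "kernel $(uname -r) is new enough"
fi
```

Comparing major and minor numerically matters here: a plain string comparison would rank `4.10` below `4.7`.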

I think someone should do this after #47 is resolved. I've already started the work on this

Baughn commented 6 years ago

In principle the kernel upgrade shouldn't be needed. I'm not sure why it is.

pierreozoux commented 6 years ago

Also, we'll need an upgrade command (for k8s and the underlying OS). Once we have this command, this issue would be more or less solved. The only remaining question is whether we should perform the upgrade before the install.

As a side note, using Ubuntu's livepatch feature to avoid rebooting after a kernel update requires an Enterprise subscription...

xetys commented 6 years ago

Update: I have an idea of what is wrong with kernel 4.4. There are several issues with MDS, RBD, etc., such as incomplete folders (especially in CephFS). However,

apt install linux-image-4.10.0-28-generic linux-headers-4.10.0-28-generic && reboot

fixes the problem. The question is whether and how to implement this in the tool, or whether to just document it and close this issue.

eliasp commented 6 years ago

It might be easier to rebase everything onto Ubuntu 18.04 (which should be done at some point anyway) instead of trying to get such an old kernel working…

xetys commented 6 years ago

Actually I don't know how k8s itself is behaving on u18. I remember when u16 was released it had a lot of issues at the beginning. I should try that out when the e2e test suite is there

Elias Probst notifications@github.com wrote on Fri, 11 May 2018, 18:08:

It might be easier to rebase everything onto Ubuntu 18.04 instead of trying to get such an old Kernel working…
