puppetlabs / puppetlabs-kubernetes

This module installs and configures a Kubernetes cluster
Apache License 2.0

General fixes #625

Closed r-tierney closed 1 year ago

r-tierney commented 1 year ago

This was tested on Debian bookworm with Kubernetes versions 1.26 and 1.27 and Calico v3.25.

UdpIdleTimeout has been deprecated:

Kubernetes has moved the registry to:

Calico requires a v before the version number; without it you get a 404. Example:

The container-runtime setting (remote) has been deprecated, as remote was the only possible value.

The discovery token hash from kubetool didn't work; I found that I needed to change rsa to pkey. As the output from two different clusters shows, the command with rsa gives the same result on both, while the command with pkey gives a different hash per cluster:

kubec01-rya.ops:/home/users/ryant# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
kubec01-rya.ops:/home/users/ryant# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl pkey -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
adcc248bb5ab39eb750a85f941ccfc6bfd1eef133a5aca57989ccda0eacedbdd
kubec01-rya.ops:/home/users/ryant#
kubec01-san:/home/users/ryant# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
kubec01-san:/home/users/ryant# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl pkey -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
84dcf2e02d0c93f1cfb2fc7af1f7757dcd204c26a6ca286a620a2bd8dc6796c2
kubec01-san:/home/users/ryant#
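
A note on the output above: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 is the SHA-256 digest of empty input. That suggests the rsa step wrote nothing to the pipe on either cluster (its error message is hidden by 2>/dev/null), so both clusters hashed the same empty stream. One possible cause, not confirmed in this thread, is a CA key that is not an RSA key. The sketch below only reproduces the empty-input hash; sha256_hex is a hypothetical helper name, and the sha256sum fallback is an assumption for hosts without openssl.

```shell
# Hash stdin the same way the last two stages of the discovery-token pipeline do.
sha256_hex() {
  if command -v openssl >/dev/null 2>&1; then
    # "openssl dgst" prints "<label>= <hex>"; sed keeps only the hex digest.
    openssl dgst -sha256 -hex | sed 's/^.* //'
  else
    # Fallback (assumption): coreutils sha256sum prints "<hex>  -".
    sha256sum | cut -d' ' -f1
  fi
}

# An empty stream stands in for the output of a failed "openssl rsa" step.
empty_hash=$(printf '' | sha256_hex)
echo "$empty_hash"
```

Dropping 2>/dev/null from the original pipeline while debugging would show the rsa error directly instead of a plausible-looking hash.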

I found that on Debian the kubelet would constantly crash, because the kubelet's default cgroupDriver on Debian is set to systemd:

ryant@kubec01-san:~$ cat /var/lib/kubelet/config.yaml | grep -i cgroup
cgroupDriver: systemd
ryant@kubec01-san:~$

while this module's default sets containerd's cgroup_driver to cgroupfs when it's not running on Red Hat (found in init.pp). The fix for Debian, as recommended by the Kubernetes reference (should this just be the default for both Debian and Red Hat now?):

class { '::kubernetes':
    cgroup_driver => 'systemd',
}

The above change sets the following in containerd's config, which lets the kubelet and containerd work together on Debian:

ryant@kubec01-san:~$ cat /etc/containerd/config.toml | grep -i 'plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options' -A 1
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
ryant@kubec01-san:~$
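
The two grep checks above can be combined into one comparison. This is a rough sketch rather than anything shipped with the module: check_cgroup_drivers is a hypothetical helper name, the default paths are the ones shown above, and it assumes that a containerd config without SystemdCgroup = true means cgroupfs.

```shell
# Report whether the kubelet and containerd agree on the cgroup driver.
# Both paths are parameters so the check can be run against copies of the files.
check_cgroup_drivers() {
  kubelet_cfg=${1:-/var/lib/kubelet/config.yaml}
  containerd_cfg=${2:-/etc/containerd/config.toml}

  # kubelet side: take the value of the first "cgroupDriver:" line.
  kubelet_driver=$(sed -n 's/^[[:space:]]*cgroupDriver:[[:space:]]*//p' "$kubelet_cfg" | head -n 1)

  # containerd side: "SystemdCgroup = true" means systemd;
  # anything else is treated as cgroupfs (assumption).
  if grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$containerd_cfg"; then
    containerd_driver=systemd
  else
    containerd_driver=cgroupfs
  fi

  if [ "$kubelet_driver" = "$containerd_driver" ]; then
    echo "match: $kubelet_driver"
  else
    echo "mismatch: kubelet=$kubelet_driver containerd=$containerd_driver"
    return 1
  fi
}
```

Running check_cgroup_drivers with no arguments on a node reads the live files. A mismatch is the situation described above, where the kubelet keeps crashing.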

And lastly, Calico required the mount to be shared. Error reported:

Apr 28 17:00:56 kubec01-san kubelet[11687]: E0428 17:00:56.926812   11687 remote_runtime.go:302] "CreateContainer in sandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to generate container \"866d1ce88b3fc72e4a455ec485dfee0e3ef7cdeb84bc8146dfb949d7c87ce267\" spec: failed to generate spec: path \"/sys/fs/\" is mounted on \"/sys\" but it is not a shared mount" podSandboxID="85df3c3dfa95f1878a4b328f93cbeb021a1df366fdc8214440cd64d44451a364"

The fix (this solution requires the Puppet module mount_core):

  # Calico requires the mount to be shared
  mount { "/" : atboot => yes, options => "rshared", name => "/", ensure => mounted, remounts => true, pass => "0" } ~>
  exec { "/usr/bin/mount --make-rshared /" : refreshonly => true }
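
To confirm whether a mount is shared before and after applying the resource above, one option is to read /proc/self/mountinfo, where a shared mount carries a shared:N tag in its optional fields. A minimal sketch, assuming a Linux host; is_shared_mount is a hypothetical helper name, and the second argument exists only so the check can be pointed at a saved copy of the file.

```shell
# Exit 0 if the given mount point is a shared mount, 1 otherwise.
is_shared_mount() {
  mount_point=$1
  mountinfo=${2:-/proc/self/mountinfo}
  awk -v mp="$mount_point" '
    # Field 5 is the mount point; the optional fields start at field 7
    # and run up to the "-" separator.
    $5 == mp {
      found = 1
      shared = 0
      for (i = 7; i <= NF && $i != "-"; i++)
        if ($i ~ /^shared:/) shared = 1
    }
    # If several entries share a mount point, the last one listed wins.
    END { exit !(found && shared) }
  ' "$mountinfo"
}
```

For example, `is_shared_mount / && echo "/ is shared"` prints the message only if the root mount is shared.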

Fixes https://github.com/puppetlabs/puppetlabs-kubernetes/issues/584

puppet-community-rangefinder[bot] commented 1 year ago

kubernetes is a class

Breaking changes to this file MAY impact these 5 modules (near match):
* [jetstack-fluent_bit](https://github.com/jetstack/tarmak/tree/master/puppet/modules/fluent_bit)
* [jetstack-tarmak](https://github.com/jetstack/tarmak/tree/master/puppet/modules/tarmak)
* [jetstack-prometheus](https://github.com/jetstack/tarmak/tree/master/puppet/modules/prometheus)
* [jetstack-calico](https://github.com/jetstack/tarmak/tree/master/puppet/modules/calico)
* [jetstack-kubernetes_addons](https://github.com/jetstack/tarmak/tree/master/puppet/modules/kubernetes_addons)

This module is declared in 0 of 580 indexed public Puppetfiles.


These results were generated with Rangefinder, a tool that helps predict the downstream impact of breaking changes to elements used in Puppet modules. You can run this on the command line to get a full report.

Exact matches are those that we can positively identify via namespace and the declaring modules' metadata. Non-namespaced items, such as Puppet 3.x functions, will always be reported as near matches only.

CLAassistant commented 1 year ago

CLA assistant check
All committers have signed the CLA.

r-tierney commented 1 year ago

Discovery token fixed in: https://github.com/puppetlabs/puppetlabs-kubernetes/pull/627/commits/1871b5cc6ef1771cbb0d3c287ada675193c4f5fe

Using pkey instead of rsa

jordanbreen28 commented 1 year ago

@r-tierney this is brilliant - can you rebase the PR?

r-tierney commented 1 year ago

Thanks @jordanbreen28, I've updated this branch from main

jordanbreen28 commented 1 year ago

@r-tierney apologies... I missed the notification. Can you rebase once again and clean up the merge commits? Then we can get this progressed. Thanks

r-tierney commented 1 year ago

@jordanbreen28 Sure thing, updating now

r-tierney commented 1 year ago

rebase complete

jordanbreen28 commented 1 year ago

Nice one @r-tierney - I'll merge in once green! thanks again for this massive effort.

r-tierney commented 1 year ago

Not a problem at all, glad to help.

The issue which took the longest to troubleshoot was actually this module's default setting for cgroup_driver, located on line 741 of init.pp. It was set to cgroupfs by default instead of systemd, which conflicts with the kubelet, as the kubelet's default setting on Debian is systemd.

With the pods and kubelet crashlooping, it took some time to work out that this was the issue: without the kubelet running, it's hard to run a kubectl describe etc. to figure out why a pod is crashlooping.

I understand that changing the default for a setting like this may break those not running systemd, so I left it out of this pull request. I thought I'd mention it anyway and leave the decision up to your team whether to change it or add a mention in some docs somewhere.

jordanbreen28 commented 1 year ago

Yeah, the removal of cgroupfs as the default driver would need to be part of a major release due to the high possibility it may break things; we would need to document this also. Systemd is now the recommended driver for both Debian- and RHEL-based distros, so this should probably be progressed in the next major release.

If you want to go ahead and create a separate PR for that, I will try to ensure it's included in the next major release (which should be in the next week or two due to Puppet 8).

Anyways, happy to merge this! 🥇