Closed davidnuzik closed 4 years ago
@ShylajaDevadiga I have assigned this issue to you for now. This will require some testing and discovery. We need to identify any/all CentOS issues that prevent us from formally supporting CentOS in our next release. Work with me as needed.
As a reminder we must support IPv6 as well.
iptables on CentOS 8 is now legacy; the distribution uses iptables-nft instead.
So on a CentOS 8 system, running iptables gives you this:
[root@mouse-r13 ~]# iptables -L -v -n
Chain INPUT (policy ACCEPT 154K packets, 264M bytes)
pkts bytes target prot opt in out source destination
Chain FORWARD (policy ACCEPT 957 packets, 53559 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 58311 packets, 10M bytes)
pkts bytes target prot opt in out source destination
# Warning: iptables-legacy tables present, use iptables-legacy to see them
[root@mouse-r13 ~]# iptables -t nat -L -v -n
Chain PREROUTING (policy ACCEPT 990 packets, 73523 bytes)
pkts bytes target prot opt in out source destination
Chain INPUT (policy ACCEPT 59 packets, 3564 bytes)
pkts bytes target prot opt in out source destination
Chain POSTROUTING (policy ACCEPT 693 packets, 45283 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 620 packets, 39433 bytes)
pkts bytes target prot opt in out source destination
# Warning: iptables-legacy tables present, use iptables-legacy to see them
and I believe firewalld is also not supported by k3s.
This doesn't mean k3s does not work, it's just not possible to see the iptables rules.
+1
Related #401 #1019
I have been documenting all the steps I needed to get it working on CentOS 7, and I'll gladly share them. It worked out of the box on a Google Cloud VM but not on a local, freshly installed instance. The steps were, namely: installing iptables, removing firewalld, wiping out the reject rules from the INPUT and FORWARD chains, and installing semanage. My procedure is a little heavy-handed, though.
Reader's Digest version: don't use CentOS 8 because of the nft/legacy iptables problems.
So, to help clarify: in my research, RHEL/CentOS 8 uses nft for iptables, not legacy iptables. At this time, nft is not supported by Kubernetes. There is iptables/iptables-legacy support, so the rules are still created and executed, but on RHEL/CentOS 8 they do not live in harmony with any other nft/iptables rules, unless theirs is the only ruleset you want to run.
You cannot see these iptables rule sets by default, since they live in the legacy iptables tables (the container ships its own iptables binaries, not the nft ones), and RHEL/CentOS 8 does not provide the legacy iptables tools.
There are other distributions heading towards using nft instead of iptables, but so far, it appears that they do include the legacy iptables binaries.
This means that until nft support is in Kubernetes (not k3s), RHEL/CentOS 8 and other distributions using nftables are not truly supported.
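To tell which backend a host's iptables binary talks to, the version string can be inspected. The following is a minimal sketch, not something k3s ships: the function name iptables_backend is made up for illustration, and the sample strings are examples rather than output captured from the machines in this thread. It relies on iptables 1.8 and later printing the backend in parentheses, while older releases such as the 1.4.21 reported further down print no suffix.

```shell
# Classify a string as printed by `iptables --version`.
# iptables 1.8+ appends "(nf_tables)" or "(legacy)"; older releases print
# no suffix and only have the legacy backend.
# iptables_backend is a hypothetical helper name.
iptables_backend() {
    case "$1" in
        *"(nf_tables)"*) echo "nf_tables" ;;
        *) echo "legacy" ;;
    esac
}

# On a live host (requires iptables to be installed):
#   iptables_backend "$(iptables --version)"
# Example strings, for illustration only:
iptables_backend "iptables v1.8.4 (nf_tables)"
iptables_backend "iptables v1.4.21"
```

If the host reports nf_tables while the rules were written by a legacy binary inside a container, the empty listings shown earlier in the thread are what you would expect to see.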
Well, maybe it's not so bad for RHEL/Centos8..
@philipsparrow I would be interested in the steps needed to make k3s work on CentOS 7.7. Even after a fresh install, removing firewalld, disabling SELinux, installing iptables-services, and adding "user_namespace.enable=1" to the kernel command line, k3s is still not working... It looks like a network issue, as the API server isn't reachable.
@sraillard I wrote a step-by-step here, let me know if it works for you: https://github.com/rancher/k3s/issues/1019#issuecomment-593043089
I don't think I have anything as good as @Lohann has provided, I got it working with only the following steps (caveat: I don't need Traefik so haven't worked on that):
systemctl stop firewalld
systemctl disable firewalld
yum update
yum install -y iptables-services policycoreutils-python
systemctl start iptables
systemctl enable iptables
grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
iptables -F
# This gets rid of any DROP rules in the INPUT and FORWARD chains
iptables-save > /etc/sysconfig/iptables
reboot now
Then I installed K3s with no special options.
FYI in my debugging, I found it enormously helpful to check both routes and firewall. Sometimes I was missing routes. ip a and ip route are your friends. From my cluster (single master, 2 worker nodes, flannel VXLAN) I expect to see routes that look like:
default via 10.126.126.1 dev eth0 proto dhcp metric 100
10.42.0.0/24 dev cni0 proto kernel scope link src 10.42.0.1
10.42.1.0/24 via 10.42.1.0 dev flannel.1 onlink
10.42.2.0/24 via 10.42.2.0 dev flannel.1 onlink
10.126.126.1 dev eth0 proto dhcp scope link metric 100
10.126.126.3 dev eth0 proto kernel scope link src 10.126.126.3 metric 100
I hope this helps
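A quick way to check for the missing-routes case described above is to count the routes that go through the flannel VXLAN device. This is only a sketch: count_flannel_routes is a hypothetical helper name, the sample text is taken from the route listing above, and the device name flannel.1 is assumed to match your setup.

```shell
# Count routes in `ip route` output (read on stdin) that use the
# flannel.1 device. count_flannel_routes is a hypothetical helper name.
count_flannel_routes() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev" && $(i + 1) == "flannel.1") n++ }
         END { print n + 0 }'
}

# Sample taken from the route listing above.
sample_routes='default via 10.126.126.1 dev eth0 proto dhcp metric 100
10.42.0.0/24 dev cni0 proto kernel scope link src 10.42.0.1
10.42.1.0/24 via 10.42.1.0 dev flannel.1 onlink
10.42.2.0/24 via 10.42.2.0 dev flannel.1 onlink'

printf '%s\n' "$sample_routes" | count_flannel_routes
# On a live node: ip route | count_flannel_routes
```

On the three-node cluster described above, each node would be expected to show one flannel.1 route per other node, so a lower count points at the routing side rather than the firewall.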
Thank you @Lohann and @philipsparrow, I was able to make it work.
I'm not sure where the black magic is, I have installed the policycoreutils-python package and I have saved the iptables configuration once cleaned (I think I was missing that step).
To clean all iptables tables, I have used:
iptables -F
iptables -F -t nat
iptables -F -t mangle
That ought to do it, but check that on reboot your iptables rules aren't re-populated. Saving the configuration with iptables-save > /etc/sysconfig/iptables worked wonders for me.
I think the general idea here for iptables rules is to remove any DROP from the INPUT and FORWARD chains
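Rather than flushing everything blindly, it can help to first list which rules would actually block traffic. The sketch below filters iptables-save output for DROP or REJECT rules (and DROP policies) in the INPUT and FORWARD chains. The helper name find_blocking_rules and the sample ruleset are illustrative assumptions, not output from any host in this thread.

```shell
# Print lines from `iptables-save` output (read on stdin) that DROP or
# REJECT traffic in the INPUT or FORWARD chains, plus DROP chain policies.
# find_blocking_rules is a hypothetical helper name.
find_blocking_rules() {
    grep -E '^(-A (INPUT|FORWARD) .*-j (DROP|REJECT)|:(INPUT|FORWARD) DROP)'
}

# Illustrative ruleset, not captured from a real host.
sample_rules='*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT'

printf '%s\n' "$sample_rules" | find_blocking_rules
# On a live host (requires root): iptables-save | find_blocking_rules
```

Running the same filter again after a reboot is a cheap way to confirm that the saved configuration did not bring the blocking rules back.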
We are still planning support in the v1.17.x scope; however, this is not going to make it into v1.17.4+k3s1. It will likely be in the next release.
@Lohann @philipsparrow, I was able to make k3s work on CentOS, but I am facing the following problem.
I have 3 master nodes with external etcd (no worker nodes), and I have deployed an admission controller on this k3s cluster. What I have observed is that the k3s server takes a long time (more than 1 minute) to connect to the admission controller service if its pod is running on another k3s server host.
It seems that connecting from a k3s host to a pod running on a different k3s host via the ClusterIP (not the pod IP) is slow; if the pod is running on the same k3s host, there is no problem.
@parekhha Does this issue happen only when this configuration is on CentOS? It doesn't sound like an OS issue (but I'm no expert).
@philipsparrow It got resolved after I updated the kernel version.
If you install k3s on CentOS 7, the executables get written to /usr/local/bin. However, I installed it as root, and /usr/local/bin is normally not in root's $PATH variable. Maybe that is something to consider as well.
Also, I'd like to add: as CentOS 7 doesn't seem to work for now, one could hack together a Vagrantfile (with CentOS 7) or something similar to have something to test with.
@Loki-Afro That is not true; /usr/local/bin is in root's $PATH by default. Here is $PATH from a fresh installation of CentOS 7: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
I've managed to install k3s on CentOS 7 (single node/master installation). The only problem I have (after a node reboot) is with iptables: coredns reports dial tcp 10.43.0.1:443: connect: no route to host, but flushing the rules helps.
CentOS Linux release 7.7.1908 (Core)
3.10.0-1062.18.1.el7.x86_64
iptables v1.4.21
k3s version v1.17.3+k3s1 (5b17a175)
fwiw, CentOS 8 is a no-go until we make some changes to k3s and its dependencies. I am guessing it will be a while before all of that stuff is worked out.
We would like to support firewalld, ufw, and any other firewall. For the most part this just means adding docs on how to whitelist the cni interface & poke holes for the k3s api service or other services.
We have also added selinux support for containerd/cri and it is enabled by default in 1.17.4. This can cause issues as related here: https://github.com/rancher/k3s/issues/1583#issuecomment-605169698 (summary: use --disable-selinux for old behavior, or install the k3s-selinux policy & deal with selinux going forward).
I can confirm what @Drakkai is saying about the default path: when installed using the root user on CentOS7, the k3s executable and the kubectl command are working just after k3s is installed. And as suspected, I think it's more a firewall management issue.
@sraillard @Drakkai well that is really strange. But I wasn't the first one :(
https://serverfault.com/questions/833762/where-does-the-bash-path-on-centos-7-get-usr-local-bin-from
and https://bugs.centos.org/view.php?id=7492
Basically there is some inconsistency in when /usr/local/bin is added to the PATH and when it is not. But maybe one should keep that in mind ...
From my personal experience with CentOS7 /usr/local/bin is not in the path. I have modified my provisioning with Vagrant to include it, but for the rpm version of k3s we plan to install to /usr/bin.
/usr/local/bin isn't in root's $PATH on CentOS 7/8 by default, and including it there isn't recommended.
More to the point, you're vulnerable if there is danger of bad stuff being installed at /usr/local/bin. By forcing yourself to use the full path (/usr/local/bin/whatever) you don't have any risk of accidentally invoking bad stuff via $PATH
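Since reports in this thread disagree on whether /usr/local/bin is in root's PATH, checking the actual value on the host in question settles it. A minimal sketch follows; path_contains is a hypothetical helper name, and the sample value is the PATH quoted earlier from a fresh CentOS 7 installation.

```shell
# Succeed if directory $1 is an entry in the colon-separated list $2.
# Wrapping both sides in colons avoids partial matches such as
# /usr/local/bin matching /usr/local/bin2.
# path_contains is a hypothetical helper name.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# PATH value quoted earlier in the thread for a fresh CentOS 7 install.
sample_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin"

if path_contains /usr/local/bin "$sample_path"; then
    echo "/usr/local/bin is in PATH"
else
    echo "/usr/local/bin is NOT in PATH; call /usr/local/bin/k3s by full path"
fi
# On a live host: path_contains /usr/local/bin "$PATH"
```

Note that the result can differ between a login shell, sudo, and a su session, which may explain part of the disagreement above.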
I took the approach of
iptables -t nat -F
iptables -t mangle -F
iptables -F
iptables -X
service iptables save
to clear the iptables rules.
I also installed these packages via yum
so that the k3s-selinux policy would work.
In addition, I'd also encourage you to run on each node
sudo ethtool -K flannel.1 tx-checksum-ip-generic off
otherwise you will run into the problems coreos/flannel#1243 and #1638 with accessing applications in your cluster from worker nodes as described in these issues.
I found that
sudo ethtool -K flannel.1 tx-checksum-ip-generic off
will not persist across reboots, so you have to wrap it in a systemd service, /etc/systemd/system/flannel-tx-checksum-ip-generic-off.service:
[Unit]
Description=Ensure TX (outgoing) checksum offloading is disabled on flannel.1
After=sys-devices-virtual-net-flannel.1.device
[Install]
WantedBy=sys-devices-virtual-net-flannel.1.device
[Service]
Type=oneshot
ExecStart=/sbin/ethtool -K flannel.1 tx-checksum-ip-generic off
and then enable and start it. On reboot the master comes up fine, but the worker node does not come up okay: the k3s service is running, but routes that should be there are missing, and I cannot access the Docker registry hosted on the master and exposed through a loadBalancerIP via MetalLB. If I swap out CentOS 7 for Ubuntu using my Ansible automation, the problem doesn't exist.
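To verify after a reboot whether the offload setting actually stuck, the feature list from lowercase ethtool -k (which shows features, whereas uppercase -K sets them) can be parsed. This is a sketch under assumptions: tx_checksum_state is a made-up helper name, and the sample text is a shortened, illustrative feature listing rather than output from the nodes discussed here.

```shell
# Print the tx-checksum-ip-generic state ("on" or "off") found in
# `ethtool -k <iface>` output read on stdin.
# tx_checksum_state is a hypothetical helper name.
tx_checksum_state() {
    awk '$1 == "tx-checksum-ip-generic:" { print $2; exit }'
}

# Shortened, illustrative feature listing.
printf 'Features for flannel.1:\ntx-checksumming: on\n\ttx-checksum-ipv4: off [fixed]\n\ttx-checksum-ip-generic: on\n' \
    | tx_checksum_state
# On a live node: ethtool -k flannel.1 | tx_checksum_state
```

If this still reports on after boot, the unit probably ran before flannel.1 existed or was overridden when the interface was recreated.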
I am curious to see how this gets fixed in k8s and flannel. The core issue (according to https://github.com/kubernetes/kubernetes/issues/88986#issuecomment-620633097) is a bug in the kernel netfilter code that was exposed by some recent updates to k8s's netlink code, but the fix is unlikely to be back ported to RHEL7.
Hmm, I just restarted the k3s-agent service and problem went away... hmmm... My flannel-tx-checksum-ip-generic-off.service appears to kick off correctly, the node is added, etc. But w/o a restart this issue is not yet resolved. I may try something different. Interesting, @brandond.
Looks like flannel is going to just disable it automatically (https://github.com/coreos/flannel/pull/1282#issuecomment-617209151) 'til a fix in the kernel shows up from RH. These don't usually come fast. So, now I gotta figure out how to replace the Flannel that ships with k3s with the patched one.
This:
Set FirewallBackend=iptables in /etc/firewalld/firewalld.conf and restart firewalld.
Seems to be needed.
Comes from: https://github.com/rancher/k3s/issues/1711
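The firewalld.conf change mentioned above can be scripted. The following is a minimal sketch, assuming the file uses plain key=value lines; set_firewall_backend_iptables is a hypothetical helper name, and the demonstration runs against a scratch file rather than the real configuration.

```shell
# Set FirewallBackend=iptables in a firewalld.conf-style file ($1).
# Rewrites an existing FirewallBackend= line, or appends one if absent.
# set_firewall_backend_iptables is a hypothetical helper name.
set_firewall_backend_iptables() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    if grep -q '^FirewallBackend=' "$conf"; then
        sed 's/^FirewallBackend=.*/FirewallBackend=iptables/' "$conf" > "$tmp"
    else
        cat "$conf" > "$tmp"
        echo 'FirewallBackend=iptables' >> "$tmp"
    fi
    # Write back through the original file so its owner and mode are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Demonstration on a scratch file with illustrative contents.
demo_conf="$(mktemp)"
printf 'DefaultZone=public\nFirewallBackend=nftables\n' > "$demo_conf"
set_firewall_backend_iptables "$demo_conf"
cat "$demo_conf"
rm -f "$demo_conf"
```

On a real host the call would be set_firewall_backend_iptables /etc/firewalld/firewalld.conf as root, followed by systemctl restart firewalld, as the quoted instruction says.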
Adding my progress in testing with no changes made to iptables.
As the centos user, /usr/local/bin is in PATH; as root, /usr/local/bin is not in PATH.
Node OS: CentOS 7
k3s: v1.18.4+k3s1
Rancher version: 2.4.5
With selinux set to Enforcing mode:
Closing this issue as CentOS 7 validation is complete. Sonobuoy fails are tracked here #1960
Just a note: a basic k3s installation on CentOS 7 fails because there is no selinux-policy-base package.
After some digging around, it turns out k3s depends on selinux-policy-targeted/minimum on CentOS 7.
This is not obvious, so it should be added to the CentOS 7 validation learnings.
[vagrant@localhost ~]$ rpm -q --whatprovides selinux-policy-base
selinux-policy-targeted-3.13.1-266.el7_8.1.noarch
selinux-policy-minimum-3.13.1-266.el7_8.1.noarch
[vagrant@localhost ~]$ cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)
@noelmcloughlin do the steps documented here not work for you? https://rancher.com/docs/k3s/latest/en/advanced/#experimental-selinux-support
Since that package provides selinux-policy-base you should be able to simply yum install it as described.
@noelmcloughlin Trying to get more info on this. selinux-policy-targeted.noarch is already installed. So we don't explicitly need to install selinux-policy-base package.
Do you see k3s installation failing on CentOS Linux release 7.8.2003 (Core)?
rpm -q --whatprovides selinux-policy-base
selinux-policy-targeted-3.13.1-266.el7.noarch
yum list installed |grep selinux
libselinux.x86_64 2.5-15.el7 installed
libselinux-python.x86_64 2.5-15.el7 installed
libselinux-utils.x86_64 2.5-15.el7 installed
selinux-policy.noarch 3.13.1-266.el7 installed
selinux-policy-targeted.noarch 3.13.1-266.el7 installed
I am able to get it running using
yum install -y container-selinux
rpm -i https://rpm.rancher.io/k3s-selinux-0.1.1-rc1.el7.noarch.rpm
curl -sfL https://get.k3s.io | sh -
kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system metrics-server-7566d596c8-vnhpv 1/1 Running 0 4m35s
kube-system local-path-provisioner-6d59f47c7-j4mvd 1/1 Running 0 4m35s
kube-system helm-install-traefik-hzflz 0/1 Completed 0 4m35s
kube-system svclb-traefik-q4tx2 2/2 Running 0 4m21s
kube-system coredns-8655855d6-f5qsc 1/1 Running 0 4m35s
kube-system traefik-758cd5fc85-nmglj 1/1 Running 0 4m21s
Those instructions are for selinux enforcing I guess.
I was testing with selinux permissive (i.e. not targeting selinux, just generic use case).
I think installing K3S via script on CentOS was failing because that package was missing. It was a few days ago when I was exploring the issue.
@noelmcloughlin When you have selinux set to permissive mode you can skip the installation of rpms by setting INSTALL_K3S_SELINUX_WARN=true.
curl -sfL https://get.k3s.io | INSTALL_K3S_SELINUX_WARN=true sh -s -
Thanks, I missed that one.
Yeah to me "a system where SELinux is enabled by default" means enforcing or permissive - not absent or disabled. Maybe worth a clarifying change to the docs?
I remember the issue now. Running the script failed. It did not say you should have selinux=enforcing or set INSTALL_K3S_SELINUX_WARN=true, but instead threw an error message saying "ensure selinux-policy-base is installed", which indicated a packaging problem, not a SELinux != enforcing issue. The script error confused me.
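To make the two install paths from this exchange explicit, here is a small sketch that maps the mode printed by getenforce to what commenters above reported doing. The helper name k3s_selinux_hint is made up, and the mapping only restates this thread; it is not the installer's own logic.

```shell
# Suggest an install path from the SELinux mode printed by `getenforce`.
# The mapping restates what commenters in this thread reported; it is not
# taken from the k3s install script. k3s_selinux_hint is a hypothetical name.
k3s_selinux_hint() {
    case "$1" in
        Enforcing)
            echo "install container-selinux and k3s-selinux, then run the installer" ;;
        Permissive)
            echo "run the installer with INSTALL_K3S_SELINUX_WARN=true" ;;
        *)
            echo "mode not covered in this thread: $1" ;;
    esac
}

# On a live host (requires the SELinux utilities):
#   k3s_selinux_hint "$(getenforce)"
k3s_selinux_hint "Permissive"
```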
@ShylajaDevadiga I don't think I want this issue closed until CentOS 7 is 100% validated. I can't see that happening until the conformance tests pass cleanly and successfully on an official release. So, I think it's fine that you opened an issue specifically for the conformance test failures, but this issue should be held open until that one works.
To be honest, I'm also not sure that we can claim CentOS 7 support without revisiting selinux.
Closing issue as conformance tests have passed. Results tracked in https://github.com/rancher/k3s/issues/1960.
I just recently deployed K3S to a CentOS7 server. K3S was installed but the pods were not able to communicate to the api server just like described before. I had to disable firewalld to get things working. How is this ticket closed if the latest K3S should work on a CentOS7 environment? Am I missing something?
IMO this is expected behavior if you have firewall enabled. Installation of K3s doesn’t handle complete server configuration (correct me if I am wrong).
That is correct. It works on RHEL7 if you don't break it by blocking traffic or doing other things that would prevent it from working.
I agree that k3s can't configure all the server settings. The fact is that firewalld is enabled by default, so that's a classic issue (and many people have it). Maybe a solution could be checking some firewall rules and printing a warning if some rules may prevent k3s from working correctly?
For example, CentOS 7 AMIs (and I guess other cloud images) have firewalld disabled by default, but yeah, standard ISO installation has it enabled normally. But then, firewall could be also outside the server and also break K8s/K3s.
We need to expand our testing and identify any issues that prevent us from formally supporting CentOS. Keep in mind K3s is expected to work fine on CentOS 7. This issue is to track the testing effort required to formally support and certify the operating system (See https://rancher.com/docs/k3s/latest/en/installation/node-requirements/#operating-systems )
Currently there are existing issues with the os/centos label, but take care to note that these issues are not all necessarily caused just by utilizing CentOS. As such, it makes sense to review those GitHub issues, but we need to execute some testing and identify any other issues. As needed, we'll need to resolve these issues so we may fully support CentOS.
SELinux support is also needed, which is tracked separately here: https://github.com/rancher/k3s/issues/1372
gz#9311
gz#9743