Closed (ichundu closed this issue 2 years ago)
Hey @ichundu,
I would have assumed this is an issue with the host rather than with a specific Kubernetes version.
Some months ago I was able to get the linkerd up and running successfully with the same setup (except kubernetes version was 1.20).
Was this also on Oracle Linux 8.5? Looking at their documentation, it seems that they don't use iptables as their firewall. Have you tried following the steps in the docs to enable it on your host machines?
I'd first try to see whether enabling iptables explicitly helps. If it does, then we know where the problem is :)
Hi @mateiidavid,
The documentation you linked is for RHEL 7 and explains the usage of ipset and iptables, so I don't think it is relevant to my problem. I think the iptables binary is installed in the linkerd-init container image.
Was this also on Oracle Linux 8.5? It may have been 8.4, I don't remember exactly. Oracle Linux is "nearly" identical to RHEL.
After some further research, I think I have narrowed down the issue. The linkerd-init container is running the image cr.l5d.io/linkerd/proxy-init:v1.4.0, which includes the legacy iptables binary. The Red Hat 8 family of distributions (RHEL, CentOS, Rocky Linux, AlmaLinux, Oracle Linux) uses nf_tables instead, and iptables is just a wrapper around nf_tables (see here). When RHEL 8 was released, this caused issues with Docker and with CNI plugins like Calico that relied on legacy iptables, but they have since added support for iptables (nf_tables) on RHEL 8.
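For anyone who wants to check their own nodes, here is a minimal sketch of how the backend can be read from the version string. It assumes iptables 1.8 or newer, which appends "(nf_tables)" or "(legacy)" to the output of `iptables --version`; the helper name `iptables_backend` is made up for this example.

```shell
# Classify the backend named in an `iptables --version` string.
# iptables 1.8+ appends "(nf_tables)" or "(legacy)"; older builds print neither,
# in which case this helper (a hypothetical name) reports "unknown".
iptables_backend() {
  case "$1" in
    *nf_tables*) echo "nf_tables" ;;
    *legacy*)    echo "legacy" ;;
    *)           echo "unknown" ;;
  esac
}

# Example, run on a cluster node where iptables is installed:
#   iptables_backend "$(iptables --version)"
```

If the host reports nf_tables while the init container ships only the legacy binary, that would be consistent with the mismatch described above.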
If this is the issue, it means that the linkerd-init container cannot work in any RHEL 8 based cluster, even in OpenShift, which uses nftables. I found this other issue, "Openshift 4.5 Install Fails on IPTables", which describes a similar problem in OpenShift; the suggestion there is to use linkerd-cni.
I tried installing the linkerd-cni DaemonSet, which removes the need for the linkerd-init container, and it installs successfully.
If you don't plan to add support for Red Hat distros without the CNI plugin, you can close this issue.
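For reference, a rough sketch of the CNI-based install, assuming the `linkerd` CLI (2.10 or 2.11) and `kubectl` are on the PATH and pointed at the target cluster. The wrapper function name is hypothetical, and this is not necessarily the exact sequence used above.

```shell
# Install the Linkerd CNI plugin first, then the control plane with the
# init container disabled in favour of the CNI plugin.
# `install_linkerd_with_cni` is a hypothetical wrapper name for this sketch.
install_linkerd_with_cni() {
  # Deploys the linkerd-cni DaemonSet that sets up the iptables rules per pod.
  linkerd install-cni | kubectl apply -f - &&
  # Tells the control plane (and injected pods) to skip linkerd-init.
  linkerd install --linkerd-cni-enabled | kubectl apply -f -
}

# Call install_linkerd_with_cni to run both steps in order.
```

On Linkerd 2.12 and later, the CRDs have to be installed first with `linkerd install --crds`.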
Thank you!
Hey @ichundu, sorry about the resources, it was a pretty quick skim on my side. Great links attached btw, thanks for being so thorough.
Ah, that makes complete sense, re: using iptables-legacy. I knew there was a difference, but assumed all kernels would work with iptables-legacy; it seems that I was wrong. I think turning that on in the kernel is going to be too complicated, so I'd be happy to find a workaround for this.
Something that puzzled me, perhaps you have some insight. When using a CNI, does the CNI itself translate the iptables queries to nf_tables? Or does it use something akin to iptables in nf mode? I'd suppose the latter, but I wanted to double check. Perhaps the workaround could be as simple as using the iptables nf mode frontend in our init container to keep changes minimal but also ensure our forwarding rules work well in all modern kernels. I think the trend is to move away from iptables, it's just too big of a change from my pov to be worth it.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
Hope this gets fixed. Otherwise, we either have to abandon Linkerd, which we love, or keep using Oracle Linux 7. :)
@tonychoe hey, we're working on improving the CNI plugin at the moment. We are considering rewriting the iptables queries to nftables, which should allow the init container to run on Oracle Linux hosts. Have you considered using the CNI plugin as an alternative?
@mateiidavid thanks for the suggestion. I've started using the CNI plugin which worked well on both Oracle Linux 7 and 8!
In Linkerd 2.12, you can now configure proxy-init to use iptables-nft: https://linkerd.io/2.12/features/nft/
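As a sketch of what that looks like from the CLI, assuming Linkerd 2.12 and the `proxyInit.iptablesMode` value described on the linked page (the wrapper function name is hypothetical; check the page for the authoritative form):

```shell
# Install Linkerd 2.12 with proxy-init switched to the iptables-nft frontend.
# `install_linkerd_nft_mode` is a hypothetical wrapper name for this sketch.
install_linkerd_nft_mode() {
  # 2.12 splits the CRDs into their own install step.
  linkerd install --crds | kubectl apply -f - &&
  # Assumed value name from the linked docs: proxyInit.iptablesMode=nft.
  linkerd install --set "proxyInit.iptablesMode=nft" | kubectl apply -f -
}

# Call install_linkerd_nft_mode to run both steps in order.
```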
What is the issue?
iptables command fails in linkerd-init container (in all linkerd pods) with the error message shown in the log snippet below. linkerd installation is not successful.
How can it be reproduced?
linkerd check --pre
linkerd install | kubectl apply -f -
Logs, error output, etc
linkerd-init container log:
output of linkerd check --pre command:
output of linkerd check -o short command:
Environment
Possible solution
No response
Additional context
I have tried many combinations in my setup, but I always get the same error. I'm using cri-o as the container runtime, but I have also tried containerd and get the exact same result. I have tried versions 2.10, 2.11 and edge of linkerd without success. Some months ago I was able to get the linkerd up and running successfully with the same setup (except kubernetes version was 1.20). I have tried installing from the CLI following the getting started guide, and also with the official Helm charts.
Would you like to work on fixing this bug?
No response