Closed ichasco-heytrade closed 1 year ago
Are you still encountering this issue on 1.23?
I was able to reproduce this issue on 1.23 as well
Something I haven't yet identified sets net.ipv4.ip_forward=1 when bootstrap.sh runs in docker mode, but not in containerd mode, which means this control never had any effect in docker mode in the first place. In the meantime I'm mitigating this by commenting out the control, since with docker it was being overridden later on anyway.
I'm going to try to reproduce this in all our supported EKS versions. Do you have an example of how you're creating the cluster/nodes (e.g., eksctl config, Terraform)?
I was using Terraform for the nodes. If you're able to reproduce it, you should see the following on nodes running in docker mode:
# cat /proc/sys/net/ipv4/ip_forward
1
and in containerd mode
# cat /proc/sys/net/ipv4/ip_forward
0
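The manual check above can be wrapped in a small helper when comparing several nodes. This is only a sketch: the function name `ip_forward_state` and its optional path argument are illustrative and not part of the AMI tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether IPv4 forwarding is enabled.
# The optional argument (an assumption, mainly for testing) selects the file to read;
# by default it reads the live kernel value.
ip_forward_state() {
  local file="${1:-/proc/sys/net/ipv4/ip_forward}"
  local value
  # Strip the trailing newline; bail out if the file cannot be read.
  value="$(tr -d '[:space:]' < "$file")" || return 2
  case "$value" in
    1) echo "enabled" ;;
    0) echo "disabled" ;;
    *) echo "unknown: $value"; return 1 ;;
  esac
}
```

Running `ip_forward_state` with no arguments on a node prints `enabled` or `disabled` for the current kernel setting.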
I provisioned a cluster with this config using eksctl
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: container-runtime-test
  region: us-east-2
nodeGroups:
  - name: ng-1
    instanceType: m5.xlarge
    desiredCapacity: 2
    amiFamily: AmazonLinux2
    containerRuntime: containerd
and verified the version
k version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.26.1
Kustomize Version: v4.5.7
Server Version: v1.23.14-eks-ffeb93d
I SSHd to one of the nodes and checked containerd was running
sudo systemctl status containerd
● containerd.service - containerd container runtime
   Loaded: loaded (/usr/lib/systemd/system/containerd.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/containerd.service.d
           └─10-compat-symlink.conf
   Active: active (running) since Thu 2023-01-19 22:02:35 UTC; 1h 34min ago
     Docs: https://containerd.io
 Main PID: 3022 (containerd)
and the OS version
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
I verified the launch template was using AMI ami-097b4903ba6f2b624, which is the latest EKS 1.23 AL2 AMI in us-east-2
aws ssm get-parameter --name /aws/service/eks/optimized-ami/1.23/amazon-linux-2/recommended/image_id --query "Parameter.Value" --output text
ami-097b4903ba6f2b624
and ip_forward is set correctly
cat /proc/sys/net/ipv4/ip_forward
1
I'm going to try again with a 1.22 cluster and see if I get the same results.
Verified ip_forward was set on a 1.22 cluster using AMI ami-09ae6038e08d7e8ba, which is the latest 1.22 AMI in us-east-2.
Verified on a 1.21 cluster with ami-021b765d61a4b649f (latest 1.21 AL2 AMI) in us-east-1: ip_forward was set with containerd running.
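For repeating the AMI lookup across versions, something like the following could work. It is a sketch that reuses the SSM parameter path from the `aws ssm get-parameter` command earlier in this thread; the function names are made up for illustration, and the second function needs the AWS CLI with valid credentials.

```shell
#!/usr/bin/env bash
# Build the SSM parameter name for the recommended EKS-optimized AL2 AMI.
# The path pattern is copied from the command used earlier in the thread.
eks_al2_ami_param() {
  local k8s_version="$1"
  echo "/aws/service/eks/optimized-ami/${k8s_version}/amazon-linux-2/recommended/image_id"
}

# Hypothetical wrapper: look up the AMI ID for one Kubernetes version and region.
# Requires the AWS CLI and credentials; not called automatically.
latest_eks_al2_ami() {
  local k8s_version="$1" region="$2"
  aws ssm get-parameter \
    --region "$region" \
    --name "$(eks_al2_ami_param "$k8s_version")" \
    --query "Parameter.Value" --output text
}
```

For example, `latest_eks_al2_ami 1.22 us-east-2` should print the AMI ID to compare against the one in the launch template.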
It's possible eksctl is doing something extra with the node groups but I'd have to dig into it. If you have an AMI ID and region I can test with that would be helpful to verify.
It's possible eksctl is doing something extra with the node groups but I'd have to dig into it.
If it is, it would be in the user data most likely.
We had a similar issue.
Environment:
OS: AmazonLinux
OS Version: 2
EKS Version: 1.21 and 1.22
We implemented the STIG config for ip_forward using:
# Set OS to not perform packet forwarding unless system is a router, V-204625
function V204625() {
    # Regex1/Regex2 handle a commented-out entry, Regex3/Regex4 an active entry,
    # and Regex5 checks that the compliant "= 0" entry is present afterwards.
    local Regex1="^(\s*)#net.ipv4.ip_forward\s+\S+(\s*#.*)?\s*$"
    local Regex2="s/^(\s*)#net.ipv4.ip_forward\s+\S+(\s*#.*)?\s*$/\1net.ipv4.ip_forward = 0\2/"
    local Regex3="^(\s*)net.ipv4.ip_forward\s+\S+(\s*#.*)?\s*$"
    local Regex4="s/^(\s*)net.ipv4.ip_forward\s+\S+(\s*#.*)?\s*$/\1net.ipv4.ip_forward = 0\2/"
    local Regex5="^(\s*)net.ipv4.ip_forward\s*=\s*0?\s*$"
    local Success="Set system to not perform packet forwarding, per V-204625."
    local Failure="Failed to set the system to not perform packet forwarding, not in compliance V-204625."
    echo
    # Rewrite an existing entry if one matches; otherwise append the setting.
    ( (grep -E -q "${Regex1}" /etc/sysctl.conf && sed -ri "${Regex2}" /etc/sysctl.conf) || (grep -E -q "${Regex3}" /etc/sysctl.conf && sed -ri "${Regex4}" /etc/sysctl.conf)) || echo "net.ipv4.ip_forward = 0" >>/etc/sysctl.conf
    (grep -E -q "${Regex5}" /etc/sysctl.conf && echo "${Success}") || {
        echo "${Failure}"
        exit 1
    }
}
With this we had, in docker mode:
# cat /proc/sys/net/ipv4/ip_forward
1
containerd mode
# cat /proc/sys/net/ipv4/ip_forward
0
Resolution: We removed the ip_forward (V-204625) implementation.
Result, in docker mode:
# cat /proc/sys/net/ipv4/ip_forward
1
containerd mode
# cat /proc/sys/net/ipv4/ip_forward
1
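If removing the control from the build script isn't an option, the setting it leaves behind could be neutralised afterwards. The sketch below is one possible approach, not what was actually done here: the function name is hypothetical, it assumes GNU sed (as on Amazon Linux 2), and it only handles lines in the exact `net.ipv4.ip_forward = 0` form that the function above appends.

```shell
#!/usr/bin/env bash
# Hypothetical helper: comment out an active "net.ipv4.ip_forward = 0" entry.
# The file argument defaults to /etc/sysctl.conf (needs root there).
comment_out_ip_forward_zero() {
  local file="${1:-/etc/sysctl.conf}"
  # Keep any leading whitespace (\1) and prefix the entry (\2) with "# ".
  sed -ri 's/^(\s*)(net\.ipv4\.ip_forward\s*=\s*0\s*)$/\1# \2/' "$file"
}
```

On a node that is already running, the file change alone does not alter the live value; `sysctl -w net.ipv4.ip_forward=1` (as root) sets it for the current boot.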
For the cis-benchmark.sh file, you can comment out these lines: https://github.com/aws-samples/amazon-eks-custom-amis/blob/main/scripts/cis-benchmark.sh#L329-L331
Thanks for the extra info @kalgopa
My assumption is that the systems having this issue are using custom-built AMIs that don't perform the steps needed to enable ip_forward with non-docker container runtimes, even if they're based on the EKS-provided AMIs. If anyone on this ticket has an Amazon-published AMI that's experiencing this problem, or Terraform code to create a cluster, please let me know so I can reproduce it.
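To check whether a custom AMI carries such a setting, one option is to search the sysctl configuration for active ip_forward entries. This is a rough sketch with a made-up function name; it only finds persistent config lines and will not show values that a daemon sets at runtime.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every uncommented line that sets net.ipv4.ip_forward
# in the given files or directories, printed as path:line:content.
find_ip_forward_settings() {
  grep -rHnE '^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=' "$@" 2>/dev/null
}
```

A typical call would be `find_ip_forward_settings /etc/sysctl.conf /etc/sysctl.d /usr/lib/sysctl.d /run/sysctl.d`; any hit ending in `= 0` is a candidate for what disables forwarding in containerd mode.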
closing for now - @kalgopa please let us know if there is additional info and we can take another look
What happened:
With the EKS 1.21 AMI, if you want to use the containerd option, it will fail because with this option
sysctl_entry "net.ipv4.ip_forward = 0"
none of the deployed pods will have access to the network. With docker there isn't any problem.
Environment: