Do we know how these addons will be deployed? Will they be deployed as a deployment or a daemonset? Will we also be able to configure options in Coredns, like number of replicas (if a deployment) or coredns-specific options like autopathing?
@sparky005 to start, we are going to be onboarding the add-ons as they are run by EKS today for every cluster. This means CoreDNS as a Deployment and kube-proxy as a DaemonSet. For specific options, you'll have to configure those yourself by editing the API object on the cluster. We'll evaluate specific options like autopathing for these add-ons in the future.
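For reference, a minimal sketch of what editing those API objects directly looks like with kubectl, assuming the default resource names EKS uses in kube-system:
# CoreDNS is a Deployment and kube-proxy is a DaemonSet, both in kube-system
kubectl -n kube-system edit deployment coredns
kubectl -n kube-system edit daemonset kube-proxy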
Awesome, thanks for the quick answer @tabern!
@tabern As part of this, and for kube-proxy specifically, can you consider changing the default metricsBindAddress from 127.0.0.1:10249 to 0.0.0.0:10249? In order for Prometheus to monitor kube-proxy, the metrics endpoint needs to be exposed. This is currently the only customization that we do to kube-proxy in our setup, and I'm sure I'm not the only one.
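For anyone making this change themselves today, a minimal sketch (assuming the kube-proxy-config ConfigMap in kube-system that EKS ships, and noting that the add-on manager may revert manual edits):
# Change metricsBindAddress from 127.0.0.1:10249 to 0.0.0.0:10249 in the
# kube-proxy configuration, then restart the DaemonSet to pick it up.
kubectl -n kube-system edit configmap kube-proxy-config
kubectl -n kube-system rollout restart daemonset kube-proxy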
We're happy to announce that CoreDNS and kube-proxy are now supported as part of EKS add-ons. You can now manage all three core networking add-ons (coredns, kube-proxy, vpc cni) using the EKS APIs, including provisioning and updates.
More information
Note: add-on support for CoreDNS and kube-proxy is only available on the latest platform version of EKS (for each supported Kubernetes version, 1.18 or higher).
If your cluster is not already on the latest platform version, you can update to the next Kubernetes version or create a new cluster to use EKS add-ons for CoreDNS and kube-proxy. Alternatively, all EKS clusters will be updated to the latest platform versions over the coming quarter without requiring any action.
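As a rough sketch, provisioning and updating these add-ons through the EKS API might look like the following with the AWS CLI (the cluster name is a placeholder; pick an add-on version from describe-addon-versions):
# List available CoreDNS add-on versions for your Kubernetes version
aws eks describe-addon-versions --addon-name coredns --kubernetes-version 1.18
# Install the managed add-ons
aws eks create-addon --cluster-name my-cluster --addon-name coredns
aws eks create-addon --cluster-name my-cluster --addon-name kube-proxy
# Update to a specific version later
aws eks update-addon --cluster-name my-cluster --addon-name coredns --addon-version "$ADDON_VERSION"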
When I tried deploying kube-proxy as a managed add-on through Terraform, I had the following error:
Error: unexpected EKS add-on (eks-test-eu:kube-proxy) state returned during creation: creation not successful (CREATE_FAILED): Errors:
│ Error 1: Code: AccessDenied / Message: clusterrolebindings.rbac.authorization.k8s.io "eks:kube-proxy" is forbidden: user "eks:addon-manager" (groups=["system:authenticated"]) is attempting to grant RBAC permissions not currently held:
│ {APIGroups:["discovery.k8s.io"], Resources:["endpointslices"], Verbs:["get"]}
If you hit that, make sure to update the eks:addon-manager ClusterRole so that it includes the following block of permissions:
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - list
  - watch
  - get
Then run terraform apply and the add-on will deploy okay.
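For context, a hedged sketch of the Terraform resource in question (the aws_eks_addon resource from the AWS provider; the cluster name is taken from the error message above and the optional arguments are illustrative):
resource "aws_eks_addon" "kube_proxy" {
  cluster_name = "eks-test-eu"
  addon_name   = "kube-proxy"
  # addon_version     = "..."        # optional; defaults to the EKS default version
  # resolve_conflicts = "OVERWRITE"  # optional; how to treat existing self-managed config
}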
It would be great if someone from AWS could provide info on who is responsible for the eks:addon-manager cluster role and whether my updates to it will be overwritten by some reconcile process built into the EKS control plane.
The same issue came up in AWS Ireland (eu-west-1) and AWS China Beijing (cn-north-1).
Hi @marcincuber, it is possible that the eks:addon-manager role (or any other eks:* role, for that matter) will be occasionally overwritten. It's not reconciled on any sort of fixed cadence right now.
Once the add-on is installed successfully, it may not even need these permissions again unless something gets deleted and it has to re-create the ClusterRoleBinding. Another option for this particular case is removing the get permission on endpointslices from the system:node-proxier role, since it is not directly needed by kube-proxy (only list and watch are needed, as seen in the current bootstrap code) and is not given to the role by default.
In the future we expect the permissions of eks:addon-manager to expand, so these sorts of issues eventually won't be a problem.
@cheeseandcereal Do you know whether there is a publicly available git repository where the eks:addon-manager ClusterRole is configured? I would be more than happy to create a PR to add missing permissions when necessary.
There isn't any public repository with that info at the moment, unfortunately. The way that role gets updated may change in the future.
EKS addon manager persistently overrides custom Corefile. Is that ok?
I have run into a situation where I have to patch the coredns ConfigMap to add a Consul forwarder, but the EKS add-on manager keeps reverting it back to the default. Any suggestions or workarounds? I know Azure can take a coredns-custom ConfigMap for this; does EKS have something similar?
We have DNS servers in our on-premises infra and about 10 internal zones that strictly need to be forwarded to them. I had to revert the installation of CoreDNS as an add-on due to this issue. Waiting for the ability to use/include our own Corefile.
Update: I see an issue related to this problem already exists: #1275
I have a hacky workaround: you can edit the eks:addon-manager Role in the kube-system namespace to remove its update and patch permissions on ConfigMaps.
That seems like a dirty solution. I'd rather not recommend doing this, as it may have a negative impact on the kube-proxy and vpc-cni add-ons, which are working fine.
Editing the eks:addon-manager Role seems to be the only workaround currently, and it saved me from disaster.
If you only remove the update/patch permission for the coredns ConfigMap, the other add-ons should not be affected.
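As an illustration of that narrower approach, the edited Role might scope update/patch to the other add-on ConfigMaps via resourceNames instead of removing the verbs entirely. This is only a sketch: the ConfigMap names and remaining verbs are assumptions, not the exact rules EKS ships, and (as noted further down) manual changes to eks:* roles can be overwritten.
# kubectl -n kube-system edit role eks:addon-manager
# Illustrative replacement for the broad ConfigMap rule:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch", "create"]
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["kube-proxy", "kube-proxy-config"]  # assumed names; coredns intentionally left out
  verbs: ["update", "patch"]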
Is there a solution for this yet? I'm trying to add hosts entries to the coredns ConfigMap and it constantly gets overwritten! Is there a way to get past this without editing the permissions of the eks:addon-manager role?
I am facing the same problem as @vishwas2f4u and @dcherman. I would like to adjust the kube-proxy-config ConfigMap to change some conntrack settings, but my changes are constantly overwritten by the EKS add-on manager.
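For context, the kind of change involved is roughly the following fragment of the KubeProxyConfiguration held in the kube-proxy-config ConfigMap (the conntrack values here are just examples, and as described above the add-on manager currently reverts such edits):
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
conntrack:
  maxPerCore: 65536            # example value
  tcpEstablishedTimeout: 15m0s
  tcpCloseWaitTimeout: 1h0m0s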
Hi, we'd like to add zone antiaffinity to the coredns affinity rules, e.g.:
podAntiAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - podAffinityTerm:
      labelSelector:
        matchExpressions:
        - key: k8s-app
          operator: In
          values:
          - kube-dns
      topologyKey: kubernetes.io/hostname
    weight: 100
  - podAffinityTerm:           # to add this rule to the list
      labelSelector:
        matchExpressions:
        - key: k8s-app
          operator: In
          values:
          - kube-dns
      topologyKey: topology.kubernetes.io/zone
    weight: 100                # weight is required; value assumed here
As far as I can see, the affinity field is managed by eks:
f:spec:
  f:affinity:
    f:nodeAffinity:
      f:requiredDuringSchedulingIgnoredDuringExecution:
        f:nodeSelectorTerms: {}
    f:podAntiAffinity:
      f:preferredDuringSchedulingIgnoredDuringExecution: {}
It is important for us to make sure we always have a working CoreDNS instance in each AZ; we don't want to end up with all the pods in the same AZ.
Does the workaround of editing the eks:addon-manager Role actually work properly? When I look at the eks:addon-manager Role, it seems like f:rules is also managed server-side 🤔
Unfortunately you're right, it doesn't work reliably. The eks:addon-manager role is occasionally updated by EKS and any manual changes made to it are overwritten.
Anyone coming across this should upvote issue https://github.com/aws/containers-roadmap/issues/1275 to help prioritise a fix for this.
@gitfool @dcherman @itssimon @nuriel77 @mdrobny Do you know of any fix for that? Any change to the coredns ConfigMap is overwritten by the EKS coredns add-on. Thanks!
Curious: if any change to the ConfigMap will be overridden by the add-on manager, then how was this post done: https://aws.amazon.com/premiumsupport/knowledge-center/eks-conditional-forwarder-coredns/?nc1=h_ls
@yunfeilu-dev there is a note to only use this on self-managed CoreDNS, but I suspect that if you get the field management configured correctly you could do this with a managed add-on, as long as you're not blocking fields it needs to change.
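For anyone who wants to experiment with that, a hedged sketch of taking field ownership of the Corefile with server-side apply (the manifest file and field-manager name here are hypothetical, and the add-on manager may still reclaim the fields later):
# Apply your own coredns ConfigMap with server-side apply and a custom field
# manager; --force-conflicts takes ownership of the conflicting fields.
kubectl apply --server-side --force-conflicts \
  --field-manager=corefile-override \
  -f coredns-configmap.yaml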
uhhhhh, thanks for answering my dumb question.
We use IaC (Terraform) to create and bootstrap our clusters running on Outposts. We would like the ability to add a forwarder to the CoreDNS config without the CoreDNS add-on overwriting it.
I wasn't smart enough to get the managed fields working with eks+coredns nor was I dedicated enough to get the official helm chart to work.
Instead what got me rolling past my issue was:
eksctl delete addon --cluster us --name coredns --preserve
This will delete the add-on from the EKS add-on manager but keep all the manifests it generated. Apply your new ConfigMap/Deployment changes and restart the pods as usual.
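In practice the follow-up steps might look something like this (a sketch; the actual Corefile edit depends on your forwarder):
# After removing the add-on (manifests preserved), edit the Corefile and
# restart CoreDNS so the pods reload it.
kubectl -n kube-system edit configmap coredns
kubectl -n kube-system rollout restart deployment coredns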
I think Amazon maybe should have released a custom Helm chart for CoreDNS and kube-proxy similar to the one they released for VPC-CNI; that way, people who just want AWS to manage everything for them could use the add-ons, but people who need to be able to customize e.g. their Corefile could do so by deploying via Helm (without having to use the upstream coredns chart, which doesn't line up with what Amazon deploys and cannot really be coerced to do so). Right now we're already using the VPC-CNI chart instead of the add-on so we can reliably customize the configuration, and we'd happily do the same for coredns if there was an available chart that actually worked.
I've been having good luck using the official coredns helm chart with the eks-optimized images in ECR.
@voidlily could you share the values you're using?
A fairly standard helm config should work, with the caveat that we want existing configuration in a fresh cluster to still work the same as when aws's coredns deployment/service was installed.
autoscaler:
  enabled: true
image:
  # https://docs.aws.amazon.com/eks/latest/userguide/managing-coredns.html
  repository: "602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/coredns"
  tag: "v1.8.7-eksbuild.1"
service:
  name: "kube-dns"
  # the default kube-dns service in a fresh eks cluster is hardcoded to x.x.x.10; reuse the ip to avoid reconfiguration
  clusterIP: "10.10.0.10"
Then, before running the helm chart, I run this shell script to remove aws's installation and annotate the existing kube-dns service so helm can manage it:
kubectl --namespace kube-system delete deployment coredns
kubectl --namespace kube-system annotate --overwrite service kube-dns meta.helm.sh/release-name=coredns
kubectl --namespace kube-system annotate --overwrite service kube-dns meta.helm.sh/release-namespace=kube-system
kubectl --namespace kube-system label --overwrite service kube-dns app.kubernetes.io/managed-by=Helm
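After that, installing the chart itself might look like the following (the repo URL is the official CoreDNS chart repository; the release name and values file name are assumptions):
# Add the official CoreDNS chart repo and install it with the values above
helm repo add coredns https://coredns.github.io/helm
helm upgrade --install coredns coredns/coredns \
  --namespace kube-system \
  -f values.yaml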
I have a similar helm import script I use when dealing with vpc-cni's helm chart to avoid connectivity interruptions
NAMESPACE="kube-system"
for kind in daemonSet clusterRole clusterRoleBinding serviceAccount; do
  if kubectl --namespace $NAMESPACE get --ignore-not-found $kind/aws-node | grep aws-node; then
    echo "setting annotations and labels on $kind/aws-node"
    kubectl --namespace $NAMESPACE annotate --overwrite $kind aws-node meta.helm.sh/release-name=aws-vpc-cni
    kubectl --namespace $NAMESPACE annotate --overwrite $kind aws-node meta.helm.sh/release-namespace=kube-system
    kubectl --namespace $NAMESPACE label --overwrite $kind aws-node app.kubernetes.io/managed-by=Helm
  else
    echo "skipping $kind/aws-node as it does not exist in $NAMESPACE"
  fi
done
@voidlily You don't run into problems with the fact that the ClusterRole used by your new deployment from the chart has fewer permissions than the one normally used by the EKS builds of CoreDNS? EKS's system:coredns ClusterRole grants get on nodes, but the chart one doesn't give that permission.
Mind, it's unclear to me whether the EKS role grants that permission because they've customized coredns such that it needs it, or if it is just something left over from prior versions, or what.
I can't say I've run into issues with the clusterrole differences personally, no
I ended up digging into our EKS audit logs in CloudWatch and I can't find a single occurrence where system:serviceaccount:kube-system:coredns has been used to access any resource type other than endpoints, namespaces, and services (presumably pods would have shown up if I had pod verification turned on).
Then I got even more curious and dug into the coredns source code. Here is the only place I could find where CoreDNS tries to query node information:
// GetNodeByName return the node by name. If nothing is found an error is
// returned. This query causes a roundtrip to the k8s API server, so use
// sparingly. Currently this is only used for Federation.
func (dns *dnsControl) GetNodeByName(ctx context.Context, name string) (*api.Node, error) {
	v1node, err := dns.client.CoreV1().Nodes().Get(ctx, name, meta.GetOptions{})
	return v1node, err
}
So I think it's safe to lose. Kubernetes federation isn't (as far as I know) a GA feature, and if it ever becomes one presumably the official CoreDNS chart will add those permissions back in.
Add support for managing CoreDNS and kube-proxy with EKS add-ons.
Roadmap feature from #252