admiraltyio / admiralty

A system of Kubernetes controllers that intelligently schedules workloads across clusters.
https://admiralty.io
Apache License 2.0

Enable logs/exec for EKS 1.19+ #120

Open adrienjt opened 3 years ago

adrienjt commented 3 years ago

EKS stopped signing node server certs requested with the certificates.k8s.io/v1beta1 API, but would it sign them with v1? (Other distributions, e.g., AKS and GKE, continue to sign node server certs with the certificates.k8s.io/v1beta1 API up to and including 1.21, so EKS's behavior is surprising.)

You can continue to request that a CSR to is signed for a non-node server cert, webhooks (for example, with the certificates.k8s.io/v1beta1 API). [sic]

https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html#kubernetes-1.19

adrienjt commented 2 years ago

fixed by #139

adrienjt commented 2 years ago

This is more complicated than I thought.

By design, EKS does not issue certificates for CSRs with signerName "kubernetes.io/kubelet-serving" unless the CSR was actually requested by a kubelet. EKS's custom signer validates this by checking that the SANs requested in such CSRs match an actual EC2 instance's IPs/DNS names. In other words, EKS only signs kubelet-serving CSRs for actual kubelets, not for other components posing as kubelets.

https://github.com/aws/containers-roadmap/issues/1604#issuecomment-1072660824

We must use that signer because it is the only one trusted by the API server when it calls logs/exec endpoints.

Currently, Admiralty creates and self-approves a CSR for the Admiralty controller-manager/agent pod IP, and the Node object representing each target specifies that IP as its address. The pod exposes logs/exec endpoints at port 10250, the default kubelet port.

Here's a trick I thought of: we could expose those endpoints with hostPort on a non-default port (the default is already taken by the kubelet of the node hosting the Admiralty pod), and change the virtual Node objects to use the hosting node's IP as their address (status.addresses) along with the chosen port (status.daemonEndpoints.kubeletEndpoint.Port). The EKS control plane should then sign the CSR because the IP is that of an actual EC2 instance. We'd also need to make sure that security groups allow traffic from the API server to the virtual kubelet port.
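To make the Node side of this concrete, here is a rough sketch of the status fragment a virtual node would need to carry. It uses only the standard library and hand-written structs that mirror the relevant Node status JSON fields; the function name virtualNodeStatusPatch, the IP, and port 10350 are placeholders, and whether EKS accepts this setup is exactly what remains to be confirmed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal mirrors of the Node status fields involved. Note that the
// kubelet endpoint's port is serialized as "Port" (capitalized) in the
// core/v1 API.
type nodeAddress struct {
	Type    string `json:"type"`
	Address string `json:"address"`
}

type daemonEndpoint struct {
	Port int32 `json:"Port"`
}

type nodeDaemonEndpoints struct {
	KubeletEndpoint daemonEndpoint `json:"kubeletEndpoint"`
}

type nodeStatus struct {
	Addresses       []nodeAddress       `json:"addresses"`
	DaemonEndpoints nodeDaemonEndpoints `json:"daemonEndpoints"`
}

type nodeStatusPatch struct {
	Status nodeStatus `json:"status"`
}

// virtualNodeStatusPatch (hypothetical helper) builds the status fragment
// pointing the API server at the hosting node's IP and a non-default port.
func virtualNodeStatusPatch(hostIP string, port int32) ([]byte, error) {
	return json.Marshal(nodeStatusPatch{
		Status: nodeStatus{
			Addresses: []nodeAddress{{Type: "InternalIP", Address: hostIP}},
			DaemonEndpoints: nodeDaemonEndpoints{
				KubeletEndpoint: daemonEndpoint{Port: port},
			},
		},
	})
}

func main() {
	// Placeholder values: the hosting node's IP and the chosen hostPort.
	patch, err := virtualNodeStatusPatch("10.0.1.23", 10350)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
}
```

The same port would have to be used as the hostPort of the Admiralty pod and opened in the relevant security groups.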

Let's first confirm whether the EKS control plane would sign such a CSR...

cc @fakeburst

fakeburst commented 2 years ago

@adrienjt I'll try this approach in my environment

flowinh2o commented 5 months ago

Any update on this? I am trying to get this working on an EKS cluster running version 1.29 and am seeing the pod logs/exec error in the controller logs.

main.go:329] timed out waiting for virtual kubelet serving certificate to be signed, pod logs/exec won't be supported

I am running the Admiralty helm chart version 0.16.0. Thank you