aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[EKS] [eks-pod-identity] [bug]: Setting the STS Session name in eks-pod-identity-agent #2362

Open taer opened 4 weeks ago

taer commented 4 weeks ago

I think this is probably more of a bug report, but I could not find the proper channel for it.

Version info: EKS 1.29, EKS Pod Identity Agent v1.2.0-eksbuild.1

We are using eks-pod-identity. It has been working great until we started using IAM-based Kafka (MSK).

I directly hit the eks-pod-identity agent from inside a container, using the endpoint and token file from these environment variables:

       AWS_CONTAINER_CREDENTIALS_FULL_URI:      http://169.254.170.23/v1/credentials
       AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE:  /var/run/secrets/pods.eks.amazonaws.com/serviceaccount/eks-pod-identity-token

I took the resulting AccessKey, SecretKey, and Token and used them to run aws sts get-caller-identity:

{
    "UserId": "AROAXYZP66II6MSBPLOUU:eks-k8s-wl-dev-engine-boo-5af5e7ac-5754-49ea-b28f-2c2a2eb95fbb",
    "Account": "BLAH",
    "Arn": "arn:aws:sts::BLAH:assumed-role/dev-use1-default-bookingApiPod-20240529171038526700000001/eks-k8s-wl-dev-engine-boo-5af5e7ac-5754-49ea-b28f-2c2a2eb95fbb"
}

The session name is not static, and I can't find any way to force it to a fixed value. The issue comes from our MSK IAM usage: MSK doesn't allow the principal to change during re-authentication. We get this error when the session name changes:

failed authentication due to: Cannot change principals during re-authentication from IAM.arn:aws:sts::BLAH:assumed-role/prd-use1-default-bookingApiPod-20240529174141238800000002/eks-k8s-wl-prd-engine-boo-064f1ed1-2349-4774-b895-9a69ccc3eeb1: IAM.arn:aws:sts::BLAH:assumed-role/prd-use1-default-bookingApiPod-20240529174141238800000002/eks-k8s-wl-prd-engine-boo-875fdfcd-19cd-4ac3-8544-7077f94a6e39

Most services we've used IAM with to date don't care. The usual solution would be to set AWS_ROLE_SESSION_NAME when calling sts:AssumeRole. But we're not making that call; the eks-pod-identity pod is. The ContainerCredentialProvider in the SDK just calls the AWS_CONTAINER_CREDENTIALS_FULL_URI URL with the contents of AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE as the Authorization header. As far as I've found, there is no option to have that request include a constant session name to satisfy MSK.
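For anyone who wants to reproduce the manual check, here is a minimal Python sketch of the request described above. It assumes the two environment variables are set as shown earlier; the function names are made up for illustration, stripping whitespace from the token is my assumption rather than confirmed SDK behavior, and this is not the SDK's actual implementation.

```python
import json
import os
import urllib.request


def build_credentials_request(env=os.environ):
    """Build the GET request: endpoint from the URI variable, token file contents as the Authorization header."""
    uri = env["AWS_CONTAINER_CREDENTIALS_FULL_URI"]
    token_file = env["AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE"]
    with open(token_file, encoding="utf-8") as f:
        # Assumption: surrounding whitespace/newlines are not part of the token.
        token = f.read().strip()
    return urllib.request.Request(uri, headers={"Authorization": token})


def fetch_credentials(env=os.environ, timeout=2.0):
    """Send the request and return the parsed JSON body (only works from inside a pod)."""
    req = build_credentials_request(env)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Note that nothing in this request carries a session name: the only inputs are the URL and the token, which is why there is no obvious client-side knob for it.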

Thanks!

prateekgogia commented 3 weeks ago

I directly hit the eks-pod-identity pod inside a container agent via

IIUC, you are manually calling the AWS_CONTAINER_CREDENTIALS_FULL_URI endpoint with the token from inside a pod running on your node, and then using the returned credentials to make the aws sts get-caller-identity CLI call.

I'm trying to understand: do these credentials work from inside a pod when an application connects to Kafka, and fail only when you use them manually with the AWS CLI? Or do they never work with Kafka?

taer commented 3 weeks ago

Yeah. Basically, I was manually doing what the SDK does, just to validate.

The credentials given by pod-identity work perfectly with Kafka until they reach their expiration. When the SDK goes back to the pod-identity-agent near the expiration, the agent refreshes the credentials for it. The issue, though, is that it refreshes them with a new session name, and the MSK IAM auth scheme hates that. It considers the session-name change to be a change in principal, and fails with:

failed authentication due to: Cannot change principals during re-authentication from IAM.arn:aws:sts::BLAH:assumed-role/prd-use1-default-bookingApiPod-20240529174141238800000002/eks-k8s-wl-prd-engine-boo-064f1ed1-2349-4774-b895-9a69ccc3eeb1: IAM.arn:aws:sts::BLAH:assumed-role/prd-use1-default-bookingApiPod-20240529174141238800000002/eks-k8s-wl-prd-engine-boo-875fdfcd-19cd-4ac3-8544-7077f94a6e39

The important part is the change in the session name here:

arn:aws:sts......roleName/eks-k8s-wl-prd-engine-boo-064f1ed1-2349-4774-b895-9a69ccc3eeb1
arn:aws:sts......roleName/eks-k8s-wl-prd-engine-boo-875fdfcd-19cd-4ac3-8544-7077f94a6e39
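To make the difference concrete, here is a small Python sketch that splits the two full ARNs from the error message into role name and session name. The helper name is hypothetical; it only illustrates that the role is identical and the session name is what changed.

```python
def split_assumed_role_arn(arn):
    """Return (role_name, session_name) from an STS assumed-role ARN."""
    # arn:aws:sts::<account>:assumed-role/<role>/<session>
    resource = arn.split(":", 5)[5]
    kind, role_name, session_name = resource.split("/", 2)
    if kind != "assumed-role":
        raise ValueError(f"not an assumed-role ARN: {arn}")
    return role_name, session_name


ROLE = "prd-use1-default-bookingApiPod-20240529174141238800000002"
before = f"arn:aws:sts::BLAH:assumed-role/{ROLE}/eks-k8s-wl-prd-engine-boo-064f1ed1-2349-4774-b895-9a69ccc3eeb1"
after = f"arn:aws:sts::BLAH:assumed-role/{ROLE}/eks-k8s-wl-prd-engine-boo-875fdfcd-19cd-4ac3-8544-7077f94a6e39"

role_before, session_before = split_assumed_role_arn(before)
role_after, session_after = split_assumed_role_arn(after)

same_role = role_before == role_after            # the IAM role did not change
same_session = session_before == session_after   # only the session name did
```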

That sessionName of pod-name-uuid is generated by the eks-pod-identity-agent. Ideally, we could configure the agent to use just the pod name and leave out the UUID, and we'd be golden.
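As an illustration of the behavior being asked for, the sketch below strips the trailing UUID from the session names seen above. This is not an existing agent option, and the function name is hypothetical; it only shows that the two session names would be identical without the UUID suffix.

```python
import re

# Matches a trailing "-<uuid>" in the 8-4-4-4-12 lowercase hex form seen in the session names above.
_UUID_SUFFIX = re.compile(
    r"-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def stable_session_name(session_name):
    """Drop the trailing UUID so the name stays constant across credential refreshes."""
    return _UUID_SUFFIX.sub("", session_name)


first = stable_session_name("eks-k8s-wl-prd-engine-boo-064f1ed1-2349-4774-b895-9a69ccc3eeb1")
second = stable_session_name("eks-k8s-wl-prd-engine-boo-875fdfcd-19cd-4ac3-8544-7077f94a6e39")
```

With a stable name like this, the refreshed credentials would presumably resolve to the same principal ARN from MSK's point of view, though that would need to be confirmed against the agent and MSK.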

dims commented 1 week ago

fyi code is here now - https://github.com/aws/eks-pod-identity-agent - can we please move this to an issue there? 🙏🏾