crossplane-contrib / provider-upjet-aws

AWS Provider for Crossplane.
https://marketplace.upbound.io/providers/upbound/provider-family-aws/
Apache License 2.0

[Bug]: Lack of region leads to incorrect STS API call for IRSA credentials. #1308

Open Dennor opened 6 months ago

Dennor commented 6 months ago

Resource MRs required to reproduce the bug

apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: enable-aws-pod-identity
spec:
  serviceAccountTemplate:
    metadata:
      annotations:
        eks.amazonaws.com/role-arn: arn::some::oidc-provider
---
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: aws-iam
spec:
  package: xpkg.upbound.io/upbound/provider-aws-iam:v1.4.0
  runtimeConfigRef:
    name: enable-aws-pod-identity
---
apiVersion: iam.aws.upbound.io/v1beta1
kind: Role
metadata:
  name: some-role
spec:
  forProvider:
    assumeRolePolicy: |
      ...snip
    inlinePolicy:
    - name: some-policy
      policy: |
        ...snip

What happened?

I expected the provider to authenticate against the STS endpoint like others do. Unfortunately, because no region is set, the provider builds the STS endpoint without one and the call fails. It attempts to call sts..amazonaws.com, which is not a valid host.

Relevant Error Output Snippet

Warning  CannotConnectToProvider       9m34s (x29 over 32m)    managed/iam.aws.upbound.io/v1beta1, kind=role  cannot initialize the Terraform plugin SDK async external client: cannot get terraform setup: cache manager failure: cannot retrieve the AWS account ID: GetCallerIdentity query failed: operation error STS: GetCallerIdentity, get identity: get credentials: failed to refresh cached credentials, failed to retrieve credentials, operation error STS: AssumeRoleWithWebIdentity, https response error StatusCode: 0, RequestID: , request send failed, Post "https://sts..amazonaws.com/": dial tcp: lookup sts..amazonaws.com: no such host

Crossplane Version

1.15.2

Provider Version

1.4.0

Kubernetes Version

1.29.4

Kubernetes Distribution

k0s

Additional Info

Simply adding

  deploymentTemplate:
    spec:
      selector: {}
      template:
        spec:
          containers:
          - name: package-runtime
            env:
            - name: AWS_REGION
              value: eu-central-1

to the DeploymentRuntimeConfig fixes the issue.
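
Merged with the serviceAccountTemplate from the reproduction manifests, the complete DeploymentRuntimeConfig would look roughly like the following. This is a sketch only: the role ARN is the placeholder used in this report, and eu-central-1 is just the region from this example, so substitute your own values.

```yaml
apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: enable-aws-pod-identity
spec:
  serviceAccountTemplate:
    metadata:
      annotations:
        # Placeholder ARN copied from the reproduction manifests.
        eks.amazonaws.com/role-arn: arn::some::oidc-provider
  deploymentTemplate:
    spec:
      selector: {}
      template:
        spec:
          containers:
          - name: package-runtime
            env:
            # Without a region the STS host resolves to sts..amazonaws.com.
            - name: AWS_REGION
              value: eu-central-1
```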

erhancagirici commented 5 months ago

Hi @Dennor, thanks for reporting this. Could you also provide the YAML output of the provider-aws-iam pod?

The IRSA provider config implementation assumes that the provider pod runs on an EKS cluster. EKS injects several extra environment variables into IRSA-enabled pods, such as AWS_REGION, AWS_DEFAULT_REGION and AWS_STS_REGIONAL_ENDPOINTS, which influence the resulting AWS SDK configuration. See https://github.com/aws/amazon-eks-pod-identity-webhook?tab=readme-ov-file#aws_default_region-injection for reference.

For the Kubernetes distribution you use, I am not sure how IRSA-related configuration is injected, e.g. how the eks.amazonaws.com/role-arn annotation is handled. I assume that these variables are not automatically injected by your distribution. Could you describe your environment in a bit more detail?
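
For illustration, on a cluster where nothing injects these variables, the region-related ones named above could be set by hand in the provider container's env list. This is a hedged sketch, not the webhook's exact output: the region value is taken from this report, and `regional` is assumed here as the AWS_STS_REGIONAL_ENDPOINTS setting.

```yaml
# Fragment of a container spec; values are assumptions for illustration.
env:
- name: AWS_REGION
  value: eu-central-1
- name: AWS_DEFAULT_REGION
  value: eu-central-1
# Asks the AWS SDK to use the regional STS endpoint rather than the global one.
- name: AWS_STS_REGIONAL_ENDPOINTS
  value: regional
```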

aiell0 commented 3 months ago

@erhancagirici take a look at this closed issue for more context: https://github.com/crossplane-contrib/provider-upjet-aws/issues/1252

This is still not solved IMO. Ideally we could have something like:

apiVersion: aws.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: crossplane-provider-config
  namespace: kube-system
spec:
  credentials:
    source: PodIdentity

haarchri commented 3 months ago

Check out this PR: https://github.com/crossplane-contrib/provider-upjet-aws/pull/1459

github-actions[bot] commented 1 week ago

This provider repo does not have enough maintainers to address every issue. Since there has been no activity in the last 90 days it is now marked as stale. It will be closed in 14 days if no further activity occurs. Leaving a comment starting with /fresh will mark this issue as not stale.