crossplane-contrib / provider-upjet-aws

Official AWS Provider for Crossplane by Upbound.
https://marketplace.upbound.io/providers/upbound/provider-aws
Apache License 2.0
137 stars 112 forks source link

[Bug]: The IAM Role resource uses role names as external ID causing conflict between role using the same name but different paths #1376

Open gadiener opened 1 week ago

gadiener commented 1 week ago

Is there an existing issue for this?

Affected Resource(s)

Resource MRs required to reproduce the bug

apiVersion: iam.aws.upbound.io/v1beta1
kind: Role
metadata:
  labels:
    crossplane.io/composite: my-awesome-role
    type: service-account-role
  annotations:
    crossplane.io/external-name: demo-role
  name: demo-role
spec:
  deletionPolicy: Delete
  forProvider:
    assumeRolePolicy: '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Action":["sts:AssumeRoleWithWebIdentity"],"Principal":{"Federated":"test"},"Condition":{"StringLike":{"test:aud":"sts.amazonaws.com","test:sub":"system:serviceaccount:demo-ns:demo-sa"}}}]}'
    description: demo-role used by service-account demo-ns:demo-sa
    forceDetachPolicies: true
    path: /my-path/demo-ns/
    permissionsBoundary: arn:aws:iam::test:policy/edge-case/my-path/boundary.assume-role
    tags:
      Stage: test
  providerConfigRef:
    name: test-test-iamrole-provider-config
apiVersion: iam.aws.upbound.io/v1beta1
kind: Role
metadata:
  labels:
    crossplane.io/composite: my-awesome-role
    type: service-account-role
  annotations:
    crossplane.io/external-name: demo-role
  name: demo-role2
spec:
  deletionPolicy: Delete
  forProvider:
    assumeRolePolicy: '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Action":["sts:AssumeRoleWithWebIdentity"],"Principal":{"Federated":"test"},"Condition":{"StringLike":{"test:aud":"sts.amazonaws.com","test:sub":"system:serviceaccount:demo-ns:demo-sa"}}}]}'
    description: demo-role used by service-account demo-ns:demo-sa
    forceDetachPolicies: true
    path: /my-path/demo-ns-2/
    permissionsBoundary: arn:aws:iam::test:policy/edge-case/my-path/boundary.assume-role
    tags:
      Stage: test
  providerConfigRef:
    name: test-test-iamrole-provider-config

note that the crossplane.io/external-name label value is the same for both resources while spec.forProvider.path have different values

Steps to Reproduce

Create 2 roles using the same name but different paths, the crossplane.io/external-name uses only the role name therefore the Crossplane it's not able to differentiate between the 2 cloud resources (should we use ARNs instead or role path + name?)

What happened?

It is not possible to create 2 roles with the same name but different role paths

Relevant Error Output Snippet

No response

Crossplane Version

v1.15.0

Provider Version

v1.1.1

Kubernetes Version

v1.28.9

Kubernetes Distribution

EKS

Additional Info

I'd love to contribute with a fix but I'd need someone to guide me through

mbbush commented 2 days ago

Why is it a problem for multiple managed resources to have the same external name? I'm surprised to see the external name present as a label. I'd expect it to be an annotation.

Changing the external name configuration of this resource, which is quite possibly the single most commonly used resource in the provider, will require substantial testing around how it behaves in various cases, including before/during/after installing an upgraded provider.

I've never tried using multiple roles with the same name and different path. Given that the terraform id is the same, it's not a given that would work properly even without the label collision.

gadiener commented 1 day ago

Hi @mbbush!

You are right about the external name, it should be an annotation. I fixed the examples in the description.

I agree that's a challenging change.

Let me explain our use case: In our platform based on Kubernetes we allow user to create AWS roles using a composition claim we provide. The composition takes the claim namespace and the claim name and compose the IAM role path and name following a standard similar to /<namespace>/<role-name>. The role is then attached to a service account using the IRSA annotation.

That allows users to create unique roles in different namespaces, as the namespace is included in the role path making it unique.

Everything works smoothly when creating the first role:

  1. Claim for role-a is created in namespace namespace-y
  2. Composition creates Role resource with name role-a-<unique-suffix-based-on-namespace-y> (we use the suffix to make the resource unique as it is not namespaced) and external name annotation role-a and role path set in the spec
  3. IAM Role is created in AWS
  4. Everyone is happy

When the second role is created:

  1. Claim for role-a is created in namespace namespace-z
  2. Composition creates Role resource with name role-a-<unique-suffix-based-on-namespace-z> and external name annotation role-a and role path set in the spec (I believe the issue is here as neither the path nor the arn is used in the external name annotation)
  3. Role resource fails the sync with an error similar to resource already exists as a resource with role name role-a was already created from the namespace-y

We also considered a different approach putting the namespaces as prefix in the role name. The issue that we faced was that AWS have a 64 character limit on the role name and we reached this limit in some scenario where we had long namespace names.

Hope that helps in understanding what we are trying to do. Let me know what you think. It feels wrong to use an ID in the provider that cannot uniquely identify the related cloud resource. Meanwhile I'll open an issue in the Terraform provider to brainstorm ideas also from there.