argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

trouble using --aws-role-arn option when adding EKS cluster with argocd CLI #2347

Closed: jeremyhermann closed this issue 1 year ago

jeremyhermann commented 5 years ago

I am trying to use the --aws-role-arn option when adding an EKS cluster to ArgoCD, as described in https://github.com/argoproj/argo-cd/issues/1304. I have not been able to get it to work; the error messages are difficult to interpret and I am not sure how to debug further.

$ argocd cluster add acme-production --aws-cluster-name arn:aws:eks:us-west-2:<account-number>:cluster/acme-production --aws-role-arn arn:aws:iam::<account-number>:role/acme-production-deploy-role

FATA[0000] rpc error: code = Unknown desc = REST config invalid: Get https://<eks-cluster-id>.yl4.us-west-2.eks.amazonaws.com/version?timeout=32s: getting credentials: exec: exit status 1 

Note that I am able to successfully add the cluster using argocd cluster add https://<eks-cluster-id>.yl4.us-west-2.eks.amazonaws.com

Thanks

mdellavedova commented 1 year ago

@mmerickel thanks for your help. I have changed the setup to what you suggested. The secret:

apiVersion: v1
kind: Secret
metadata:
  name: secret-name
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: "cluster name shown in argocd"
  server: "https://<target-eks-control-plane-endpoint>.gr7.eu-central-1.eks.amazonaws.com"
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "<target-cluster-name>"
      },
      "tlsClientConfig": {
        "caData": "<CA-data-of-the-target-cluster>" }        
    }

and checked the IAM role (trust relationship below; no policies attached):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::<AWS-account-number>:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/<argocd-cluster-oidc-id>"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.eu-west-1.amazonaws.com/id/<argocd-cluster-oidc-id>:sub": [
                        "system:serviceaccount:argocd:argocd-server",
                        "system:serviceaccount:argocd:argocd-application-controller"
                    ],
                    "oidc.eks.eu-west-1.amazonaws.com/id/<argocd-cluster-oidc-id>:aud": "sts.amazonaws.com"
                }
            }
        }
    ]
}

I also annotated the service accounts and updated the aws-auth ConfigMap to point to the role above... same error.
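
For reference, the IRSA annotation I mean on both the argocd-server and argocd-application-controller service accounts is roughly this (the role ARN is a placeholder for the actual deploy role):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: argocd-application-controller
  namespace: argocd
  annotations:
    # IRSA: tells EKS which IAM role this service account should assume
    eks.amazonaws.com/role-arn: arn:aws:iam::<AWS-account-number>:role/<argocd-deploy-role>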

Could it be a bug in the version of argocd I'm using, or perhaps something to do with the EKS version?

mmerickel commented 1 year ago

First, confirm the cluster is showing up in argocd's UI, probably in a disconnected state; then at least you know the secret is being read correctly. Then look at the argocd-server logs and see if you can find some useful error messages. You might be able to click invalidate cache on the cluster to force it to reconnect and see what it logs while it tries to do that.

blakepettersson commented 1 year ago

As an addition to what @mmerickel said, I'd also take a look in CloudTrail and check for anything weird going on with IAM there (e.g. if the role cannot be assumed for whatever reason).

blakepettersson commented 1 year ago

Closing this issue for now since this is something which has been working for some time (both cross-account and cross-region), and with #14187 there's documentation on how to configure it.

sidewinder12s commented 1 year ago

I've been using my comments here plus the linked comment/instructions to manage cross-account, cross-region clusters without issues for a while now. All of the configs look like this:

apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: test-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
stringData:
  name: "cluster name shown in argocd"
  server: "https://eks-control-plane-endpoint"
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "<actual-name-of-eks-cluster>"
      },
      "tlsClientConfig": {
        "caData": "the base64 ca data string"
      }
    }

This is all you need assuming:

  • You set up an IAM role in the AWS account where argocd lives.
  • You granted the argocd server and controller IRSA access to that role.
  • You granted that role system:masters access in the target cluster you're trying to let argocd control (for example via an aws-auth mapRoles entry like the sketch below).
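
A minimal aws-auth sketch for that last point (role ARN and username are placeholders, adjust for your accounts):

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    # grant the argocd IRSA role (from the argocd account) admin access in this target cluster
    - rolearn: arn:aws:iam::<argocd-account-number>:role/<argocd-irsa-role>
      username: argocd
      groups:
        - system:masters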

Just checking, @mmerickel: you're saying you are not using cross-account roles in each EKS cluster account, and instead just allowing your main ArgoCD IAM role in each cluster's aws-auth ConfigMap so that ArgoCD can directly request credentials for the cluster? Reading through the cross-account setup instructions that have landed, I had started questioning why we need the cross-account role when the aws-auth ConfigMap just allows a given IAM role access with seemingly no regard to what account it is from.

mmerickel commented 1 year ago

That's correct, it's how I'm doing it right now. In my example the IRSA role given to ArgoCD's server is the same one that the cross-account clusters are granting access to, such that there are no extra assume-role hops required. So the proposal is:

argocd -> irsa role A -> remote cluster B
                      -> remote cluster C

There are some theoretical advantages to having multiple IAM roles but I didn't think it was worthwhile for me. If you do want an extra IAM role in the middle then you'll need to specify that role in the awsAuthConfig, and grant the IRSA role the ability to assume that target role (see the snippet after the diagram). Then you're doing something like below:

argocd -> irsa role A -> assume role B -> remote cluster B
                      -> assume role C -> remote cluster C
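
If you go that route, the cluster secret's awsAuthConfig would also carry the role to assume, roughly like this sketch (reusing the secret format from earlier in the thread; names and ARNs are placeholders):

stringData:
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "<target-cluster-name>",
        "roleARN": "arn:aws:iam::<target-account-number>:role/<assume-role-B>"
      },
      "tlsClientConfig": {
        "caData": "<CA-data-of-the-target-cluster>"
      }
    }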

I'll leave you to decide which makes more sense for your org.

akefirad commented 9 months ago

I managed to glue everything together, mainly based on what is shared here, except for adding the argocd-manager role in the main cluster (or argocd-deployer in the child cluster, if you want to use a second role). I couldn't make it work by manually adding the role to the child cluster's aws-auth ConfigMap (to put it in the system:masters group); not sure what the issue was, maybe the username? (Setting it to the role ARN didn't help. In my case the role had a path, so that might have messed something up 🤷.) Instead I used the new EKS feature, access entries: just add the argocd-manager role as a cluster admin to the child cluster (it doesn't matter that the role lives in the main cluster's account, i.e. it works with cross-account roles). Update: it was indeed the path messing it up: https://github.com/kubernetes-sigs/aws-iam-authenticator/issues/268

markandrus commented 8 months ago

@akefirad would you mind sharing how you set up the IAM Access Entry? Does the username matter?

akefirad commented 8 months ago

@markandrus which one, IAM Access entry or IRSA entry? The former doesn't need a username. This is our CDKTF code:

      const argocdEntry = new EksAccessEntry(this, "cluster-argocd-access-entry", {
        clusterName: clusterName,
        principalArn: argocdRoleArn, // including the path.
        // kubernetesGroups: ["admin"], // What is this? Why isn't it working?
      });

      new EksAccessPolicyAssociation(this, "cluster-argocd-access-policy-association", {
        clusterName: clusterName,
        policyArn: "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy",
        principalArn: argocdRoleArn,
        accessScope: { type: "cluster" },
        dependsOn: [argocdEntry],
      });

(I finally managed to make it work with IRSA too, but in my case the issue was that our role has a path, and when setting the role in the aws-auth ConfigMap you should drop the path. See the bug reported above.)
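
In aws-auth terms, the entry has to use the role ARN with the path dropped, something like this sketch (ARNs and username are placeholders):

mapRoles: |
  # the actual role ARN has a path, e.g. arn:aws:iam::<account>:role/platform/argocd-manager,
  # but the aws-auth entry must omit the path segment:
  - rolearn: arn:aws:iam::<account>:role/argocd-manager
    username: argocd
    groups:
      - system:masters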

Does that help?

fideloper commented 2 weeks ago

Can we accomplish this without using the aws-auth ConfigMap nowadays? I don't have a clear path to automating the update of that ConfigMap (I'm sure you can, but that seems difficult), and the EKS docs claim it's deprecated in favor of creating Access Entries.

However, when using Access Entries and skipping aws-auth, I get the error "the server has asked for the client to provide credentials". (Actually I get that error even if I update aws-auth as well!)

akefirad commented 2 weeks ago

@fideloper I'm not sure; the issue could be some restriction with access entries. I remember there were some restrictions around the feature.

fideloper commented 2 weeks ago

I got it working without aws-auth 🎉 (my issue happened to be using the wrong cluster name, which caused argocd-k8s-auth to generate a bad token).

Using Access Entries worked great, no aws-auth required.

Within Terraform, setting up an access entry per cluster looked like this:

(where this is done on each "spoke" cluster, which is an EKS cluster that ArgoCD will manage):

# Create an access entry on a "spoke" EKS cluster so that ArgoCD ("hub" cluster)'s assumed role
# has RBAC permissions to administer the spoke cluster

resource "aws_eks_access_entry" "argocd_rbac" {
  cluster_name      = "spoke-cluster-name"
  principal_arn     = "arn-of-role-being-assumed-by-argocd"
  kubernetes_groups = []
  type              = "STANDARD"
}

resource "aws_eks_access_policy_association" "argocd_rbac" {
  access_scope {
    type = "cluster"
  }

  cluster_name = "spoke-cluster-name"

  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
  principal_arn = "arn-of-role-being-assumed-by-argocd"

  depends_on = [
    aws_eks_access_entry.argocd_rbac
  ]
}