Closed: jeremyhermann closed this issue 1 year ago.
@mmerickel thanks for your help, I have changed the setup to what you suggested. Secret:
apiVersion: v1
kind: Secret
metadata:
  name: secret-name
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: "cluster name shown in argocd"
  server: "https://<target-eks-control-plane-endpoint>.gr7.eu-central-1.eks.amazonaws.com"
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "<target-cluster-name>"
      },
      "tlsClientConfig": {
        "caData": "<CA-data-of-the-target-cluster>"
      }
    }
and checked the IAM role (trust relationship, no policies attached):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::<AWS-account-number>:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/<argocd-cluster-oidc-id>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.eu-west-1.amazonaws.com/id/<argocd-cluster-oidc-id>:sub": [
            "system:serviceaccount:argocd:argocd-server",
            "system:serviceaccount:argocd:argocd-application-controller"
          ],
          "oidc.eks.eu-west-1.amazonaws.com/id/<argocd-cluster-oidc-id>:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
I annotated the service accounts and updated the aws-auth configmap in the target cluster to point to the role above... same error.
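(For reference, the service account annotation I mean looks roughly like this, with the account number and role name as placeholders; the same annotation goes on both the argocd-server and argocd-application-controller service accounts:)

apiVersion: v1
kind: ServiceAccount
metadata:
  name: argocd-application-controller
  namespace: argocd
  annotations:
    # IRSA: bind the service account to the IAM role that has the trust policy above
    eks.amazonaws.com/role-arn: "arn:aws:iam::<AWS-account-number>:role/<argocd-irsa-role>"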
Could it be a bug in the version of ArgoCD I'm using? Or perhaps something to do with the EKS version?
First, confirm the cluster is showing up in argocd's UI, probably in a disconnected state; then at least you know the secret is being read correctly. Then look at the argocd-server logs and see if you can find some useful error messages. I think you might be able to click invalidate cache on the cluster to force it to reconnect and see what it logs while it tries to do that.
In addition to what @mmerickel said, I'd also take a look in CloudTrail and check for anything weird going on with IAM there (e.g. if the role cannot be assumed for whatever reason).
Closing this issue for now since this is something which has been working for some time (both cross-account and cross-region), and with #14187 there's documentation on how to configure it.
I've been using my comments here plus the linked comment/instructions to manage cross-account cross-region clusters without issues for a while now. All of the configs look like:
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: test-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
stringData:
  name: "cluster name shown in argocd"
  server: "https://eks-control-plane-endpoint"
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "<actual-name-of-eks-cluster>"
      },
      "tlsClientConfig": {
        "caData": "the base64 ca data string"
      }
    }
This is all you need assuming:
- You set up an IAM role in the AWS account where argocd lives.
- You granted the argocd server and controller IRSA access to that role.
- You granted that role system:masters access in the target cluster you're trying to let argocd control (for example via an aws-auth mapRoles entry like the sketch below).
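A minimal aws-auth sketch for that last point, assuming you grant access via the aws-auth configmap in the target cluster (account number and role name are placeholders; as noted later in this thread, the rolearn must not include an IAM path):

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    # the IRSA role used by argocd-server / argocd-application-controller
    - rolearn: arn:aws:iam::<argocd-account-number>:role/<argocd-irsa-role>
      username: argocd
      groups:
        - system:masters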
Just checking, @mmerickel: you're saying you are not using cross-account roles in each EKS cluster account, and instead just allowing your main ArgoCD IAM role in each cluster's aws-auth configmap so that ArgoCD can directly request creds to the cluster? Reading through the cross-account setup instructions that have landed, I had started questioning why we need the cross-account role when the aws-auth configmap just allows X IAM role access with seemingly no regard to which account it is from.
That's correct, it's how I'm doing it right now. So in my example the IRSA role given to ArgoCD's server is the same one that the cross-account clusters are granting access to, such that there are no extra assume-role hops required. So the proposal is:
argocd -> irsa role A -> remote cluster B
                      -> remote cluster C
There are some theoretical advantages to making multiple IAM roles, but I didn't think it was worthwhile for me. If you do want an extra IAM role in the middle then you'll need to specify that role in the awsAuthConfig, and grant the IRSA role the ability to assume that target role. Then you're doing something like below:
argocd -> irsa role A -> assume role B -> remote cluster B
                      -> assume role C -> remote cluster C
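In that case the cluster secret's config block also carries the intermediate role via the roleARN field of awsAuthConfig; a rough sketch with placeholder names and ARNs:

config: |
  {
    "awsAuthConfig": {
      "clusterName": "<actual-name-of-eks-cluster>",
      "roleARN": "arn:aws:iam::<target-account-number>:role/<role-B>"
    },
    "tlsClientConfig": {
      "caData": "the base64 ca data string"
    }
  }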
I'll leave you to decide which makes more sense for your org.
I managed to glue everything together mainly based on what is shared here, except for adding the argocd-manager role in the main cluster (or argocd-deployer in the child cluster, if you want to use a second role). I couldn't make it work by manually adding the role to aws-auth in the child cluster (to add it to the system:masters group); not sure what the issue was, maybe the username? (Setting it to the role ARN didn't help. In my case the role had a path, so that might have messed something up 🤷)
Instead I used the new EKS feature, access entries. Just add the argocd-manager role as a cluster admin to the child cluster (it doesn't matter that the role lives in the main cluster's account; i.e. it works with cross-account roles).
Update: It was indeed the path messing it up https://github.com/kubernetes-sigs/aws-iam-authenticator/issues/268
@akefirad would you mind sharing how you set up the IAM Access Entry? Does the username matter?
@markandrus which one, IAM Access entry or IRSA entry? The former doesn't need a username. This is our CDKTF code:
const argocdEntry = new EksAccessEntry(this, "cluster-argocd-access-entry", {
  clusterName: clusterName,
  principalArn: argocdRoleArn, // including the path
  // kubernetesGroups: ["admin"], // What is this? Why isn't it working?
});

new EksAccessPolicyAssociation(this, "cluster-argocd-access-policy-association", {
  clusterName: clusterName,
  policyArn: "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy",
  principalArn: argocdRoleArn,
  accessScope: { type: "cluster" },
  dependsOn: [argocdEntry],
});
(I finally managed to make it work with IRSA too, but in my case the issue was that our role has a path, and when setting the role in the aws-auth configmap, you should drop the path. See the bug reported above.)
Does that help?
Can we accomplish this without using the aws-auth ConfigMap nowadays? I don't have a clear path to automating updates to that ConfigMap (I'm sure you can!? but that seems difficult), and the EKS docs claim it's deprecated in favor of creating Access Entries.
However, when using Access Entries and skipping aws-auth, I get the error "the server has asked for the client to provide credentials". (Actually I get that error even if I update aws-auth as well!)
@fideloper I'm not sure; the issue could be some restriction with access entries. I remember there were some restrictions around the feature.
I got it working without aws-auth 🎉 (my issue happened to be using the wrong cluster name, which caused argocd-k8s-auth to generate a bad token). Using Access Entries worked great, no aws-auth required.
Within Terraform, setting up the access entries per cluster looked like this (this is done on each "spoke" cluster, i.e. an EKS cluster that ArgoCD will manage):
# Create an access entry on a "spoke" EKS cluster so that ArgoCD ("hub" cluster)'s assumed role
# has RBAC permissions to administrate the spoke cluster
resource "aws_eks_access_entry" "argocd_rbac" {
  cluster_name      = "spoke-cluster-name"
  principal_arn     = "arn-of-role-being-assumed-by-argocd"
  kubernetes_groups = []
  type              = "STANDARD"
}

resource "aws_eks_access_policy_association" "argocd_rbac" {
  access_scope {
    type = "cluster"
  }
  cluster_name  = "spoke-cluster-name"
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
  principal_arn = "arn-of-role-being-assumed-by-argocd"

  depends_on = [
    aws_eks_access_entry.argocd_rbac
  ]
}
I am trying to use the --aws-role-arn option when adding an EKS cluster to ArgoCD, as described in https://github.com/argoproj/argo-cd/issues/1304. I have not been able to get it to work; the error messages are difficult to interpret and I am not sure how to debug. I have ArgoCD running in one AWS account and my EKS cluster is in another AWS account.
I have set up the acme-production-deploy-role so that it can be assumed both by the AWS role that I am using to run argocd cluster add ... and by the EC2 instances in my ArgoCD cluster (I am confused about which IAM identity is used to assume the role, so I tried to allow both to work). Here is what I see when I try to add the cluster (I have redacted the AWS account numbers and the EKS id, but confirmed that I used the correct values for these):
Note that I am able to successfully add the cluster using argocd cluster add https://<eks-cluster-id>.yl4.us-west-2.eks.amazonaws.com
Thanks