Open volter1337 opened 5 years ago
We're always open for PRs 🙂. Usually we use ALIAS because of ALBs, which is really convenient. So it's not allowed even if you would use an ALB?
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
I have a question regarding this. We have a K8s cluster set up in GovCloud, and obviously you can't use GovCloud Route53 for external routing, so we have Route53 set up in our commercial account to point back to the K8s cluster. Will ExternalDNS work with this setup, or is this what we are lobbying to get implemented?
Thanks in advance for the help.
/remove-lifecycle stale /kind feature
I don't think this would be possible because you cannot delegate access between GovCloud and standard AWS accounts (see restrictions here: https://docs.aws.amazon.com/govcloud-us/latest/UserGuide/govcloud-iam.html) - at least not the way external-dns is currently implemented...
The role you provide here: https://github.com/kubernetes-sigs/external-dns/blob/84e6002297de10485456e0fa400379d3b2a972f7/provider/aws/aws.go#L176 wouldn't be able to programmatically access standard Route53 resources. Put another way, you can't specify standard AWS ARNs in GovCloud policies and vice versa.
Upon further research, another way to do this is to provide an access key and secret directly to external-dns, and use https://github.com/kubernetes-sigs/external-dns/blob/84e6002297de10485456e0fa400379d3b2a972f7/provider/aws/aws.go#L135 to not create ALIAS records.
Here's my workaround(s) to get this working.
On my GovCloud EKS cluster, I need it to talk to an R53 zone (an internal one, of course) within GovCloud. I have to add an env var of AWS_REGION=us-gov-west-1 (otherwise it won't connect to the R53 endpoint), and also add the args --aws-prefer-cname (otherwise it tries to create aliases, which GovCloud R53 does not support) and --txt-prefix=prefix- (otherwise it tries to add a TXT record with the same name as the CNAME, which will also fail). Otherwise, IRSA works fine.
It also needs to talk to an external DNS zone in a commercial account, so I ran a second deployment of the app and passed AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from a secret. It's the only possible way...
The only quirk I see is that my 2nd deployment of the app (the one that has the key and secret for the commercial account) still seems to "want" to see and modify configs for my internal zone, even when I have --domain-filter specified. It doesn't error out, though, and both deployments work fine doing their own thing.
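For readers who want to reproduce this, the two deployments described above might look roughly like the container fragments below. This is only a sketch under assumptions: the image placeholder, the Secret name `commercial-route53-credentials`, its key names, and both domain names are hypothetical. The environment variables and the `--aws-prefer-cname`, `--txt-prefix` and `--domain-filter` flags are the ones named in this comment; `--source` and `--provider` are the usual external-dns flags.

```yaml
# Deployment 1: internal GovCloud zone, credentials come from IRSA.
# Only the container section of the pod template is shown.
containers:
  - name: external-dns
    image: "<< external-dns image >>"
    env:
      - name: AWS_REGION
        value: us-gov-west-1                    # needed to reach the GovCloud Route53 endpoint
    args:
      - --source=service
      - --provider=aws
      - --domain-filter=internal.example.gov    # hypothetical internal zone
      - --aws-prefer-cname                      # create CNAMEs instead of ALIAS records
      - --txt-prefix=prefix-                    # keep TXT records off the CNAME's name
---
# Deployment 2: zone in the commercial account, static credentials from a Secret.
containers:
  - name: external-dns
    image: "<< external-dns image >>"
    env:
      - name: AWS_ACCESS_KEY_ID
        valueFrom:
          secretKeyRef:
            name: commercial-route53-credentials   # hypothetical Secret name
            key: access-key-id                     # hypothetical key
      - name: AWS_SECRET_ACCESS_KEY
        valueFrom:
          secretKeyRef:
            name: commercial-route53-credentials
            key: secret-access-key                 # hypothetical key
    args:
      - --source=service
      - --provider=aws
      - --domain-filter=example.com             # hypothetical public zone
```

Keeping the two zones in separate deployments means each process only ever holds credentials for one AWS partition.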
+1
I was able to get external-dns to assume a role in the commercial account using OIDC AssumeRoleWithWebIdentity. I think I'm hitting a bug though. The external-DNS service fails to retrieve zones because it is trying to assume the role it already has and fails to do that. External-DNS logs show the following:
time="2021-08-10T14:02:29Z" level=info msg="Instantiating new Kubernetes client"
time="2021-08-10T14:02:29Z" level=debug msg="apiServerURL: "
time="2021-08-10T14:02:29Z" level=debug msg="kubeConfig: "
time="2021-08-10T14:02:29Z" level=info msg="Using inCluster-config based on serviceaccount-token"
time="2021-08-10T14:02:29Z" level=info msg="Created Kubernetes client https://172.20.0.1:443"
time="2021-08-10T14:02:31Z" level=info msg="Assuming role: arn:aws:iam::111111111111:role/build-test-us-east-1-external-dns"
time="2021-08-10T14:02:36Z" level=debug msg="Refreshing zones list cache"
time="2021-08-10T14:02:37Z" level=error msg="records retrieval failed: failed to list hosted zones: AccessDenied: User: arn:aws:sts::111111111111:assumed-role/build-test-us-east-1-external-dns/1628604156229058731 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::111111111111:role/build-test-us-east-1-external-dns\n\tstatus code: 403, request id: 6e79719a-0a10-4f67-80e2-284cc6561717"
I've looked at the code a bit (not a Go programmer, yet) and I think there may be an issue with the aws-sdk-go API, or possibly with external-dns/provider/aws.go, in the way the assume-role functionality is handled.
I think I may have to try another approach in the short term.
UPDATE: This now appears to be working. I added the AssumeRole permission to the assumed role so that the role could assume itself, and that is now working. I hope to post a few more details when I get a chance.
Here are a few notes on getting this to work with an EKS cluster in GovCloud and DNS in a commercial account using OIDC.
IN TARGET ACCOUNT
Configure the OIDC provider, using the URL from the source OIDC issuer:
resource "aws_iam_openid_connect_provider" "commercial" {
  provider        = aws.commercial
  url             = module.eks.cluster_oidc_issuer_url
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.cluster.certificates.0.sha1_fingerprint]
  tags = merge(local.tags, {
    Name   = "${var.application}-${var.environment}-${var.comm_region}-external-dns",
    Region = var.comm_region
  })
}
Create the assumed role:
The assume role policy trusts the target account OIDC provider.
For the external-dns Kubernetes service, the role needs permission to do sts:AssumeRole on itself.
Add an additional policy for any functional permissions you need (e.g. DNS manipulation).
Use the following assume role policy
data "aws_iam_policy_document" "eks_external_dns_assumerole_policy" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.commercial.arn]
    }
    condition {
      test     = "StringEquals"
      variable = "${trimprefix(module.eks.cluster_oidc_issuer_url, "https://")}:sub"
      values   = ["system:serviceaccount:kube-system:external-dns"]
    }
    sid = "externalDNS"
  }
}
IN THE SOURCE (EKS) ACCOUNT
The entity doing the assumption needs permission to assume the role (replace the ARN with the ARN of the assumed role in the target account):
{
  "Sid": "AssumeExternalDNS",
  "Effect": "Allow",
  "Action": "sts:AssumeRole",
  "Resource": "*",
  "Condition": {
    "StringEquals": {
      "iam:AssociatedResourceArn": "arn:aws:iam::111111111111:role/build-test-us-east-1-external-dns"
    }
  }
}
@FixItDad many thanks for the configuration notes; so far your post is the only evidence on the Internet that this solution works for GovCloud. IMHO, this should be part of the official documentation or an article.
However, reproducing the above configuration didn't resolve the issue. I keep getting one of the following errors:
time="2021-09-20T07:43:38Z" level=error msg="records retrieval failed: failed to list hosted zones: AccessDenied: User: arn:aws-us-gov:sts::GOV_ACC_ID:assumed-role/eks-role-for-external-dns/ZZZ is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::COMMERCIAL_ACC_ID:role/role-for-route53\n\tstatus code: 403, request id: XXX"
or
time="2021-09-20T11:32:48Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.us-gov-west-1.amazonaws.com/id/ID_HERE\n\tstatus code: 400, request id: XXX"
If possible, could you please shed more light on the details of your configuration, e.g. did you use --aws-assume-role on the external-dns side (and which other flags), what's the value of the eks.amazonaws.com/role-arn annotation, does Cognito need to be set up additionally, and what does your successful log output look like? I feel I'm missing a small nuance, but can't see where.
UPDATE: I was able to make it work after being stuck on the "No OpenIDConnect provider found in your account for..." error. It turned out that, contrary to the official documentation, which says to set AWS_REGION to 'us-gov-west-1' in the external-dns deployment's env, it should be us-east-1 instead, so that the global STS endpoint is invoked rather than the regional one (with which I get the above error). This looks like either an AWS or an external-dns bug to me. For my configuration, setting AWS_REGION to us-east-1 seems to be OK for now.
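In manifest terms, the change described in this update is a single environment variable on the external-dns container. A minimal sketch, assuming the container is named `external-dns`; whether `us-east-1` is the right value for other setups is not confirmed in this thread:

```yaml
# Container section of the external-dns pod template (fragment only).
containers:
  - name: external-dns          # assumed container name
    env:
      - name: AWS_REGION
        value: us-east-1        # instead of us-gov-west-1, per the update above
```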
@FixItDad I was wondering if you possibly have a full example/gist of this (e.g. data.tls_certificate.cluster.certificates.0.sha1_fingerprint)? I'm trying to do something similar without GovCloud, just with 2 AWS accounts. I can use IRSA to assume a role in the current EKS account with the current cluster's OIDC, but I need to use Route53 in the "shared" account. Normally I would do this on the CLI by assuming a role after the IRSA has assumed one, but I can't figure out how to pass another assume role to external-dns, even though it looks like it has an AWSConfig.AssumeRole option.
I got this working following the "Create an identity provider from another account's cluster" method described in the AWS docs.
Our use case is we have a cluster in gov cloud and needed cross account IAM role access to allow the cluster's external-dns to create records in our public cloud account's Route53.
Here's our terraform for the public account where Route53 is being used:
resource "aws_iam_openid_connect_provider" "gov_eks_oidc" {
  url             = "<< OIDC issuer URL of govcloud EKS cluster >>"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["<< Thumbprint of EKS OIDC issuer >>"]
}

resource "aws_iam_role" "external_dns_assumed_role" {
  name = "external-dns-assumed-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Principal = {
          Federated = aws_iam_openid_connect_provider.gov_eks_oidc.arn
        }
        Action = "sts:AssumeRoleWithWebIdentity"
        Condition = {
          StringEquals = {
            "${trimprefix(aws_iam_openid_connect_provider.gov_eks_oidc.url, "https://")}:aud" = "sts.amazonaws.com"
            # This needs to match your external-dns SA in the gov cluster
            "${trimprefix(aws_iam_openid_connect_provider.gov_eks_oidc.url, "https://")}:sub" = "system:serviceaccount:external-dns:external-dns-controller"
          }
        }
      }
    ]
  })
}

resource "aws_iam_role_policy" "external_dns_policy" {
  role   = aws_iam_role.external_dns_assumed_role.id
  policy = file("policies/external-dns.json")
}
And in our gov cloud account, the Service Account for external-dns has the following annotations:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns-controller
  namespace: external-dns
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<< PUBLIC_ACCOUNT_ID >>:role/external-dns-assumed-role
    eks.amazonaws.com/sts-regional-endpoints: "true"
Note that we did not have to create any additional roles in the gov cloud account; our service account can assume the public role directly via the above trust relationship.
Any plans to support Route53 GovCloud? Endpoints: https://docs.aws.amazon.com/govcloud-us/latest/UserGuide/using-govcloud-endpoints.html
Aliases are not supported in Route53 GovCloud -- CNAMEs would have to be used instead.