kubernetes-sigs / external-dns

Configure external DNS servers (AWS Route53, Google CloudDNS and others) for Kubernetes Ingresses and Services
Apache License 2.0

TXT records created for aliases in AWS Route 53 have wrong record type prefix #2903

Open seh opened 2 years ago

seh commented 2 years ago

What happened:

When using the "aws" provider to create DNS records for hostnames that point at AWS ELBs (such as for endpoints extracted from a Kubernetes Service or Ingress), the target hostnames don't parse as IP addresses, so ExternalDNS decides that the endpoints warrant a record of type CNAME. Because the target hostname discovered from the Ingress's status sits within a canonical hosted zone, ExternalDNS decides that the record should be an alias to the target ELB's DNS record. Later, when composing the changes to send to the Route 53 service, ExternalDNS changes its mind and decides to use an A record instead. At that point, ExternalDNS leaves the endpoint.Endpoint's "RecordType" field set to the original endpoint.RecordTypeCNAME ("CNAME").

That sets us up to create an A record for an endpoint.Endpoint that still represents a CNAME record. ExternalDNS then goes on to add the TXT ownership records to the change batch, and consults the endpoint.Endpoint's "RecordType" field, finding it to be "CNAME." This leads to a TXT record prefix of "cname-" even though it should probably be "a-" instead, if the goal is to have the TXT records indicate which of several primary records they describe.
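A minimal Python sketch of the mismatch described above, not ExternalDNS's actual Go code. The `Endpoint` class, the `alias` flag, and both function names are hypothetical stand-ins: one mimics the provider's late switch to an A record, the other mimics the TXT registry deriving its prefix from the unchanged record type.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    dns_name: str
    target: str
    record_type: str  # what the planner and the TXT registry see
    provider_specific: dict = field(default_factory=dict)


def route53_record_type(ep):
    """Hypothetical provider step: an alias CNAME endpoint is written as an A record."""
    if ep.record_type == "CNAME" and ep.provider_specific.get("alias") == "true":
        return "A"
    return ep.record_type


def txt_ownership_name(ep):
    """Hypothetical registry step: the prefix comes from the endpoint's own record type."""
    return f"{ep.record_type.lower()}-{ep.dns_name}"


ep = Endpoint(
    dns_name="app.example.com",
    target="my-elb.us-east-1.elb.amazonaws.com",
    record_type="CNAME",
    provider_specific={"alias": "true"},
)

print(route53_record_type(ep))  # A
print(txt_ownership_name(ep))   # cname-app.example.com
```

The two functions never consult each other, which is why the primary record and its ownership record can disagree about the type.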

What you expected to happen:

ExternalDNS will create a TXT record with a prefix indicating the same primary record type that the TXT record describes. In this case, since the primary record type created in Route 53 turns out to be A, I expect the TXT record's prefix to be "a-" instead of "cname-."

How to reproduce it (as minimally and precisely as possible):

In a Kubernetes cluster running within AWS EC2, create a Service of type "LoadBalancer," and allow ExternalDNS to discover the endpoint and its target by using either the "service" or "ingress" source.

Inspect the Route 53 service to see that ExternalDNS creates a primary record of type A, as an alias to the target AWS-hosted load balancer. Note too that ExternalDNS creates a TXT record with a prefix of "cname-" instead of "a-."

Anything else we need to know?:

In order to align the record type mentioned by these primary and TXT records, we need to make the TXT registry portion of ExternalDNS aware of the late decision that the AWS provider makes to use an A record instead. I am not sure whether other providers make similar overriding decisions when composing changes.

doctornkz commented 1 year ago

I'm also facing this issue. Thank you, @seh, for the report.

chonton commented 1 year ago

What I see is two guard records being produced: one with the same name as the 'A' record and one with the 'cname-' prefix.

seh commented 1 year ago

That's odd. Does your "A" record's name happen to begin with "a-," inducing false aliasing?
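To illustrate the kind of false aliasing this question is about, here is a small Python sketch. It assumes ownership records come in two shapes, one named exactly like the primary record and one prefixed with the record type; the prefix list and the function name are hypothetical and this is not ExternalDNS's actual parsing logic.

```python
# Assumed subset of record-type prefixes, for illustration only.
RECORD_TYPE_PREFIXES = ("a-", "cname-")


def candidate_primary_records(txt_name):
    """Return every (name, type) pair that a TXT ownership record name could describe."""
    # Same-name form: the record type is not encoded in the name.
    candidates = [(txt_name, None)]
    for prefix in RECORD_TYPE_PREFIXES:
        if txt_name.startswith(prefix):
            primary = txt_name[len(prefix):]
            candidates.append((primary, prefix.rstrip("-").upper()))
    return candidates


# A hostname that genuinely starts with "a-" has two plausible readings.
print(candidate_primary_records("a-team.example.com"))
# [('a-team.example.com', None), ('team.example.com', 'A')]
```

A name without such a prefix yields a single candidate, so the ambiguity only arises for hostnames that happen to start with a record-type prefix.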

chonton commented 1 year ago

Version v0.12.2

Args

      containers:
      - args:
        - --log-level=info
        - --namespace=mis-feature
        - --publish-host-ip
        - --aws-batch-change-size=20
        - --domain-filter=mis.example.com
        - --interval=2m
        - --policy=upsert-only
        - --provider=aws
        - --source=ingress
        - --source=service
        - --registry=txt
        - --txt-owner-id=use-feature

Redacted Kubernetes Resources

---
apiVersion: v1
kind: Service
metadata:
  name: unified-theatre
  annotations:
    external-dns.alpha.kubernetes.io/alias: "true"
    external-dns.alpha.kubernetes.io/hostname: us.example.com
    external-dns.alpha.kubernetes.io/ingress-hostname-source: annotation-only
    external-dns.alpha.kubernetes.io/aws-weight: "255"
    external-dns.alpha.kubernetes.io/set-identifier: us-east-1
spec:
  type: ExternalName
  externalName: use.example.com
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: unified-region
  annotations:
    external-dns.alpha.kubernetes.io/alias: "true"
    external-dns.alpha.kubernetes.io/hostname: use.example.com
    external-dns.alpha.kubernetes.io/ingress-hostname-source: annotation-only

Redacted Route53 records

| Record name | Type | Policy | Weight | Value/Route traffic to |
|---|---|---|---|---|
| cname-us-feature.mis.example.com | TXT | Weighted | 255 | "heritage=external-dns,external-dns/owner=use-feature,external-dns/resource=service/mis-feature/unified-theatre" |
| cname-use-feature.mis.example.com | TXT | Simple | - | "heritage=external-dns,external-dns/owner=use-feature,external-dns/resource=ingress/mis-feature/unified-region" |
| use.feature.mis.example.com | A | Simple | - | 10.93.177.118 |
| us-feature.mis.example.com | A | Weighted | 255 | use-feature.mis.example.com. |
| us-feature.mis.example.com | TXT | Weighted | 255 | "heritage=external-dns,external-dns/owner=use-feature,external-dns/resource=service/mis-feature/unified-theatre" |
| use-feature.mis.example.com | A | Simple | - | internal-k8s-misfeatu-unifiedr-201835383a-1018808261.us-east-1.elb.amazonaws.com. |
| use-feature.mis.example.com | TXT | Simple | - | "heritage=external-dns,external-dns/owner=use-feature,external-dns/resource=ingress/mis-feature/unified-region" |

jwilf commented 1 year ago

I'm also seeing this, and an additional problem is that when the k8s resource is deleted, the TXT record with prefix "cname-" is not deleted from route53. We have a zone with a large churn of resources and this resulted in reaching the limit on the number of records in the zone.

nicocout commented 1 year ago

I have the same problem. TXT records with prefix "cname-" are not deleted and cause an issue when I try to recreate k8s resources.

erikdeweerdt commented 1 year ago

We're seeing similar, but subtly different behavior: external-dns tries to delete cname- prefixed TXT records that were never created, failing the entire change batch and preventing all future updates until we manually intervene (by creating the record it wants to delete).
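The failure mode in this comment can be sketched in Python under one assumption: a change batch is applied all-or-nothing, so a single DELETE for a record that does not exist rejects every other change in the same batch. The `apply_change_batch` function and `BatchRejected` exception are hypothetical and only simulate that behavior; they are not the Route 53 API.

```python
class BatchRejected(Exception):
    """Raised when any single change in a batch cannot be applied."""


def apply_change_batch(zone, changes):
    """Return the new zone contents, or raise without applying anything.

    zone is a set of (name, type) pairs; changes is a list of
    (action, name, type) tuples where action is "CREATE" or "DELETE".
    """
    staged = set(zone)  # work on a copy so a failure leaves zone untouched
    for action, name, rtype in changes:
        key = (name, rtype)
        if action == "DELETE":
            if key not in staged:
                raise BatchRejected(f"cannot delete missing record {key}")
            staged.remove(key)
        elif action == "CREATE":
            if key in staged:
                raise BatchRejected(f"record {key} already exists")
            staged.add(key)
        else:
            raise ValueError(f"unknown action {action!r}")
    return staged


zone = {("app.example.com", "A")}
batch = [
    ("CREATE", "new.example.com", "A"),
    ("DELETE", "cname-app.example.com", "TXT"),  # this record was never created
]

try:
    zone = apply_change_batch(zone, batch)
except BatchRejected as err:
    print("batch rejected:", err)

# The valid CREATE in the same batch was not applied either.
print(("new.example.com", "A") in zone)  # False

# Manual intervention as described above: create the record the batch wants to delete.
zone.add(("cname-app.example.com", "TXT"))
zone = apply_change_batch(zone, batch)
print(sorted(zone))  # [('app.example.com', 'A'), ('new.example.com', 'A')]
```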

dalvarezquiroga commented 1 year ago

+1 with the same problem in AWS. External DNS created a lot of entries in Route53 that start with CNAME-{{name}}.local TXT

Gladdstone commented 1 year ago

Having recently come across this issue, I think part of the problem with the erroneous cname- prefixed TXT records lies in how the plan struct is constructed and then passed to the registry and onward to the cloud provider. The plan consists of create/update/delete arrays, so the records have no association with each other as far as the registry or the provider is aware. The changes to the endpoints are read and executed in order, so AWS (or another supporting cloud provider) correctly treats a record identified as CNAME as an alias, but still creates the cname- TXT record that the TXTRegistry generated in the prior stage of execution. The registry is aware of the provider, because it has to call the provider's ApplyChanges function as part of its own. Barring a total overhaul of how aliases are handled, I wonder whether it would be possible to call a function from the registry level down to the provider to check for an alias, e.g. the AWSProvider's useAlias function?
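One way to picture the suggestion above is the following Python sketch, in which the registry asks the provider whether an endpoint will become an alias before choosing the TXT prefix. Everything here is hypothetical: the `uses_alias` method, the `alias` key, and both provider classes are illustrative assumptions, not ExternalDNS's real interfaces.

```python
class PlainProvider:
    """Hypothetical provider that writes records with the type it is given."""


class AliasingProvider:
    """Hypothetical provider that writes alias CNAME endpoints as A records."""

    def uses_alias(self, record_type, provider_specific):
        return record_type == "CNAME" and provider_specific.get("alias") == "true"


def ownership_prefix(record_type, provider_specific, provider):
    """Choose the TXT prefix after asking the provider about aliases."""
    # Providers without the optional check keep today's behavior.
    check = getattr(provider, "uses_alias", None)
    if check is not None and check(record_type, provider_specific):
        record_type = "A"
    return record_type.lower() + "-"


print(ownership_prefix("CNAME", {"alias": "true"}, AliasingProvider()))  # a-
print(ownership_prefix("CNAME", {"alias": "true"}, PlainProvider()))     # cname-
print(ownership_prefix("CNAME", {}, AliasingProvider()))                 # cname-
```

Making the check optional keeps providers that never rewrite record types unaffected, though it does not address ownership records that already exist with the cname- prefix.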

vitali-federau-fivestars commented 1 year ago

+1, experiencing the same issue: when creating A (alias) records, the TXT record uses an incorrect prefix (cname instead of a).

jbilliau-rcd commented 1 year ago

Same, highly annoying. I'm having to delete Route53 records on a daily basis for dozens of clusters in order for the controller to properly create all the relevant records and go healthy with "all records are up to date".

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

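This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage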

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

aaroniscode commented 1 year ago

/remove-lifecycle stale

johngmyers commented 1 year ago

External-dns represents ALIAS records of type A to the planner as Endpoints of type CNAME and a ProviderSpecific attribute with key alias and value true. So it is an expected quirk that the new-format txt registry ownership records have a prefix of cname-. As the installed base has such ownership records, this would take an unreasonable amount of effort to change.

Problems with deletion would be separate bugs.

stefkkkk commented 7 months ago

So, will there be any fix for this behaviour? I need to pin the tag version (0.11.1-debian-10-r27) because of this.

k8s-triage-robot commented 4 months ago

/lifecycle stale

k8s-triage-robot commented 3 months ago

/lifecycle rotten

jcogilvie commented 3 months ago

/remove-lifecycle rotten

stefkkkk commented 3 months ago

any updates?!

k8s-triage-robot commented 3 weeks ago

/lifecycle stale

stefkkkk commented 3 weeks ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

/remove-lifecycle stale