kubernetes-sigs / external-dns

Configure external DNS servers (AWS Route53, Google CloudDNS and others) for Kubernetes Ingresses and Services
Apache License 2.0

On Linode, external-dns is creating CNAME alias and no TXT/A Records when target is Load Balancer #2499

Closed peedrr closed 2 years ago

peedrr commented 2 years ago

What happened:

time="2022-01-01T14:45:11Z" level=info msg="Creating record." action=Create record=juicy type=TXT zoneID=1770931 zoneName=mydomain.com

time="2022-01-01T14:45:11Z" level=error msg="Failed to Create record: [400] [name] Record conflict - CNAMES must be unique" action=Create record=juicy type=TXT zoneID=1770931 zoneName=mydomain.com


**What you expected to happen**:
A & TXT Records created, as per [Linode's tutorial video](https://youtu.be/wLHegOz_aR4?t=930) which uses the same commands (below)

**How to reproduce it (as minimally and precisely as possible)**:

(Assumes existing access to Linode, Linode API key and Bitnami helm repo, with _mydomain.com_ managed by Linode DNS)
1. Create a cluster on Linode Kubernetes Engine
2. Deploy nginx

helm upgrade --install nginx bitnami/nginx

3. Deploy external-dns

helm upgrade --install external-dns bitnami/external-dns \
--namespace external-dns --create-namespace \
--set provider=linode \
--set linode.apiToken=$LINODE_API_TOKEN

4. Annotate nginx service

kubectl annotate service nginx \
external-dns.alpha.kubernetes.io/hostname=nginx.mydomain.com


5. Check Linode web gui for changes to DNS and be sad

**Anything else we need to know?**:
Not sure if this is related, but I saw in the FAQ that [external-dns creates CNAME "when target looks like an ELB"](https://github.com/kubernetes-sigs/external-dns/blob/master/docs/faq.md#can-i-force-externaldns-to-create-cname-records-for-elbalb) on AWS. The FAQ then gives an AWS-specific solution, which obviously doesn't work on Linode. Maybe something similar is happening here?

**Environment**:
- External-DNS version (use `external-dns --version`): 0.10.2
- DNS provider: Linode
- Others: Linode Kubernetes Engine (LKE) plus Bitnami's Helm chart deployments
peedrr commented 2 years ago

After chatting with @jpetazzo (creator of the Linode tutorial mentioned above), it looks like the cause may be external-dns's logic of creating a CNAME when the target looks like an ELB. Setting the --txt-prefix flag, as per #262, works around the TXT record not being created.

I believe that the logic of detecting an ELB should be changed. If the LoadBalancer has an IP address and not a domain name, this logic/change from A Record to CNAME does not seem necessary. Furthermore, applying the --txt-prefix is prone to user error; if for some reason this prefix was not applied in the future, or applied incorrectly (e.g. the prefix changed), external-dns would lose track of the domains it is managing.

mars64 commented 2 years ago

I think I'm hitting a similar issue after upgrading my LKE cluster from 1.20 to 1.22. I'm using traefik 2.5.6 and external-dns 0.10.2.

When I enable debug logs I see:

time="2022-01-18T15:07:57Z" level=debug msg="Endpoints generated from service: default/traefik: [mydomain.com 0 IN A  x.x.x.x [] mydomain.com 0 IN CNAME  x-x-x-x.ip.linodeusercontent.com []]"
time="2022-01-18T15:07:57Z" level=info msg="Creating record." action=Create record= type=CNAME zoneID=xxxxxx zoneName=mydomain.com
time="2022-01-18T15:07:57Z" level=error msg="Failed to Create record: [400] [name] Invalid hostname " action=Create record= type=CNAME zoneID=xxxxxx zoneName=mydomain.com
time="2022-01-18T15:07:57Z" level=info msg="Creating record." action=Create record=external-dns type=TXT zoneID=xxxxxx zoneName=mydomain.com

Note the empty record= there. A TXT record is repeatedly created for as long as I let it run, but neither the A nor the CNAME record is created. It's interesting to me that external-dns enumerates both the IP and the hostname; I'm curious what others might see here.
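One possible reading of the empty record= value, sketched in Go: the logs in this thread are consistent with the provider logging the hostname with the zone name stripped off (record=juicy for zoneName=mydomain.com earlier, and an empty name for the zone apex here). The helper `recordName` below is hypothetical and only illustrates that assumption; it is not taken from the external-dns source.

```go
package main

import (
	"fmt"
	"strings"
)

// recordName is a hypothetical helper: it returns the part of hostname that
// sits in front of the zone name, or "" when hostname is the zone apex.
// A trailing dot on the hostname (as in "mydomain.com.") is ignored.
func recordName(hostname, zone string) string {
	hostname = strings.TrimSuffix(hostname, ".")
	if hostname == zone {
		return ""
	}
	return strings.TrimSuffix(hostname, "."+zone)
}

func main() {
	fmt.Printf("%q\n", recordName("juicy.mydomain.com", "mydomain.com")) // "juicy"
	fmt.Printf("%q\n", recordName("mydomain.com.", "mydomain.com"))      // ""
}
```

Under that assumption, a hostname annotation equal to the zone itself yields an empty record name, which would match the "Invalid hostname" error returned for the CNAME.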

I noticed the new domain on my new nodebalancer and eventually found https://www.linode.com/blog/linode/notice-changes-to-members-linode-com-and-nodebalancer-linode-com/ -- My cluster had been up for over a year so I hadn't noticed earlier. Seems like it could be related.

Prior to this I'd run into https://github.com/kubernetes-sigs/external-dns/issues/961 -- I've verified the RBAC rules appear to be correct, and --txt-prefix doesn't seem to fix it (as you can see above, I've set the value to external-dns.). Everything is running in the default namespace and I've tried explicitly limiting to --source=service (to rule out CRD availability and such).

ghost commented 2 years ago

Same behaviour for me as @peedrr described. Is there a fix planned?

Leedwing commented 2 years ago

For me too, same behaviour as @peedrr described.

displague commented 2 years ago

Is there an annotation to force an A record rather than a CNAME record? You can't use a CNAME at the base of a domain and external-dns has no way to know if a record will be the base or not. A user hint seems the only way out.

mars64 commented 2 years ago

I hope I'm not muddying the waters in case my issues are unrelated, but I'm not seeing that it's a unique issue just yet.

In my case, I'm running traefik ingress, with these service annotations:

  annotations:
    external-dns.alpha.kubernetes.io/hostname: mydomain.com.
    meta.helm.sh/release-name: traefik
    meta.helm.sh/release-namespace: default

In case it's relevant, the service is also configured with:

.spec.externalTrafficPolicy: Local

Before upgrading my clusters (1.20)/nodebalancer, external-dns seemed to work as expected with k8s.gcr.io/external-dns/external-dns:v0.7.4 (upgraded to v0.10.2, didn't work, reverted, still didn't work).

Runtime configs:

        - --log-level=debug
        - --log-format=text
        - --interval=1m
        - --source=ingress
        - --source=service
        - --policy=sync
        - --registry=txt
        - --txt-owner-id=<my-id>
        - --txt-prefix=external-dns.
        - --domain-filter=mydomain.com
        - --provider=linode

... passing LINODE_TOKEN via env var from a secret, which appears to work because repeat TXT ownership records are created.

> You can't use a CNAME at the base of a domain and external-dns has no way to know if a record will be the base or not.

At least in my case I never intended to create a CNAME record at the root domain (only that an A record be created at the given hostname), and as far as I can tell the Linode provider didn't do this before the nodebalancer changes/cluster upgrades.

> Is there an annotation to force an A record rather than a CNAME record?

Not that I could find: https://github.com/kubernetes-sigs/external-dns/blob/master/source/source.go#L40

wbh1 commented 2 years ago

The root of this issue, as mentioned before, is that you need to add a prefix or suffix to the TXT record so that it has a unique hostname. In the official Helm chart, this is accomplished via something like:

spec:
  values:
    registry: txt
    txtOwnerId: ""
    txtPrefix: ""
    txtSuffix: "_externaldns"

After hitting the issue described, the above fixed my issues on LKE 1.22 (for services, at least -- haven't tested with ingress like @mars64 was mentioning).
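To illustrate why a prefix or suffix avoids the "CNAMES must be unique" conflict, here is a minimal Go sketch. It assumes the ownership TXT record's name is built by putting the prefix in front of the hostname and the suffix after its first label; the function `txtRecordName` is hypothetical and the exact rule inside external-dns may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// txtRecordName is a hypothetical helper that derives the name of the
// ownership TXT record from the managed hostname. The prefix goes in front
// of the whole name and the suffix is appended to the first label.
func txtRecordName(name, prefix, suffix string) string {
	parts := strings.SplitN(name, ".", 2)
	if len(parts) < 2 {
		return prefix + name + suffix
	}
	return prefix + parts[0] + suffix + "." + parts[1]
}

func main() {
	host := "juicy.mydomain.com"

	// Without an affix the TXT record shares its name with the CNAME,
	// which DNS does not allow.
	fmt.Println(txtRecordName(host, "", "") == host) // true

	// With a suffix or prefix the two records get distinct names.
	fmt.Println(txtRecordName(host, "", "_externaldns"))  // juicy_externaldns.mydomain.com
	fmt.Println(txtRecordName(host, "external-dns.", "")) // external-dns.juicy.mydomain.com
}
```

A CNAME cannot share its name with any other record type, so the TXT record only fits once its name differs from the CNAME's.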

The same principles from the AWS FAQ entry you mentioned apply. More discussion is also in this StackOverflow.

I do agree that the auto-detection logic should probably be changed, though.

wbh1 commented 2 years ago

After looking into this more, the behavior has definitely changed since the rollout of the new subdomain for nodebalancers.

Using --source=service as an example: when external-dns runs, it creates Endpoints for both the hostname (CNAME) and the ip (A) returned in the status.loadBalancer.ingress section of your kube service. Then, it uses a simple string comparison to determine which is "less". This is where the behavior has changed in which type of record you get.

Now, a target like "23-92-23-89.ip.linodeusercontent.com" is considered less than a target like "23.92.23.89". Therefore, a CNAME gets created.

Previously, the hostname of the svc would've been something like "nb-23-92-23-89.dallas.nodebalancer.linode.com" which is considered to be more than "23.92.23.89". Therefore, an A record gets created.

Go playground to demonstrate it.

Really weird behavior. Still looking more into it, but hopefully this clarifies why it's happening.

EDIT: it's because string comparisons like greater/less than compare alphabetically (lexicographically). Since - is unicode 45 and . is unicode 46, 23- is "alphabetically" before 23.
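The comparison described above can be reproduced with a few lines of Go. This is only a sketch of the string ordering, using the example targets from this comment; the helper name `lessTarget` is made up for illustration and is not the function used in external-dns.

```go
package main

import "fmt"

// lessTarget reports whether target a sorts before target b under Go's
// built-in byte-wise (lexicographic) string comparison.
func lessTarget(a, b string) bool {
	return a < b
}

func main() {
	ip := "23.92.23.89"
	newHost := "23-92-23-89.ip.linodeusercontent.com"
	oldHost := "nb-23-92-23-89.dallas.nodebalancer.linode.com"

	// '-' (45) sorts before '.' (46), so the new hostname is "less" than the IP.
	fmt.Println(lessTarget(newHost, ip)) // true

	// 'n' sorts after '2', so the old hostname is "more" than the IP.
	fmt.Println(lessTarget(oldHost, ip)) // false
}
```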

displague commented 2 years ago

> // FIXME We really need to define under which circumstances a list Targets
> // is considered 'less' than another.

Indeed. This seems like behavior that should be configurable (to choose the service IP or choose the hostname (CNAME)).
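As a rough idea of what such a configurable choice could look like, here is a Go sketch. Everything in it is hypothetical: `pickRecord` and its `preferIP` switch do not exist in external-dns, and the sketch only distinguishes IPv4 addresses from everything else.

```go
package main

import (
	"fmt"
	"net"
)

// isIPv4 reports whether s parses as an IPv4 address.
func isIPv4(s string) bool {
	ip := net.ParseIP(s)
	return ip != nil && ip.To4() != nil
}

// pickRecord is a hypothetical selector. Given the targets reported for a
// load balancer, it returns the record type and target to publish. When
// preferIP is true an IPv4 target wins over a hostname; otherwise the
// hostname wins. If only one kind of target exists, that one is used.
func pickRecord(targets []string, preferIP bool) (recordType, target string) {
	var ip, host string
	for _, t := range targets {
		if isIPv4(t) {
			if ip == "" {
				ip = t
			}
		} else if host == "" {
			host = t
		}
	}
	switch {
	case ip != "" && (preferIP || host == ""):
		return "A", ip
	case host != "":
		return "CNAME", host
	default:
		return "", ""
	}
}

func main() {
	targets := []string{"23.92.23.89", "23-92-23-89.ip.linodeusercontent.com"}
	fmt.Println(pickRecord(targets, true))  // A 23.92.23.89
	fmt.Println(pickRecord(targets, false)) // CNAME 23-92-23-89.ip.linodeusercontent.com
}
```

An explicit preference like this would make the outcome independent of how the two target strings happen to sort.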

ddurham2 commented 2 years ago

I think this might be related. I've recently found that adding the service.beta.kubernetes.io/do-loadbalancer-hostname annotation to external-dns's service causes external-dns to create CNAMEs rather than A records.

Perhaps that's because this setting replaces status.LoadBalancer.ingress.ip="1.2.3.4" with status.LoadBalancer.ingress.hostname="example.com", so there is no IP address left to point A records at.
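That guess can be sketched in Go. The types below are simplified stand-ins for the load balancer status entries and for external-dns endpoints, not the real Kubernetes or external-dns types. The sketch follows the behaviour described earlier in this thread, where an endpoint is generated for each IP and each hostname found in status.loadBalancer.ingress.

```go
package main

import "fmt"

// ingressEntry is a simplified stand-in for one entry of
// status.loadBalancer.ingress on a Service.
type ingressEntry struct {
	IP       string
	Hostname string
}

// endpoint is a simplified stand-in for a DNS record to be created.
type endpoint struct {
	Name   string
	Type   string
	Target string
}

// endpointsFor emits an A endpoint for every IP and a CNAME endpoint for
// every hostname found in the ingress entries.
func endpointsFor(name string, entries []ingressEntry) []endpoint {
	var out []endpoint
	for _, e := range entries {
		if e.IP != "" {
			out = append(out, endpoint{Name: name, Type: "A", Target: e.IP})
		}
		if e.Hostname != "" {
			out = append(out, endpoint{Name: name, Type: "CNAME", Target: e.Hostname})
		}
	}
	return out
}

func main() {
	// With only a hostname in the status, a CNAME is the only possible result.
	fmt.Println(endpointsFor("app.example.org", []ingressEntry{{Hostname: "example.com"}}))
	// With only an IP, an A record is produced.
	fmt.Println(endpointsFor("app.example.org", []ingressEntry{{IP: "1.2.3.4"}}))
}
```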

That's a digitalocean specific setting, but I suspect there are equivalents for other providers?

Is that behavior expected, and likely an unavoidable consequence?