kubernetes-sigs / external-dns

Configure external DNS servers (AWS Route53, Google CloudDNS and others) for Kubernetes Ingresses and Services
Apache License 2.0
7.71k stars 2.57k forks source link

OpenStack Designate provider: failure to update existing recordset #603

Closed dhague closed 4 years ago

dhague commented 6 years ago

When trying to update the IP address of an existing record, external-dns fails to do so and displays the log entries:

time="2018-06-15T15:04:19Z" level=info msg="Creating records: training.k8s-playground.c.eu-de-2.cloud.sap./A: 10.236.96.161"
time="2018-06-15T15:04:19Z" level=error msg="Expected HTTP response code [201 202] when accessing [POST https://dns-3.eu-de-2.cloud.sap/v2/zones/ed3d348c-bc97-46ad-a28e-03205587c1b9/recordsets], but got 409 instead\n{\"message\": \"Duplicate RecordSet\", \"code\": 409, \"type\": \"duplicate_recordset\", \"request_id\": \"req-b1bbe4d1-a2c5-4cff-ba39-bbfd9f213a8a\"}"

Both A and TXT records are correctly created in the first place, and the configured user can list the domain's recordsets using the openstack CLI.

njuettner commented 6 years ago

I think this is related to the caching issue which we rolled out in the latest version:

Could you please try to see if it works with this image: registry.opensource.zalan.do/teapot/external-dns-test:pr-612-2

njuettner commented 6 years ago

New version is out which fixes the issue. https://github.com/kubernetes-incubator/external-dns/releases/tag/v0.5.4

dhague commented 6 years ago

Unfortunately 0.5.4 does not fix the issue (apologies for not testing the PR, I've been busy). When updating the IP address of a Service monitored by external-dns I still see the same:

time="2018-06-28T15:37:57Z" level=info msg="Creating records: training.k8s-playground.c.eu-de-2.cloud.sap./A: 10.236.96.46"
time="2018-06-28T15:37:57Z" level=info msg="Creating records: training.k8s-playground.c.eu-de-2.cloud.sap./TXT: \"heritage=external-dns,external-dns/owner=default,external-dns/resource=service/kube-system/ingress-nginx\"" 
time="2018-06-28T15:37:57Z" level=error msg="Expected HTTP response code [201 202] when accessing [POST https://dns-3.eu-de-2.cloud.sap/v2/zones/ed3d348c-bc97-46ad-a28e-03205587c1b9/recordsets], but got 409 instead\n{\"message\": \"Duplicate RecordSet\", \"code\": 409, \"type\": \"duplicate_recordset\", \"request_id\": \"req-6d370d58-81b5-4869-99df-ee55efa98192\"}"

This is using image registry.opensource.zalan.do/teapot/external-dns:v0.5.4

dhague commented 6 years ago

FWIW, I am starting the container with these args:

--log-level=info
--policy=upsert-only
--provider=designate
--registry=txt
--source=service 

I am using Helm for deployment, using the stable/external-dns chart, updated with image.tag: 0.5.4 and adding nodes to the resources in its clusterrole.yaml template to fix an authz error I saw in the log.

njuettner commented 6 years ago

Could you try to enable debugging using--log-level=debug and comment the output again?

dhague commented 6 years ago

Here's the debug output after I updated the IP address on the service kube-system/ingress-nginx:

time="2018-06-28T16:20:18Z" level=debug msg="No endpoints could be generated from service default/kubernetes"
time="2018-06-28T16:20:18Z" level=debug msg="No endpoints could be generated from service kube-system/default-http-backend"
time="2018-06-28T16:20:18Z" level=debug msg="Endpoints generated from service: kube-system/ingress-nginx: [ingress-nginx.k8s-playground.c.eu-de-2.cloud.sap 0 IN A 10.236.98.191]"
time="2018-06-28T16:20:18Z" level=debug msg="No endpoints could be generated from service kube-system/kube-dns"
time="2018-06-28T16:20:18Z" level=debug msg="No endpoints could be generated from service kube-system/kube-system-external-dns-external-dns"
time="2018-06-28T16:20:18Z" level=debug msg="No endpoints could be generated from service kube-system/kubernetes-dashboard"
time="2018-06-28T16:20:18Z" level=debug msg="No endpoints could be generated from service kube-system/tiller-deploy"
time="2018-06-28T16:20:18Z" level=debug msg="No endpoints could be generated from service training-service/minio-nginx-default-backend"
time="2018-06-28T16:20:18Z" level=info msg="Creating records: ingress-nginx.k8s-playground.c.eu-de-2.cloud.sap./A: 10.236.98.191"
time="2018-06-28T16:20:18Z" level=error msg="Expected HTTP response code [201 202] when accessing [POST https://dns-3.eu-de-2.cloud.sap/v2/zones/ed3d348c-bc97-46ad-a28e-03205587c1b9/recordsets], but got 409 instead\n{\"message\": \"Duplicate RecordSet\", \"code\": 409, \"type\": \"duplicate_recordset\", \"request_id\": \"req-fbdf23ef-9fed-4e04-b420-fd4ada887c3a\"}" 
sebastien-prudhomme commented 6 years ago

Issue still there in master branch.

Trying to debug: the endpoint labels defined for designateRecordSetID and designateZoneID in Records() function doesn't more exist when trying to read them in addEndpoint() function.

Seems the labels are reseted outside the designate.go source code.

sebastien-prudhomme commented 6 years ago

@njuettner In the Records function of the txt registry, endpoint labels of the record are replaced by the labels of the TXT record:

        for _, ep := range endpoints {
                if labels, ok := labelMap[ep.DNSName]; ok {
                        ep.Labels = labels

A merge of the labels should correct the bug.

sebastien-prudhomme commented 6 years ago

Merging the endpoints labels works for updating records. But there is still an issue with the TXT records. External-dns tries to create them instead of updating them:

That is because ApplyChanges() of the TXT registry creates TXT endpoints from scratch, instead of reusing what is returned as current records by the provider.

So the labels for designateRecordSetID and designateZoneID labels for the TXT records don't exist anymore.

@njuettner I've seen that the endpoint struct have a ProviderSpecific field. What is it used for?

njuettner commented 6 years ago

@sebastien-prudhomme When the PR was opened, there was already a use case for CloudFlare configuration which is not used by every provider, see https://github.com/kubernetes-incubator/external-dns/pull/650

JrCs commented 5 years ago

The bug is always there. This is because the endpoints return by the Records() function are not preserved when the ApplyChanges() function is call. In particular the endpoints Labels map was completely rewritten. Perhaps an interface to the endPoints will be better with a specific structure for each providers.

fejta-bot commented 5 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

Toriniasty commented 5 years ago

Hi, any chance that this issue is going to be fixed soon?

Many thanks!

fejta-bot commented 5 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

multi-io commented 5 years ago

https://github.com/kubernetes-incubator/external-dns/pull/1136 fixes TXTRegistry to retain the endpoint labels between Records() and ApplyChanges() calls on the provider, which fixes this for us for the most part (there's still one 409 per change happening for the TXT record, but the A record is updated correctly and all subsequent reconciliations go though without errors).

I'm sure this is not the right/best patch for this. This "remember stuff in labels" approach of the designate provider seems unusual; no other provider does this.

matthiasbernhardt commented 5 years ago

Does the presence of a new pull request rectify to revive the issue? /remove-lifecycle rotten

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

stefanandres commented 4 years ago

With https://github.com/kubernetes-sigs/external-dns/blob/master/CHANGELOG.md#v0517---2019-09-17 got #1136 merged which makes it at least usable. So I guess this issue here can be closed.

fejta-bot commented 4 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

fejta-bot commented 4 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

k8s-ci-robot commented 4 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/external-dns/issues/603#issuecomment-577749073): >Rotten issues close after 30d of inactivity. >Reopen the issue with `/reopen`. >Mark the issue as fresh with `/remove-lifecycle rotten`. > >Send feedback to sig-testing, kubernetes/test-infra and/or [fejta](https://github.com/fejta). >/close Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
frittentheke commented 2 years ago

/reopen

k8s-ci-robot commented 2 years ago

@frittentheke: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to [this](https://github.com/kubernetes-sigs/external-dns/issues/603#issuecomment-1258645636): >/reopen Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
frittentheke commented 1 year ago

@njuettner could you kindly reopen this issue as it's still broken and there is a PR fixing this https://github.com/kubernetes-sigs/external-dns/pull/3049 (requires https://github.com/kubernetes-sigs/external-dns/pull/3024 to be done first I believe)