Pod crash on aws route53

ahoehma commented 3 months ago

What happened:

Someone on the team deployed an Ingress with an AWS ELB and a hostname, and this seems to have crashed the whole external-dns pod.

time="2024-03-22T05:13:46Z" level=error msg="Failure in zone spice.eb-dev.siemens.cloud. [Id: /hostedzone/XXXXXXXXXXXXXXXXX] when submitting change batch: InvalidChangeBatch: [Tried to create reso
time="2024-03-22T05:13:46Z" level=error msg="Failed submitting change (error: InvalidChangeBatch: [Tried to create resource record set [name='cin-info-service.spice.eb-dev.xxxxx.', type='TXT']
time="2024-03-22T05:13:47Z" level=error msg="Failed submitting change (error: InvalidChangeBatch: [Tried to create resource record set [name='clm-eks.spice.eb-dev.xxxxx.', type='TXT'] but it a
time="2024-03-22T05:13:48Z" level=fatal msg="failed to submit all changes for the following zones: [/hostedzone/XXXXXXXXXXXXXXXXX]"
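Since the terminal cut these lines off, it can help to pull just the affected record names out of the raw log. The following is a minimal sketch, not part of external-dns: it assumes the `key=value` log format shown above, and the names `parse_line` and `failed_records` are made up for illustration. It tolerates lines whose closing quote was truncated.

```python
import re

# key=value pairs; a value is either double-quoted (closing quote optional,
# because the lines above are truncated) or a bare token without spaces.
FIELD_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"?|\S+)')
# Record description as it appears inside the InvalidChangeBatch message.
RECORD_RE = re.compile(r"name='([^']+)', type='([A-Z]+)'")


def parse_line(line):
    """Split one log line into a dict of its key=value fields."""
    fields = {}
    for key, value in FIELD_RE.findall(line):
        if value.startswith('"'):
            value = value[1:]
            if value.endswith('"'):
                value = value[:-1]
        fields[key] = value
    return fields


def failed_records(log_text):
    """Return sorted, de-duplicated (name, type) pairs from error/fatal lines."""
    found = set()
    for line in log_text.splitlines():
        fields = parse_line(line)
        if fields.get("level") not in ("error", "fatal"):
            continue
        for name, rtype in RECORD_RE.findall(fields.get("msg", "")):
            found.add((name, rtype))
    return sorted(found)
```

Passing the text of the pod log to `failed_records` gives the list of record sets that Route 53 rejected, which is easier to compare against the hosted zone than the truncated lines.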

What you expected to happen:

I would like to see more details in the pod log.

It would be good if external-dns did not crash because of such an error. Or is it normal for such errors to happen?

How can I prevent this on a production system?

How to reproduce it (as minimally and precisely as possible):

Not sure. Could it be that the entries already exist in Route 53, maybe from a previous deployment?
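One way to test that guess would be to ask Route 53 whether a record set with the same name and type is already present. A rough sketch, assuming boto3 and its `list_resource_record_sets` call; the helper name `record_exists` is hypothetical, and the zone ID and record name are taken from the redacted log, so they are placeholders rather than real values.

```python
def record_exists(client, zone_id, name, rtype):
    """Return True if the zone already holds a record set with this name and type.

    `client` is expected to behave like boto3.client("route53").
    """
    fqdn = name if name.endswith(".") else name + "."
    # Listing starts at the given name/type, so if the record set exists
    # it should be the first item returned.
    resp = client.list_resource_record_sets(
        HostedZoneId=zone_id,
        StartRecordName=fqdn,
        StartRecordType=rtype,
        MaxItems="1",
    )
    for rrset in resp.get("ResourceRecordSets", []):
        if rrset["Name"].lower() == fqdn.lower() and rrset["Type"] == rtype:
            return True
    return False


# Assumed usage (needs boto3 and AWS credentials; values are placeholders):
#   import boto3
#   client = boto3.client("route53")
#   record_exists(client, "/hostedzone/XXXXXXXXXXXXXXXXX",
#                 "clm-eks.spice.eb-dev.xxxxx.", "TXT")
```

If this returns True for the TXT names in the log before external-dns has run, the records are probably left over from something else, which would match the "Tried to create resource record set" error.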

Anything else we need to know?:

I installed external-dns via Terraform, using aws-ia/eks-blueprints-addons/aws 1.16.1.

Environment:

External-DNS version (use external-dns --version): 0.14.0
External-DNS Helm Chart version: 1.14.3
DNS provider: aws route53
Others: aws eks

CAR6807 commented 3 months ago

k8s-triage-robot commented 1 week ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

kubernetes-sigs / external-dns

Pod crash on aws route53 #4330