kubernetes-sigs / external-dns

Configure external DNS servers (AWS Route53, Google CloudDNS and others) for Kubernetes Ingresses and Services
Apache License 2.0

OCI: external DNS is not working with network load balancer #4828

Closed blauwaldt-it closed 2 weeks ago

blauwaldt-it commented 1 month ago

What happened: I used K8s, external-dns and loadbalancer on OKE (Oracle Cloud Infrastructure) - all worked fine.

I tried to replace the load balancer with a "network load balancer" as described in https://docs.oracle.com/en-us/iaas/Content/ContEng/Tasks/contengcreatingnetworkloadbalancers.htm.

After that, external-dns (set up as described in https://github.com/kubernetes-sigs/external-dns/blob/master/docs/tutorials/oracle.md) stopped working:

external-dns seems to put both the private and the public IP of the network load balancer into a single DNS A record. This leads to the following error:

kubectl logs external-dns-b767c5645-cmsjm
[...]
time="2024-10-20T15:20:01Z" level=info msg="{ Domain=somedomain.tld RecordHash= IsProtected= Rdata=10.0.20.152 79.76.102.70 RrsetVersion= Rtype=A Ttl=300 Operation=ADD }"
time="2024-10-20T15:20:01Z" level=info msg="{ Domain=somedomain.tld RecordHash= IsProtected= Rdata=10.0.20.152 79.76.102.70 RrsetVersion= Rtype=A Ttl=300 Operation=ADD }"
time="2024-10-20T15:20:01Z" level=info msg="{ Domain=xdns-somedomain.tld RecordHash= IsProtected= Rdata=\"heritage=external-dns,external-dns/owner=someone,external-dns/resource=ingress/someone/any-www-ingress\" RrsetVersion= Rtype=TXT Ttl=300 Operation=ADD }"
time="2024-10-20T15:20:01Z" level=info msg="{ Domain=xdns-a-somedomain.tld RecordHash= IsProtected= Rdata=\"heritage=external-dns,external-dns/owner=someone,external-dns/resource=ingress/someone/any-www-ingress\" RrsetVersion= Rtype=TXT Ttl=300 Operation=ADD }"
time="2024-10-20T15:20:01Z" level=error msg="Failed to do run once: soft error\nError returned by Dns Service. Http Status Code: 400. Error Code: InvalidParameter. Opc request id: (someid). Message: Record (somedomain.tld, A) contained invalid rdata (10.0.20.152 79.76.102.70)\nOperation Name: PatchZoneRecords\n [...]

What you expected to happen: external-dns works with a network load balancer as it does with a non-network load balancer.

Environment:

ddevadat commented 2 weeks ago

I am also observing the same issue, and I am wondering what the expected behavior should be. In the example above, the NLB created two IPs: private IP 10.0.20.152 and public IP 79.76.102.70.

What does the user expect to happen to the zone record?

  1. Should it contain only the private IP from the list?
  2. Should it contain only the public IP from the list?
  3. Should it contain both?

All are valid cases, but consider this: if the zone record somedomain.tld is meant to be accessed publicly, it should contain only the public IP. If both the private and the public IP end up in the record and DNS returns the private IP, the call will fail.

I am thinking there could be another startup flag, something like --oci-rdata-ip-type=public|private|both.

I don't think this problem occurs when the load balancer is a private NLB rather than a public one.

ddevadat commented 2 weeks ago

On further reading, the solution is to start the external-dns pod with --target-net-filter=10.0.0.0/8 (if you want only the private IP in the zone record)

or --exclude-target-net=10.0.0.0/8 (if you want only the public IP in the zone record).

I am assuming your VCN CIDR range is 10.0.0.0/16.
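
To illustrate how the two flags split the NLB's addresses, here is a minimal Go sketch of include/exclude CIDR filtering. It is not external-dns's actual implementation: filterTargets, parsePrefixes and containsAddr are hypothetical helpers, and the semantics (keep a target if it matches an include prefix, or if no include prefix is given, and matches no exclude prefix) are an assumption based on the flag descriptions in this thread.

```go
package main

import (
	"fmt"
	"net/netip"
)

// filterTargets is a hypothetical helper, not external-dns code.
// A target is kept when it lies inside at least one include prefix
// (or no include prefixes are given) and inside no exclude prefix.
func filterTargets(targets, include, exclude []string) ([]string, error) {
	inc, err := parsePrefixes(include)
	if err != nil {
		return nil, err
	}
	exc, err := parsePrefixes(exclude)
	if err != nil {
		return nil, err
	}
	var kept []string
	for _, t := range targets {
		addr, err := netip.ParseAddr(t)
		if err != nil {
			return nil, fmt.Errorf("invalid target %q: %w", t, err)
		}
		if (len(inc) == 0 || containsAddr(inc, addr)) && !containsAddr(exc, addr) {
			kept = append(kept, t)
		}
	}
	return kept, nil
}

// parsePrefixes converts CIDR strings such as "10.0.0.0/8" into prefixes.
func parsePrefixes(cidrs []string) ([]netip.Prefix, error) {
	prefixes := make([]netip.Prefix, 0, len(cidrs))
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return nil, fmt.Errorf("invalid CIDR %q: %w", c, err)
		}
		prefixes = append(prefixes, p)
	}
	return prefixes, nil
}

// containsAddr reports whether addr falls inside any of the prefixes.
func containsAddr(prefixes []netip.Prefix, addr netip.Addr) bool {
	for _, p := range prefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	// The two addresses reported for the NLB in this issue.
	targets := []string{"10.0.20.152", "79.76.102.70"}

	// Assumed equivalent of --target-net-filter=10.0.0.0/8
	private, _ := filterTargets(targets, []string{"10.0.0.0/8"}, nil)
	// Assumed equivalent of --exclude-target-net=10.0.0.0/8
	public, _ := filterTargets(targets, nil, []string{"10.0.0.0/8"})

	fmt.Println(private) // [10.0.20.152]
	fmt.Println(public)  // [79.76.102.70]
}
```

Either way, only one address is left for the A record, which avoids the combined rdata that the DNS service rejected.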

anders-swanson commented 2 weeks ago

@blauwaldt-it @ddevadat does the following resolve your issue? https://github.com/kubernetes-sigs/external-dns/blob/master/docs/faq.md#how-do-i-specify-that-i-want-the-dns-record-to-point-to-either-the-nodes-public-or-private-ip-when-it-has-both

ddevadat commented 2 weeks ago

It works for me as expected when using the filter. I will leave it to the author, @blauwaldt-it, to comment on whether the issue is resolved.

jrosinsk commented 2 weeks ago

With the available startup flags, it is possible to force one or the other:

--target-net-filter=10.0.0.0/8 if you want to include only the private IP in the zone record, or
--exclude-target-net=10.0.0.0/8 if you want to include only the public IP in the zone record.

The remaining problem then is the default condition, which results in the failure mentioned previously in this issue.

The documentation says:

"If this annotation is not set, and the node has both public and private IP addresses, then the public IP will be used by default."

To best align with this, the default should remove any private IP when both are present. That avoids the service failure and makes the default state behave as expected.
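
As a rough sketch of that default, assuming the targets arrive as plain IP strings: preferPublic below is a hypothetical helper, not code from external-dns or from any PR. It drops private addresses only when the list mixes private and public ones. "Private" here means whatever Go's netip.Addr.IsPrivate reports (RFC 1918 and RFC 4193 ranges), which may differ from the definition a real fix would use.

```go
package main

import (
	"fmt"
	"net/netip"
)

// preferPublic is a hypothetical helper sketching the proposed default:
// when targets mix public and private addresses, keep only the public ones;
// otherwise return the list unchanged.
func preferPublic(targets []string) []string {
	var public []string
	for _, t := range targets {
		addr, err := netip.ParseAddr(t)
		if err != nil {
			// Not an IP address (for example a hostname): leave the list untouched.
			return targets
		}
		if !addr.IsPrivate() {
			public = append(public, t)
		}
	}
	// All private or all public: nothing to remove.
	if len(public) == 0 || len(public) == len(targets) {
		return targets
	}
	return public
}

func main() {
	fmt.Println(preferPublic([]string{"10.0.20.152", "79.76.102.70"})) // [79.76.102.70]
	fmt.Println(preferPublic([]string{"10.0.20.152"}))                 // [10.0.20.152]
}
```

A private-only NLB keeps its private address under this rule, so only the mixed case from this issue changes.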

If there are no objections, this issue can be assigned to me and I'll put in a PR for the suggested fix.

blauwaldt-it commented 2 weeks ago

Yes, if I use --exclude-target-net=10.0.0.0/8, everything works as expected.

Thank you very much. Unfortunately, Oracle support was not helpful with this issue!