kubernetes-retired / federation

[EOL] Cluster Federation
Apache License 2.0
209 stars 82 forks

Federated service IP out of sync between AWS LB and CloudDNS records #203

Closed irfanurrehman closed 6 years ago

irfanurrehman commented 6 years ago

Issue by juntagabor Wednesday Oct 26, 2016 at 16:09 GMT Originally opened as https://github.com/kubernetes/kubernetes/issues/35637


Is this a request for help? No

What keywords did you search in Kubernetes issues before filing this one? federation dns, federation aws, federation aws cname


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Kubernetes version (use kubectl version): v1.4.4

Environment:

What happened: The federation controller creates the appropriate A records for all federated services on Google CloudDNS, including resolving AWS load balancer DNS names to IPs. But as the IPs behind the AWS LBs change, the A records become outdated and fall out of sync, causing service disruption.

What you expected to happen: I expected the federation controller to either add a CNAME pointing to the AWS LB, or to keep the IPs for AWS in sync over time.

How to reproduce it (as minimally and precisely as possible):

Anything else do we need to know: AWS recommends:

irfanurrehman commented 6 years ago

Comment by nikhiljindal Friday Oct 28, 2016 at 00:57 GMT


cc @kubernetes/sig-cluster-federation @quinton-hoole

irfanurrehman commented 6 years ago

Comment by quinton-hoole Friday Oct 28, 2016 at 15:25 GMT


@juntagabor thanks for the comprehensive bug report. Yes, you're right, we should be publishing CNAMEs for AWS LBs, not A records.

The bug is here:

https://github.com/kubernetes/kubernetes/blob/master/federation/pkg/federation-controller/service/dns.go#L185

in case someone wants to have a go at it before I get to it. The code is pretty self-explanatory, and the fix should be pretty trivial.
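
For context, the distinction such a fix would have to make is between ingress endpoints that are plain IPs (GCE/GKE) and ones that are hostnames (AWS ELB). A minimal sketch in Go, with hypothetical names and not the actual dns.go code, of choosing the record type per endpoint:

// Minimal sketch (hypothetical names, not the actual dns.go code) of picking
// the DNS record type per service-ingress endpoint instead of always
// resolving load-balancer hostnames into A records.
package main

import (
	"fmt"
	"net"
)

// recordForEndpoint returns the record type and data to publish for one
// service ingress endpoint.
func recordForEndpoint(endpoint string) (rrType, rrData string) {
	if net.ParseIP(endpoint) != nil {
		// GCE/GKE style ingress: already an IP, publish an A record.
		return "A", endpoint
	}
	// AWS ELB style ingress: a hostname; a CNAME stays valid even when the
	// IPs behind the ELB rotate.
	return "CNAME", endpoint
}

func main() {
	for _, ep := range []string{
		"52.41.41.59",
		"a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com",
	} {
		t, d := recordForEndpoint(ep)
		fmt.Printf("%s -> %s %s\n", ep, t, d)
	}
}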

irfanurrehman commented 6 years ago

Comment by juntagabor Friday Oct 28, 2016 at 16:09 GMT


@quinton-hoole thanks for the response and pointing out where the issue is.

Looking more into this, it looks a little more complex. I think the decision to resolve the AWS LB to IPs before creating the CloudDNS entry was made because it is not possible to have both A and CNAME records, or multiple CNAME records, for the same DNS name (the federated service in this case), so a mixed configuration is not an option.

Instead, Kubernetes resolves the CNAME and inserts the A records, but unfortunately that causes the out-of-sync situation.
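
For illustration (this is not the federation code itself), the resolution step amounts to a one-time hostname lookup whose results are then frozen into A records; the hostname below is the ELB name from the reproduction later in this thread:

// Rough illustration of what "resolve the CNAME and insert the A records"
// means: look up the ELB hostname once and store the IPs returned, which can
// later go stale when AWS rotates them.
package main

import (
	"fmt"
	"net"
)

func main() {
	host := "a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com"
	ips, err := net.LookupHost(host)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	// This is only a snapshot of the A record data at this instant.
	fmt.Println("resolved IPs:", ips)
}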

irfanurrehman commented 6 years ago

Comment by saturnism Friday Nov 11, 2016 at 17:09 GMT


Should it at least resync the IPs? And use a CNAME for the cluster-specific entry?

irfanurrehman commented 6 years ago

Comment by nikhiljindal Monday May 22, 2017 at 19:17 GMT


Yes, ideally we want A records for GCP and CNAMEs for AWS, but since we can't have both A and CNAME records for the same DNS name, we need to choose one for the hybrid case.

cc @madhusudancs

irfanurrehman commented 6 years ago

Comment by madhusudancs Monday May 22, 2017 at 19:51 GMT


You cannot have CNAMEs for AWS-only clusters either. A CNAME can point to only a single target, but we need our records to load-balance between multiple targets, and that would be an illegal CNAME configuration. An alternative on AWS clusters is to use ALIAS records, but that only works for AWS-only clusters and only when the DNS provider is Route53. This leaves us with two options:

  1. If the dnsprovider is Route53, configure an ALIAS record for the global DNS name and point it to a combination of ELB names and CNAME records. I haven't verified whether it is legal to configure ALIAS records that way; hopefully it is. In the targets, point directly to AWS service shards using their ELB names and configure CNAMEs for the non-AWS service shards.

  2. Simply run IP resync logic in the service controller for service shards whose service ingress status has a hostname instead of an IP. The resync period could be set to the DNS TTL and probably configured via a command-line flag to the federated service controller/controller manager.

The latter is more generic, but involves running the control loop, often unnecessarily (a rough sketch of the resync idea follows).
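
A minimal sketch of that second option, under assumed names (shard, resyncOnce, and the fixed 60-minute period are hypothetical); a real controller would push any changed IPs to its dnsprovider rather than just printing them:

// Sketch of a periodic resync for hostname-based (ELB) service shards:
// re-resolve each hostname and report when the IP set has changed.
package main

import (
	"fmt"
	"net"
	"sort"
	"time"
)

// shard represents one federated-service ingress endpoint (hypothetical type).
type shard struct {
	hostname string   // set for AWS ELB style ingresses
	lastIPs  []string // IPs currently published in the A record
}

func sameIPSet(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	sort.Strings(a)
	sort.Strings(b)
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func resyncOnce(shards []*shard) {
	for _, s := range shards {
		if s.hostname == "" {
			continue // IP-based ingress (e.g. GCE): nothing to re-resolve
		}
		ips, err := net.LookupHost(s.hostname)
		if err != nil {
			fmt.Println("skipping", s.hostname, ":", err)
			continue
		}
		if !sameIPSet(ips, s.lastIPs) {
			fmt.Println("would update A records for", s.hostname, "to", ips)
			s.lastIPs = ips
		}
	}
}

func main() {
	shards := []*shard{
		{hostname: "a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com"},
	}
	resyncOnce(shards) // run once at startup
	// Resync period could be driven by a flag; 60 minutes is a placeholder.
	ticker := time.NewTicker(60 * time.Minute)
	defer ticker.Stop()
	for range ticker.C {
		resyncOnce(shards)
	}
}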

irfanurrehman commented 6 years ago

Comment by madhusudancs Monday May 22, 2017 at 19:54 GMT


About running control loops unnecessarily, I wonder if AWS CloudWatch is capable of delivering events/notifications when the IP addresses behind an ELB change. That would help us optimize the loop a bit. But on the other hand, it would be specific to AWS, I guess.

irfanurrehman commented 6 years ago

Comment by nikhiljindal Monday May 22, 2017 at 19:56 GMT


Yes, the second one is better since it works for hybrid clusters as well.

irfanurrehman commented 6 years ago

Comment by baldeynz Wednesday Aug 16, 2017 at 02:07 GMT


Hi, has there been any update on this issue? Any idea when it's likely to be fixed?

irfanurrehman commented 6 years ago

Comment by quinton-hoole Wednesday Aug 16, 2017 at 16:36 GMT


@baldeynz No updates that I'm aware of. Code submissions to fix the bug would be well received.

irfanurrehman commented 6 years ago

Comment by quinton-hoole Friday Sep 08, 2017 at 01:07 GMT


Needs to be resolved for GA.

irfanurrehman commented 6 years ago

Comment by nikox94 Thursday Oct 12, 2017 at 22:58 GMT


Hey, if nobody can start on the issue, I would like to take it. Seems pretty interesting...

irfanurrehman commented 6 years ago

Comment by quinton-hoole Thursday Oct 12, 2017 at 23:10 GMT


Sure, that would be great @nikox94 . Welcome aboard!

irfanurrehman commented 6 years ago

Comment by nikox94 Sunday Oct 15, 2017 at 07:40 GMT


/assign

irfanurrehman commented 6 years ago

Comment by k8s-ci-robot Sunday Oct 15, 2017 at 07:40 GMT


@nikox94: GitHub didn't allow me to assign the following users: nikox94.

Note that only kubernetes members can be assigned.

In response to [this](https://github.com/kubernetes/kubernetes/issues/35637#issuecomment-336692679):

> /assign

Instructions for interacting with me using PR comments are available [here](https://github.com/kubernetes/community/blob/master/contributors/devel/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

irfanurrehman commented 6 years ago

Comment by nikox94 Friday Oct 20, 2017 at 07:02 GMT


Hey! :)

A small status update: it seems that the relevant code is here and here.

I have only glanced at the code and am also still trying to understand the use cases, as I am new to the project. Could anybody shed some light on what a federated service DNS is supposed to be doing? Why should CloudDNS be pointing to the same IPs that the AWS LBs point to?

If I understand correctly, in the original bug report and reproduction steps, the reporter created a GKE cluster and an AWS cluster, with a federated service across the two clusters. The LBs in GCP should thus be pointing to both the GKE instances and the AWS ones, and the AWS LB should also be pointing to both the AWS service instances (inside the cluster) and the GCP ones? The issue is then that the AWS LB gets "desynced" and no longer points to the correct IPs. I am speculating here, so is this a sensible reading?

Also, I will work towards setting up and reproducing the bug, so I have a chance of observing and fixing it.

Thanks!

P.S. If it's not too much of a bother, @quinton-hoole or @shashidharatd, could you please validate that the above is in the correct direction? Thank you!

irfanurrehman commented 6 years ago

Comment by nikox94 Friday Oct 20, 2017 at 07:04 GMT


Also, could you please assign the issue to me? Thank you!

irfanurrehman commented 6 years ago

Comment by shashidharatd Monday Oct 23, 2017 at 06:14 GMT


Hello Nick, here is the design for federated services. You will find all the scenarios documented there, along with the implementations. Coming to this issue, there is a technical limitation between the design and the reality of how AWS load balancers are implemented.

irfanurrehman commented 6 years ago

Comment by shashidharatd Monday Oct 23, 2017 at 14:09 GMT


missed the link https://github.com/kubernetes/community/blob/master/contributors/design-proposals/multicluster/federated-services.md :)

irfanurrehman commented 6 years ago

Comment by nikox94 Sunday Oct 29, 2017 at 20:48 GMT


OK, so I have my setup up and running. If you read more closely, the reporter has the cluster based in GCP and is using CloudDNS for the federation DNS.

I have now set up a cluster in GKE with federation running on it, and a DNS zone in CloudDNS that I am using for the federation DNS. I created a cluster in AWS and joined it, with a service exposed on an LB in AWS. Currently the scenario looks like this:

AWS Load-balanced service:

 $ dig a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com
;; ANSWER SECTION:
a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com. 60 IN A 52.41.41.59
a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com. 60 IN A 52.25.127.72

CloudDNS from federation level:

serene_blog_7@macro-nuance-159207:~$ gcloud dns record-sets list --zone="the-dream-shack"
NAME                                                                     TYPE   TTL    DATA
thedreamshack.com.                                                       NS     21600  ns-cloud-e1.googledomains.com.,ns-cloud-e2.googledomains.com.,ns-cloud-e3.googledomains.com.,ns-cloud-e4.googledomains.com.
thedreamshack.com.                                                       SOA    21600  ns-cloud-e1.googledomains.com. cloud-dns-hostmaster.google.com. 1 21600 3600 259200 300
my-nginx.me.thedreamshack.com.                                           CNAME  1800   a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com.
my-nginx.default.fellowship.svc.thedreamshack.com.                       A      180    52.25.127.72,52.41.41.59
my-nginx.default.fellowship.svc.us-west-2.thedreamshack.com.             A      180    52.25.127.72,52.41.41.59
my-nginx.default.fellowship.svc.us-west-2a.us-west-2.thedreamshack.com.  A      180    52.25.127.72,52.41.41.59

irfanurrehman commented 6 years ago

Comment by nikox94 Sunday Oct 29, 2017 at 20:50 GMT


Now I will wait some hours or days for the AWS LB to change the underlying IPs for the federated service.

Also, please bear in mind that the 3rd record from the top above was created by me and is meant to be the "solution". I.e., there is no reason in this case to create A records; one should simply create CNAME records and that's it.

I am not sure why we are creating A records at all... I will have to find the code that does that and check it.
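
For reference, a record like the manually created CNAME above can be added through the standard CloudDNS record-set transaction flow, roughly as follows (exact flags may differ between gcloud versions):

$ gcloud dns record-sets transaction start --zone="the-dream-shack"
$ gcloud dns record-sets transaction add --zone="the-dream-shack" --name="my-nginx.me.thedreamshack.com." --type=CNAME --ttl=1800 "a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com."
$ gcloud dns record-sets transaction execute --zone="the-dream-shack"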

irfanurrehman commented 6 years ago

Comment by nikox94 Monday Oct 30, 2017 at 16:45 GMT


The AWS ELB IPs have not yet changed. I will wait a bit more; if they don't change, I will look into a way of manually triggering a change.

irfanurrehman commented 6 years ago

Comment by nikox94 Monday Oct 30, 2017 at 21:59 GMT


OK so "we are in the game" as we say in Bulgaria. Issue reproduced!

Now AWS changed LB IPs:

$ dig a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com +trace

; <<>> DiG 9.10.3-P4-Ubuntu <<>> a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com +trace
;; global options: +cmd
.           280244  IN  NS  a.root-servers.net.
.           280244  IN  NS  h.root-servers.net.
.           280244  IN  NS  k.root-servers.net.
.           280244  IN  NS  j.root-servers.net.
.           280244  IN  NS  d.root-servers.net.
.           280244  IN  NS  g.root-servers.net.
.           280244  IN  NS  i.root-servers.net.
.           280244  IN  NS  l.root-servers.net.
.           280244  IN  NS  f.root-servers.net.
.           280244  IN  NS  c.root-servers.net.
.           280244  IN  NS  e.root-servers.net.
.           280244  IN  NS  b.root-servers.net.
.           280244  IN  NS  m.root-servers.net.
;; Received 1652 bytes from 127.0.1.1#53(127.0.1.1) in 0 ms

com.            172800  IN  NS  l.gtld-servers.net.
com.            172800  IN  NS  b.gtld-servers.net.
com.            172800  IN  NS  c.gtld-servers.net.
com.            172800  IN  NS  d.gtld-servers.net.
com.            172800  IN  NS  e.gtld-servers.net.
com.            172800  IN  NS  f.gtld-servers.net.
com.            172800  IN  NS  g.gtld-servers.net.
com.            172800  IN  NS  a.gtld-servers.net.
com.            172800  IN  NS  h.gtld-servers.net.
com.            172800  IN  NS  i.gtld-servers.net.
com.            172800  IN  NS  j.gtld-servers.net.
com.            172800  IN  NS  k.gtld-servers.net.
com.            172800  IN  NS  m.gtld-servers.net.
com.            86400   IN  DS  30909 8 2 E2D3C916F6DEEAC73294E8268FB5885044A833FC5459588F4A9184CF C41A5766
com.            86400   IN  RRSIG   DS 8 1 86400 20171112210000 20171030200000 46809 . uXJoWMQAgKQFAEvWCikbyKRvjkXEZ1PpKcouFSRMVyHW9GALDoR4s4hT RHTtrn/BfbL7jFzFSxsL9vLFs9ExkaRCB25r9Hzx271XT5ZJuOSQn5Rl ykQLJkIUfXXpVBt+dgzQzsNdtol9apQlfKb135a31qq4Z5KJh/a0zq7E 2Xgw6fhgOTkF5Hq8wxCiBL17+HvVYVQidYufLfpULu+SPUjpeSlPjh8X JhPZlzEeFP9CtYkH1ePUswMRWSG46sFE3bNo4lsQlhUXBktWcMmnuJeE 56RBKKVvEeGc06Mg2HGawlXNAf4R+YHjkyn4+zENpX/BMiYEU2zBaNXS CB6K6A==
;; Received 1230 bytes from 192.5.5.241#53(f.root-servers.net) in 29 ms

amazonaws.com.      172800  IN  NS  u1.amazonaws.com.
amazonaws.com.      172800  IN  NS  u2.amazonaws.com.
amazonaws.com.      172800  IN  NS  r1.amazonaws.com.
amazonaws.com.      172800  IN  NS  r2.amazonaws.com.
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN NSEC3 1 1 0 - CK0Q1GIN43N1ARRC9OSM6QPQR81H5M9A NS SOA RRSIG DNSKEY NSEC3PARAM
CK0POJMG874LJREF7EFN8430QVIT8BSM.com. 86400 IN RRSIG NSEC3 8 2 86400 20171105045018 20171029034018 11324 com. tUr5lk7lA/HTrs7fTVsG4IsUDY5vBVu8q8C+zac6NX/I6mIndhlw5uzR ap2fFLaEkhMV2D4Tf6Bonj4aqaFh5QxDMjVuVQwIGfuG0EDbNOSsH5Gq ud+ziOFfszPUpgqGFgMwvNHltTxtpJOVwSDMd81VXNMoIfUNbn0LTOOF imM=
F1RGCARTUOH1V344H263M3F4EHG6JI2L.com. 86400 IN NSEC3 1 1 0 - F1RNQU35LJSRBU32TLIABL5R3BQV3J82 NS DS RRSIG
F1RGCARTUOH1V344H263M3F4EHG6JI2L.com. 86400 IN RRSIG NSEC3 8 2 86400 20171104041824 20171028030824 11324 com. rFFjhgHbztB41+6iZ2BWGpdanWdDQfM+ATfJhWetsNsGOoCDLM8Dh1Fg 6R8j4IbVlhSQJskG/ZtlmUT349IQYVvN2hb4TiiZ2oQSr4ArgdFgiESx pZOuisbtffvVlzHzHaNhRNe/k672BRM+7atd6sMsuPnx8DsgjA1YKZNE DGY=
;; Received 716 bytes from 192.54.112.30#53(h.gtld-servers.net) in 52 ms

us-west-2.elb.amazonaws.com. 300 IN NS  ns-1475.awsdns-56.org.
us-west-2.elb.amazonaws.com. 300 IN NS  ns-1769.awsdns-29.co.uk.
us-west-2.elb.amazonaws.com. 300 IN NS  ns-332.awsdns-41.com.
us-west-2.elb.amazonaws.com. 300 IN NS  ns-560.awsdns-06.net.
;; Received 236 bytes from 205.251.192.27#53(r1.amazonaws.com) in 55 ms

a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com. 60 IN A 52.38.9.41
a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com. 60 IN A 52.38.14.25
us-west-2.elb.amazonaws.com. 1800 IN    NS  ns-1475.awsdns-56.org.
us-west-2.elb.amazonaws.com. 1800 IN    NS  ns-1769.awsdns-29.co.uk.
us-west-2.elb.amazonaws.com. 1800 IN    NS  ns-332.awsdns-41.com.
us-west-2.elb.amazonaws.com. 1800 IN    NS  ns-560.awsdns-06.net.
;; Received 268 bytes from 205.251.193.76#53(ns-332.awsdns-41.com) in 54 ms

This means the new LB IPs are now 52.38.9.41 and 52.38.14.25. However, looking over at GCP, we see in CloudDNS:

my-nginx.default.fellowship.svc.us-west-2.thedreamshack.com.                A      180    52.25.127.72,52.41.41.59
my-nginx.default.fellowship.svc.us-west-2a.us-west-2.thedreamshack.com.     A      180    52.25.127.72,52.41.41.59
my-nginx.me.thedreamshack.com.                                              CNAME  1800   a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com.
my-nginx.default.fellowship.svc.thedreamshack.com.                          A      180    52.25.127.72,52.41.41.59

The A records created by Federation are still there, but now point to the wrong IPs. We should instead have created CNAME records. The CNAME record I created for testing is still there and should redirect to the service correctly, but the A records should be broken.

Testing shows the CNAME is resolving correctly; the A records are also still resolving correctly for the time being, but I think that's just because the IPs changed recently and AWS does not want to break long-caching clients. In any case, the issue needs to be fixed. Starting to look for what code to change.
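
(For anyone reproducing this, the check above can be done by querying the zone's own nameserver directly, so local caches don't mask the stale data; for example, something like:)

$ dig my-nginx.default.fellowship.svc.thedreamshack.com @ns-cloud-e1.googledomains.com
$ dig my-nginx.me.thedreamshack.com @ns-cloud-e1.googledomains.com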

@shashidharatd and @quinton-hoole - how do we write a test case for this so that it's reproducible in the tests? An E2E test? Is anybody writing those? What's our usual way of testing such scenarios?

irfanurrehman commented 6 years ago

Comment by shashidharatd Monday Nov 13, 2017 at 14:01 GMT


Hello @nikox94, what you tested above uses one cluster. But if you add another cluster to the federation, then the record for my-nginx.default.fellowship.svc.thedreamshack.com. should contain all the instances of the federated service. This is not possible if you change the DNS record for my-nginx.default.fellowship.svc.thedreamshack.com. from A to CNAME.

According to the RFC, a CNAME record can point to just one target (an A record or another CNAME); a name cannot carry multiple CNAME records pointing to multiple targets.
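
To illustrate the constraint with the records from this reproduction (the 35.0.0.1 GCE address is hypothetical, standing in for a second cluster):

; What the hybrid federation-wide record needs: multiple A targets from both clusters.
my-nginx.default.fellowship.svc.thedreamshack.com.  180  IN  A      35.0.0.1
my-nginx.default.fellowship.svc.thedreamshack.com.  180  IN  A      52.38.9.41
my-nginx.default.fellowship.svc.thedreamshack.com.  180  IN  A      52.38.14.25

; Not allowed: a CNAME cannot coexist with other data at the same name,
; so the AWS shard cannot simply be swapped in as a CNAME here.
my-nginx.default.fellowship.svc.thedreamshack.com.  180  IN  CNAME  a9bec77a1bcdf11e7a6ca025c587eac3-248908204.us-west-2.elb.amazonaws.com.
my-nginx.default.fellowship.svc.thedreamshack.com.  180  IN  A      35.0.0.1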

irfanurrehman commented 6 years ago

Comment by nikox94 Wednesday Nov 15, 2017 at 07:46 GMT


So, after a short call with @shashidharatd, we reached the conclusion that implementing full-blown DNS load balancing is too ambitious for now, and that there are already plans for that. Currently we need to fix this bug, and the cleanest/simplest fix seems to be a goroutine that, every X minutes (this can be fairly large, on the order of 60 minutes or more), resolves all known LBs for all services managed by the cluster and, if changed IPs are found, updates the corresponding A records. I was told that @quinton-hoole designed the DNS, so I should run it by him first.

The related code can be found here: the ensureDNSRrsets method and others.

irfanurrehman commented 6 years ago

Comment by quinton-hoole Monday Nov 20, 2017 at 19:21 GMT


For the record, @nikox94 and I had a slack chat, and came to a slightly different conclusion. I will leave it to him to post that update when he has time. Thanks for the contributions @nikox94 - most appreciated.

irfanurrehman commented 6 years ago

Comment by nikox94 Tuesday Nov 21, 2017 at 16:07 GMT


Hey @quinton-hoole, thank you for the welcoming atmosphere! :)

I have been a bit busy fighting some other fires this week, but have started running the tests and comprehending the code.

For the record, @quinton-hoole suggested that I instead integrate the DNS resync into the main resync loop that already exists. I did some research on AWS and it seems they will not offer notifications when the LBs change IPs. I have also read through this file and its tests. I am currently running the tests, and will do a bit of refactoring on them and add some tests that exercise the current bug if possible. Then I'll add the DNS refresh code to the control loop and submit the PR.

If anybody feels there is something to add, please go ahead. :) It's my first K8s PR, so @quinton-hoole, please forgive any involuntary deficiencies.

irfanurrehman commented 6 years ago

Comment by nikox94 Sunday Nov 26, 2017 at 09:52 GMT


My WIP pull request can be found here: https://github.com/kubernetes/federation/pull/155. It's not yet ready for final review; there is a lot of work left still.

irfanurrehman commented 6 years ago

cc @juntagabor

irfanurrehman commented 6 years ago

@nikox94, is #155 enough to resolve this issue? cc @quinton-hoole

nikox94 commented 6 years ago

Nope, it was just a step towards resolving it. There is work left still.

shashidharatd commented 6 years ago

/kind bug

fejta-bot commented 6 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 6 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten /remove-lifecycle stale

fejta-bot commented 6 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close