Closed: petergarbers closed this issue 7 years ago
@justinsb @chrislovecnm I'm seeing a ton of reports with this issue in the past week on Slack. Anything changed recently that would cause this?
I feel like I should mention that mine resolved itself after ~20 minutes but I'm leaving this open as other people are still seeing the issue
I experienced this when I didn't have my NS records publishing the subdomain I was using. When the NS records were added the api records were created.
@shrabok-surge can you be more specific? Maybe even use the example from http://kubernetes.io/docs/getting-started-guides/kops/
When you say subdomain, do you mean useast1 or dev in useast1.dev.example.com?
Checking out the terraform output, there are no Route 53 resources. How are the cluster subdomains supposed to be created?
@petergarbers From a machine within the same VPC, are you able to resolve your domain: dig ns clustername.domain.com?
@CliMz I have the same problem (no api domain names, but I do have etcd names). I successfully resolved dig ns clustername.domain.com from a machine within the same VPC.
I waited over an hour but still no api domain :-/
I ssh'd into the admin node, and in /var/log/kube-apiserver.log I see:
controller.go:88] Unable to perform initial IP allocation check: unable to refresh the service IP block: client: etcd cluster is unavailable or misconfigured
Is this relevant?
@cyberroadie please add your install command, kops version, and aws region.
Solved: Found out what the problem was: misconfiguration of the DNS subdomain. Logging into the master node and looking in the /var/log/etcd.log file, I could see the region.dev.xx.xx domain didn't get resolved. This prevented the etcd server from starting, and subsequently prevented the api server from starting because it couldn't connect to the etcd cluster.
@cyberroadie is this a bug, or is this a subdomain you configured for the cluster? For example, I have done exactly as described in http://kubernetes.io/docs/getting-started-guides/kops/ but have the same symptoms you have had.
@tomdavidson It's not a bug. I made a mistake setting up the subdomain in route53
@cyberroadie We are not the only ones so maybe we are all making the same mistake. Is there an issue with the steps in http://kubernetes.io/docs/getting-started-guides/kops/ ?
I can describe the steps I took: I was creating a test run for setting up a Kubernetes cluster. As described, if you're not in control of the main domain, e.g. testing.net, you can create a hosted zone for the subdomain, e.g. dev.testing.net. This will be the case in a future project. But for now, as a test, I added two hosted zones in Route53 with a domain I control myself: one for testing.net and another for dev.testing.net. This didn't work. Resolving it with dig ns dev.testing.net returned the DNS server of testing.net and couldn't find dev.testing.net. So for the test I dropped the dev.testing.net hosted zone and let everything be added to the testing.net hosted zone. I gave priority to testing the cluster first; I still have to figure out how to do the subdomain hosted zone. Re-reading the documentation now, I have to say I'm slightly puzzled about how to set up the NS records correctly in this scenario.
PS: the etcd domain names were added to the dev.testing.net hosted zone
Yes, I'm confused about the NS too. In my case I delegated a zone to Route53: c.b.a.edu. Then with kops create I used a name such as tom.c.b.a.edu. etcd records were created for tom.c.b.a.edu but nothing else.
Can I get a status on where this issue is at? Not following the comments ;)
@chrislovecnm I have not been able to confirm the NS is configured as needed by kops. I have done exactly as described in http://kubernetes.io/docs/getting-started-guides/kops/ but the direction is not clear.
This is potentially all user error / unclear docs, but until we can clarify the needed config we can not verify it is solely user error.
Take a look at this http://blog.couchbase.com/2016/november/multimaster-kubernetes-cluster-amazon-kops
We have an issue to drop something like this into our docs
I'm going to have a play around this weekend to see if I can create more clarity. For the project I'm currently working on, it would be ideal if every developer team has control over its own subdomain and the main domain is controlled separately. (Both in Route53)
Test scenario:
So far we know:
Acceptance criteria
Outcome:
PS You can find me on slack: #kubernetes-users
Deleted my Route53 zones and created new ones. This time the api record was created. FYI, the default limit on a new AWS account kept my autoscale group from populating, in case there is a common problems section in the new docs.
@tomdavidson there is a troubleshooting guide, feel free to update
Update:
So I did my test scenario and here are the results:
I will use example.com as an example domain:
1) Created two separate hosted zones in AWS Route53: one for example.com and one for dev.example.com
2) Setting up 'route delegation' (this is the proper name for it): copy the 4 nameservers from the NS record of dev.example.com and create a new NS record in example.com with these subdomain nameservers as its value. Give dev.example.com as the name of the new NS record. After this is done, the parent domain (example.com) will delegate all requests for *.dev.example.com to the correct hosted zone in Route53. Also, if you create a new domain (e.g. test.dev.example.com) via the AWS Route53 command line tool, it will be added to the subdomain hosted zone.
3) After this you can set up Kubernetes (with kops) and the new domain names (for etcd, etcd-event, etc.) will be added to the dev.example.com hosted zone.
The advantage of this is that you can hand over the control of a subdomain to another team without losing control over your parent domain.
Regarding the documentation, I think it would be good to add 'delegating DNS requests to a subdomain', and to explain that in order to do that you have to create a separate NS record in the parent domain's hosted zone with the nameservers of the subdomain's hosted zone.
One observation: with a setup like this, all new subdomains were added almost instantly; I never had to wait more than a minute to see them appear in the subdomain hosted zone.
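For anyone who wants to script step 2 instead of clicking through the console, here is a sketch using the aws CLI. The zone names, the four name servers, and PARENT_ZONE_ID are placeholders; copy the real values from your own subdomain zone's NS record and your parent zone:

```shell
# Write the delegation record as a Route53 change batch.
# The name servers below are placeholders: use the ones AWS assigned
# to YOUR subdomain hosted zone (dev.example.com), not the parent's.
cat > delegate-dev.json <<'EOF'
{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "dev.example.com.",
      "Type": "NS",
      "TTL": 300,
      "ResourceRecords": [
        {"Value": "ns-1.awsdns-01.org"},
        {"Value": "ns-2.awsdns-02.co.uk"},
        {"Value": "ns-3.awsdns-03.com"},
        {"Value": "ns-4.awsdns-04.net"}
      ]
    }
  }]
}
EOF

# Apply it to the PARENT zone (example.com), not the subdomain zone:
# aws route53 change-resource-record-sets \
#   --hosted-zone-id PARENT_ZONE_ID \
#   --change-batch file://delegate-dev.json
```

Once it takes effect, dig ns dev.example.com should return the subdomain zone's name servers instead of the parent's.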
Set out to make a PR to fix this. 27 bot-mails, 20 mins of signups, email confirmations, linking accounts, and a contract that needed my address. I've declined to work that hard to give you free work as a PR.
You may find this interesting on configuring Route53 subdomains
So, I had this problem, and I can verify my root cause and the fix:
What happened: I followed this and everything worked up until:
$ kubectl get nodes
Unable to connect to the server: dial tcp: lookup api.cluster.stage.example.io on 127.0.1.1:53: no such host
Why? The api record wasn't being created as described above.
Root cause: My DNS wasn't configured correctly. I had a parent domain example.io and a subdomain stage.example.io.
Fix: Add an NS record to the parent domain for the subdomain, with the subdomain's NS servers, as described in the article above.
Thanks for the awesome tool :-)
For people coming back to this issue: We did everything that @MichaelJCole did in advance of creating our clusters (ie: created NS records with the sub-domain NSs in the root domain hosted zone), and it still took about 20 mins for everything to come up.
It took a good while for the api* routes to be created, and even then it took a while for the DNS records to propagate. kubectl get nodes was returning no such host all the time, then was successful 1 in every 4 times (note: there were 4 name servers), then worked more regularly, then eventually worked every time.
So, be aware:
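Since propagation can be slow and flaky like this, a small poll loop beats re-running kubectl by hand. A sketch (the hostname, attempt count, and delay are placeholders; it shells out to `host`, so any resolver check would do):

```shell
# Poll until a hostname resolves, tolerating the intermittent failures
# seen while the Route53 records propagate across the name servers.
wait_for_dns() {
  name=$1 attempts=${2:-30} delay=${3:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if host "$name" >/dev/null 2>&1; then
      echo "resolved: $name"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up waiting for $name" >&2
  return 1
}

# Example usage (hypothetical cluster name):
# wait_for_dns api.cluster.stage.example.io 30 10 && kubectl get nodes
```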
Glad to see I am not the only one with this issue. It did take a while.
How does it create the Route53 record? I ran kops --target terraform followed by terraform apply, so everything was created via Terraform. Yet there is no Route53 resource anywhere in kubernetes.tf or the data/ dir.
Masters manage the DNS - TF wouldn't know how to manage IPs of instances that may get replaced.
@jaygorrell the k8s master creates the route53 entry for api.subdom.mydomain.com? I thought that was set up at creation time by Terraform? Is that why it sometimes takes a while, as opposed to being immediate (which would happen if Terraform had done it)?
I begin to understand. :-)
2 questions: what creates the api record, and what happens with a Service with type=LoadBalancer?

Terraform wouldn't know the IP before it's assigned to those instances, and api is a RR list of each IP, not a CNAME to an ELB or anything.
> Terraform wouldn't know the IP before it's assigned to those instances

Got that. It was the api. address that I was confused about.

> api is a RR list of each IP

Each IP? Isn't it a single one? Oh, you mean multiple masters? OK, got that. Makes sense.
There's a little bit on that here.

Thanks, that does help.
That's exactly what https://github.com/Vungle/kube-route53 does

Does it? I am looking specifically for the ELB when the service is type=LoadBalancer. It is that service that is most likely to be exposed to the outside world.
Sorry, I failed at Google. Meant to link this one: https://github.com/wearemolecule/route53-kubernetes
Oooh, now that is interesting. Thanks @jaygorrell!
@jaygorrell that is also a lot of what dns-controller does :) You already have that installed on a kops cluster.
Ah yes - didn't realize there was a kops release a few days ago... been waiting on that one!
So this should work now, yes? https://github.com/kubernetes/kops/tree/master/dns-controller
Damn! I accidentally clicked close on this tab and it lost my comment. I don't know why GitHub does nothing to make the browser recognize that there is entered text, but it is not wise!
OK, recreating:
As far as I can tell, it looks like it creates:

- a CNAME for a Service of type=LoadBalancer, pointing to the auto-generated hostname of the ELB
- A records, one for each node on which the given Service has a Pod running, which is useful for a Service of type=NodePort

Is that right? If so, I would love to try it.
@deitch looking to see if there is an issue open for better documentation.
Thanks @chrislovecnm
https://github.com/kubernetes/kops/issues/1230 <- lets talk there
Initially same issue here. What it boils down to is something I'm betting a LOT of us overlooked: when you create the hosted zone for your sub-domain, you get a DIFFERENT SET of NS records from AWS than your parent domain has. I initially copy-pasted the same NS records from my parent domain into my subdomain's NS record in the parent hosted zone. Then I deleted those and copy-pasted the DIFFERENT NS records from my sub-domain's hosted zone into the parent hosted zone's NS record for the sub-domain name. That fixed the missing records instantly. Just re-ran kops update cluster --yes and voila!
So can we close this?
Please do
Check if the kops version is 1.5+; if so, we don't need to define --dns-zone=experimental.com. I also got the same error, but after removing --dns-zone=experimental.com it works; just define the cluster name, that's all.
For example: kops create cluster --name=kops-k8s-expt --state=s3://kops-k8s-experimental --zones=us-east-1a --node-count=2 --node-size=t2.micro --master-size=t2.micro
--target=terraform also requires --dns-zone.
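For the record, a working shape of the command with the terraform target looks something like the sketch below. The domain, cluster name, and state bucket are placeholders; adjust them for your own account:

```shell
# Hypothetical example: with --target=terraform, --dns-zone must
# still be passed explicitly alongside the usual flags.
kops create cluster \
  --name=kops-k8s-expt.example.com \
  --state=s3://kops-k8s-experimental \
  --zones=us-east-1a \
  --node-count=2 \
  --node-size=t2.micro \
  --master-size=t2.micro \
  --dns-zone=example.com \
  --target=terraform
```

Then run terraform plan / terraform apply on the generated kubernetes.tf as usual.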
$ kops version
Version 1.7.0
$ terraform --version
Terraform v0.10.6
Not sure if this is still relevant to anyone, but since I ran into the same issue twice while following the getting started guide, I'll leave it here.
The issue with being unable to resolve the Kubernetes cluster API URL popped up in my case with just a parent domain and what looked like a properly configured api record; no sub-domains. I'm trying to validate the cluster from my local machine and I am using a Public Hosted Zone.
After several failed attempts, I ran a quick test in Route53 for the api record (go to hosted zone > api record > test record set) using my public IP as the resolver IP address. Running kops validate cluster immediately after this returned a valid cluster response. I'll note that it may simply have been a coincidence and enough time had passed for the issue to resolve itself, as @petergarbers mentioned above, but if not, and others run into this, give it a shot.
I'm trying to set up a cluster in a new AWS account using kops, following this guide.
I have noticed that 2 records aren't being created: api.clustername.domain.com and api.internal.clustername.domain.com. I only have the domain records for the etcd service. As a result I am unable to connect to my cluster using kubectl. From what I can tell the master and the other nodes are running; however, manually creating these domain records has been unfruitful, so I suspect there may be other issues.