Closed by oz123, 3 years ago
By the looks of it, it seems that the etcd cluster isn't forming because masters cannot talk to each other over gossip. Would you be able to check that masters can reach each other?
I think you are correct. Previously, when I installed k8s on OpenStack, I had to allow communication with IP protocol 4 (IP-in-IP). I don't know how to do that with kops. Can I add my own custom security rules?
It could be a firewall rule is missing from Openstack. You can have a look at what kops does here: https://github.com/kubernetes/kops/blob/master/pkg/model/openstackmodel/firewall.go#L402.
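If kops does not create the rule for your setup, adding one manually can unblock IP-in-IP traffic between the masters. A sketch using the OpenStack CLI; the security group name `masters.my.example.com` is an assumption here, substitute whatever group your masters actually use:

```shell
# Allow IP protocol 4 (IP-in-IP, used by Calico) between members of the
# masters security group; repeat for the nodes group if needed.
openstack security group rule create \
  --protocol 4 \
  --remote-group masters.my.example.com \
  masters.my.example.com
```

`--protocol` accepts an integer IP protocol number, so `4` selects IP-in-IP; `--remote-group` restricts the rule to traffic from members of the named group rather than opening it to any source.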
I am not 100% sure what Calico needs. But I had to recently add vxlan for cilium for the openstack setup I am using.
Calico needs IP protocol 4, and kops should add that rule automatically if you use Calico. However, etcd does not use Calico; it uses host networking, so that should not matter here. You could go through the different masters and check the etcd logs. Please note that there are two etcd clusters: main and events. Each master therefore has two etcd pods; check the logs of both — is there anything interesting?
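Since the API server is down in this scenario, the logs may be easiest to reach on disk. A sketch for inspecting both etcd clusters directly on a master, assuming a default kops install where etcd-manager writes its logs under `/var/log` (paths may differ on your image):

```shell
# On each master, check the logs of both etcd clusters
# (kops runs a "main" and an "events" cluster via etcd-manager).
sudo tail -n 100 /var/log/etcd.log
sudo tail -n 100 /var/log/etcd-events.log

# If the API server is reachable, the same pods can be listed directly:
kubectl -n kube-system get pods | grep etcd-manager
```

In a healthy cluster, each log should show all master members joining; if each member only ever lists itself, the masters cannot reach each other on the etcd peer ports.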
I ran into a similar issue, and in my case the etcd logs (of both etcd clusters) listed only the local member, so for whatever reason the masters couldn't see each other.
I haven't invested much time in debugging further, as my installation was also (I believe) unsupported, lacking any sort of LBaaS or DNS resolution in OpenStack.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with `/remove-lifecycle stale`.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with `/close`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with `/remove-lifecycle rotten`.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with `/close`.
Send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
I don't think this is stale. How is the maintainership of the OpenStack code?
Best effort :)
OpenStack comes in so many different forms that it is not as easy to maintain as public cloud providers. I am pretty sure the original issue was solved in 1.19 though.
Can you create a new issue with the errors you are seeing?
/close
@olemarkus: Closing this issue.
**1. What `kops` version are you running?** The command `kops version` will display this information.

**2. What Kubernetes version are you running?** `kubectl version` will print the version if a cluster is running, or provide the Kubernetes version specified as a `kops` flag.

**3. What cloud provider are you using?** Openstack
**4. What commands did you run? What is the simplest way to reproduce this issue?**
I edited the subnetworks for utilities:
Then I did:
**5. What happened after the commands executed?** The infrastructure was provisioned, but the cluster failed to start. All components start, except for the etcd cluster and the API server:
The logs of etcd:
**6. What did you expect to happen?** The cluster starts.
**7. Please provide your cluster manifest.** Execute `kops get --name my.example.com -o yaml` to display your cluster manifest. You may want to remove your cluster name and other sensitive information.

**9. Anything else do we need to know?**
I am using the ubuntu focal cloud image.