Nginx server:
```console
2024/08/22 23:24:07 [emerg] 1#0: host not found in resolver "kube-dns.kube-system.svc.cluster.local" in /opt/nginx/con
nginx: [emerg] host not found in resolver "kube-dns.kube-system.svc.cluster.local" in /opt/nginx/conf/nginx.conf:13
```
Fluentd:
```console
2024-08-22 23:22:28 +0000 [info]: Received graceful stop
W, [2024-08-22T23:22:46.715401 #14] WARN -- #<Bunny::Session:0x1338 fluentd@XXXXXXXX, vhost=/, addresses=[XXXXXXX]>: Could not establish TCP connecti
2024-08-22 23:22:46 +0000 [error]: #0 unexpected error error_class=Bunny::TCPConnectionFailedForAllHosts error="Could not establish TCP connection to any of the configured hosts"
2024-08-22 23:22:46 +0000 [error]: #0 /usr/lib/ruby/gems/3.2.0/gems/bunny-2.14.4/lib/bunny/session.rb:338:in `rescue in start'
```
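To help narrow down whether this is a per-pod or cluster-wide resolution failure, a quick check from inside the virtual cluster might look like the following sketch (assuming `kubectl` is pointed at the vcluster; the `busybox` image and pod names are just examples):

```shell
# Confirm the DNS Service the failing pods reference actually exists
# inside the virtual cluster (vcluster serves DNS via its own coredns).
kubectl -n kube-system get svc kube-dns -o wide

# Run a throwaway pod and try resolving an internal name...
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup kubernetes.default.svc.cluster.local

# ...and an external one, to see whether only upstream forwarding fails.
kubectl run dns-test-ext --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup github.com
```

If the internal lookup fails intermittently while repeating the same command, that matches the flaky behavior described above rather than a misconfigured resolver address.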
Host cluster Kubernetes version
```console
Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.12-eks-2f46c53
WARNING: version difference between client (1.30) and server (1.28) exceeds the supported minor version skew of +/-1
```
What happened?
vcluster installs correctly and most pods start, but some fail to resolve DNS names, both internal and external. Even the coredns pod will sometimes fail to start because it cannot reach the Kubernetes API.
Sometimes pods start working after I repeatedly delete them to force restarts, particularly the coredns pods.
The host cluster has Calico installed, and we've run into similar-looking DNS issues before; see https://github.com/projectcalico/calico/issues/4955 for what I think was happening in that case.
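If this turns out to be the same VXLAN checksum-offload problem as in the linked Calico issue, one commonly cited mitigation is to tell Felix to treat checksum offload as broken. This is only a sketch to test the hypothesis, not a confirmed fix for this report, and it should be checked against the installed Calico version (run against the host cluster, not the vcluster):

```shell
# Ask Calico's Felix agent to disable TX checksum offload on its
# VXLAN interface, which has caused intermittent UDP/DNS drops before.
kubectl patch felixconfiguration default --type merge \
  -p '{"spec":{"featureDetectOverride":"ChecksumOffloadBroken=true"}}'

# Rough equivalent for a one-off check directly on a node shell:
ethtool -k vxlan.calico | grep tx-checksum
ethtool -K vxlan.calico tx-checksum-ip-generic off
```

If DNS stabilizes after this, that would point at the host CNI rather than vcluster itself.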
Example errors from pods (Nginx server and Fluentd) are shown in the logs above.
What did you expect to happen?
Pods should not have errors resolving or connecting to internal and external addresses.
How can we reproduce it (as minimally and precisely as possible)?
Unfortunately there isn't a public way of deploying this; I'll see what I can do about recreating it externally.
Anything else we need to know?
No response
vcluster version
VCluster Config