sustainable-computing-io / kepler-operator

Kepler Operator
Apache License 2.0

make cluster-up failed due to DNS #313

Open jichenjc opened 8 months ago

jichenjc commented 8 months ago

This looks different from https://github.com/sustainable-computing-io/kepler-operator/issues/260. I believe it is due to DNS settings: the worker node is not able to resolve kind-control-plane. Can someone help provide some suggestions?

Command Output:
I1124 00:07:53.844056     175 join.go:412] [preflight] found NodeName empty; using OS hostname as NodeName
I1124 00:07:53.844127     175 joinconfiguration.go:76] loading configuration from "/kind/kubeadm.conf"
I1124 00:07:53.845320     175 controlplaneprepare.go:225] [download-certs] Skipping certs download
I1124 00:07:53.845351     175 join.go:529] [preflight] Discovering cluster-info
I1124 00:07:53.845679     175 token.go:80] [discovery] Created cluster-info discovery client, requesting info from "kind-control-plane:6443"
I1124 00:07:53.848887     175 round_trippers.go:553] GET https://kind-control-plane:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s  in 1 milliseconds
I1124 00:07:53.848946     175 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://kind-control-plane:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": dial tcp: lookup kind-control-plane on 9.20.136.11:53: no such host
jichenjc commented 8 months ago

@sthaha any insight on this or suggestions? Thanks

vprashar2929 commented 8 months ago

@jichenjc Are you using docker or podman? From the logs it looks similar to this issue: https://github.com/kubernetes-sigs/kind/issues/3412

jichenjc commented 8 months ago

I am using docker. Does it need to be podman?

vprashar2929 commented 8 months ago

No, that issue is specific to podman. With docker it should work 🤔

sthaha commented 8 months ago

@jichenjc, does make cluster-up work for you from the kepler repo?

jichenjc commented 8 months ago

Hmm, it seems not. I got exactly the same issue, so something definitely needs to be updated. I have always used kind create to bring up a single k8s node before, but with the operator I have to follow the make cluster-up way, so this is the first time I have seen this problem.

sthaha commented 8 months ago

kind create should work for the operator as well, but a lot of the things that cluster-up does will have to be done manually.

jichenjc commented 8 months ago

I am stuck at the creation of the 2nd kind node, because it is not able to join the first one. Due to the DNS setup, kind-control-plane cannot be resolved through the worker node's DNS. It might need some update.

I1201 03:22:51.209370     174 round_trippers.go:553] GET https://kind-control-plane:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s  in 2 milliseconds
I1201 03:22:51.209458     174 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://kind-control-plane:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": dial tcp: lookup kind-control-plane on 9.20.136.11:53: no such host
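
For anyone hitting the same error, here is a minimal sketch of how the failing line could be picked apart to see which name was looked up and which DNS server answered. The helper names parse_lookup_failure and can_resolve are hypothetical, and the regular expression is based only on the log lines quoted in this thread, so treat it as an assumption rather than a general parser for kubeadm output.

```python
import re
import socket

# Pattern for the Go resolver error quoted above, for example:
#   "lookup kind-control-plane on 9.20.136.11:53: no such host"
# Assumption: the DNS server is printed as an IPv4 address followed by a port.
LOOKUP_ERROR = re.compile(
    r"lookup (?P<host>\S+) on (?P<server>\d+(?:\.\d+){3}):(?P<port>\d+): no such host"
)


def parse_lookup_failure(line):
    """Return (host, dns_server) from a 'no such host' log line, or None."""
    match = LOOKUP_ERROR.search(line)
    if match is None:
        return None
    return match.group("host"), match.group("server")


def can_resolve(host):
    """Return True if the resolver of the machine running this script resolves host."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False
```

Calling parse_lookup_failure on the "Failed to request cluster-info" line gives the pair ("kind-control-plane", "9.20.136.11"). Running can_resolve("kind-control-plane") only says something useful if it is executed where the lookup actually fails, i.e. inside the worker node container, and that assumes a Python interpreter is available there. On a user-defined Docker network, container names are normally answered by Docker's embedded DNS at 127.0.0.11, so a lookup that goes to a different server may be worth checking first.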

Update on this: I was using the wrong folder. In the kepler folder, if you run make cluster-up, only one kind node is created, and it is fine to run the kepler functions there.