noobaa / noobaa-operator

Operator for NooBaa - object data service for hybrid and multi cloud environments :cloud: :wrench:
https://www.noobaa.io
Apache License 2.0

cluster.local instead of real cluster name #963

Open CalmVibes opened 2 years ago

CalmVibes commented 2 years ago

Environment info

Actual behavior

  1. The NooBaa operator uses 'cluster.local' as the cluster name instead of the real name.
    time="2022-07-20T08:56:41Z" level=info msg="RPC: Connecting websocket (0xc0013b38c0) &{RPC:0xc0003bb680 Address:wss://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:9 sema:0} ReconnectDelay:3s cancelPings:<nil>}"
    time="2022-07-20T08:56:41Z" level=error msg="RPC: closing connection (0xc0013b38c0) &{RPC:0xc0003bb680 Address:wss://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:9 sema:0} ReconnectDelay:3s cancelPings:<nil>}"
    time="2022-07-20T08:56:41Z" level=warning msg="RPC: RemoveConnection wss://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/ current=0xc0013b38c0 conn=0xc0013b38c0"
    time="2022-07-20T08:56:41Z" level=error msg="RPC: Reconnect - got error: failed to websocket dial: failed to send handshake request: Get \"https://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/\": dial tcp: lookup noobaa-mgmt.noobaa.svc.cluster.local on 169.254.25.10:53: no such host"

Expected behavior

  1. The NooBaa operator uses the real cluster name from KUBECONFIG.

Steps to reproduce

  1. export KUBECONFIG=/root/.kube/config
  2. noobaa install --db-volume-size-gb=5 --disable-load-balancer=true --kubeconfig='/root/.kube/config' --namespace='noobaa'
  3. grep name /root/.kube/config returns: name: minio.cluster
  4. The service addresses in the operator logs use 'cluster.local'.
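
One way to check which DNS suffix the cluster actually serves for services is to read the `search` line of a pod's /etc/resolv.conf. With the default pod DNS policy, Kubernetes writes `<namespace>.svc.<domain> svc.<domain> <domain>` there. The sketch below is only an illustration: `cluster_domain_from_resolv` is a hypothetical helper, not part of the noobaa CLI or kubectl, and it assumes the search line follows that default layout.

```shell
#!/bin/sh
# Hypothetical helper (not part of noobaa or kubectl): print the cluster DNS
# domain found in a resolv.conf-style file by locating the "svc.<domain>"
# entry on the "search" line and stripping the leading "svc.".
cluster_domain_from_resolv() {
    awk '$1 == "search" {
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^svc\./) {
                print substr($i, 5)
                exit
            }
        }
    }' "${1:-/etc/resolv.conf}"
}

# Usage (requires cluster access; <pod> is any running pod in the namespace):
#   kubectl exec -n noobaa <pod> -- cat /etc/resolv.conf > /tmp/resolv.conf
#   cluster_domain_from_resolv /tmp/resolv.conf
```

If the printed domain is not `cluster.local`, lookups for `*.svc.cluster.local` names would be expected to fail with "no such host", which would match the error in the logs above.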

More information - Screenshots / Logs / Other output

guymguym commented 2 years ago

@hohoqq Thank you for reporting your experience! These addresses are used for internal cluster networking, not external, so the cluster name has nothing to do with those. See https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/. I hope this clarifies things.

CalmVibes commented 2 years ago

@guymguym The logs show errors connecting to the endpoints, and because of this the entire NooBaa installation does not work.

time="2022-07-20T08:56:41Z" level=info msg="RPC: Connecting websocket (0xc0013b38c0) &{RPC:0xc0003bb680 Address:wss://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:9 sema:0} ReconnectDelay:3s cancelPings:<nil>}"
time="2022-07-20T08:56:41Z" level=error msg="RPC: closing connection (0xc0013b38c0) &{RPC:0xc0003bb680 Address:wss://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:9 sema:0} ReconnectDelay:3s cancelPings:<nil>}"
time="2022-07-20T08:56:41Z" level=warning msg="RPC: RemoveConnection wss://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/ current=0xc0013b38c0 conn=0xc0013b38c0"
time="2022-07-20T08:56:41Z" level=error msg="RPC: Reconnect - got error: failed to websocket dial: failed to send handshake request: Get \"https://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/\": dial tcp: lookup noobaa-mgmt.noobaa.svc.cluster.local on 169.254.25.10:53: no such host"

After I redeployed the K8s cluster with the default name "cluster.local", the issue was resolved.

guymguym commented 2 years ago

@hohoqq I don't think it should be related to the cluster name. The <name>.<namespace>.svc.cluster.local address is expected to be available to internal clients on any Kubernetes cluster. You can try looking at the service status:

kubectl describe service noobaa-mgmt -n noobaa

or the pods status:

kubectl get pod -n noobaa

or use the cli status:

noobaa status -n noobaa
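
In addition to the status commands above, the service name can be resolved from inside the cluster to see whether the DNS lookup itself is what fails. The following is a minimal sketch under stated assumptions: `svc_fqdn` and `check_svc_dns` are hypothetical helper names, the `busybox` image is assumed to be pullable, and `minio.cluster` is the name taken from the kubeconfig in the report, assumed here (not confirmed) to also be the cluster DNS domain.

```shell
#!/bin/sh
# Build <name>.<namespace>.svc.<domain>; the domain defaults to cluster.local.
svc_fqdn() {
    printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Resolve a service name from a throwaway pod inside the cluster.
# Assumptions: the busybox image is pullable and the pod name "dns-check"
# is not already in use in the namespace.
check_svc_dns() {
    kubectl run dns-check -n "$2" --rm -i --restart=Never --image=busybox \
        -- nslookup "$(svc_fqdn "$1" "$2" "$3")"
}

# Example (requires cluster access):
#   check_svc_dns noobaa-mgmt noobaa                 # default cluster.local
#   check_svc_dns noobaa-mgmt noobaa minio.cluster   # assumed custom domain
```

If the first lookup fails and the second succeeds, the cluster is serving service records under a non-default DNS domain rather than under `cluster.local`.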