kubernetes-sigs / kubespray

Deploy a Production Ready Kubernetes Cluster

kube-proxy doesn't find master with private networks on Kubespray 2.15 #7180

Closed: daugustin closed this issue 2 years ago

daugustin commented 3 years ago

Environment:

Kubespray version (commit) (git rev-parse --short HEAD): a923f4e7

Network plugin used: weave

Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):

[...]
ansible_host: MY_PUBLIC_IP
ip: 172.20.0.2
access_ip: 172.20.0.2
etcd_access_address: 172.20.0.2

I'm using private networks for cluster communication. By setting `ip`, `access_ip` and `etcd_access_address` to the private network IP, almost all daemons bind to the internal IP.
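To make the semantics explicit, here is a commented sketch of those host variables as they might appear in a host_vars file (the comments are my own understanding of what each variable does, not copied from the Kubespray docs):

```yaml
# host_vars for master-1 (illustrative sketch)
ansible_host: MY_PUBLIC_IP        # public address Ansible uses over SSH
ip: 172.20.0.2                    # private address the Kubernetes components should bind to
access_ip: 172.20.0.2             # private address other nodes use to reach this node
etcd_access_address: 172.20.0.2   # private address used for etcd traffic
```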

This worked absolutely fine in Kubespray 2.14/K8s 1.18.

I upgraded to 2.15/1.19 and this now fails.

The kube-proxy log on the master node says:

E0118 17:36:59.998461 1 node.go:125] Failed to retrieve node info: Get "https://127.0.0.1:6443/api/v1/nodes/master-1": dial tcp 127.0.0.1:6443: connect: connection refused

This is expected, as the master only listens on the private NIC:

root@master-1:~# ss -ltnp|grep 6443
LISTEN 0 4096 172.20.0.2:6443 0.0.0.0:* users:(("kube-apiserver",pid=774,fd=7))

This might be a result of kubernetes/kubernetes#83822

When I manually edit the kube-proxy DaemonSet and add

`--master=https://172.20.0.2:6443`

to the kube-proxy parameters, everything starts up again.
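For reference, the manual workaround looks roughly like this (a sketch of the DaemonSet after `kubectl -n kube-system edit ds kube-proxy`; only the `--master` line is the actual change, the surrounding arguments are the usual kubeadm defaults and may differ in other clusters):

```yaml
# kube-proxy DaemonSet excerpt after the manual edit (illustrative)
spec:
  template:
    spec:
      containers:
      - name: kube-proxy
        command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/config.conf
        - --hostname-override=$(NODE_NAME)
        - --master=https://172.20.0.2:6443   # added: point kube-proxy at the apiserver's private address
```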

So, is this a configuration issue on my side? How can I add the master setting to the kube-proxy DaemonSet via Kubespray? I did not find anything about that.

I'm assuming that, because of this new fallback behaviour, we need a setting / template variable to make this configurable?
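For context on why the `--master` override helps: kube-proxy reads its apiserver address from the kubeconfig shipped in the `kube-proxy` ConfigMap, and the `--master` flag overrides that server URL. A sketch of the relevant part of that ConfigMap as I would expect it to look here (the `127.0.0.1:6443` server is inferred from the error above; the rest is the usual kubeadm layout, with contexts and users omitted):

```yaml
# kubectl -n kube-system get configmap kube-proxy -o yaml (excerpt, illustrative)
data:
  kubeconfig.conf: |
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://127.0.0.1:6443   # presumably what kube-proxy dials, matching the log above
      name: default
```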

jvleminc commented 3 years ago

We seem to have stumbled upon the same issue with 2.15/1.19 and indeed, this worked absolutely fine in Kubespray 2.14/K8s 1.18.

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

daugustin commented 3 years ago

/remove-lifecycle stale

Still an issue.

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

daugustin commented 3 years ago

/remove-lifecycle stale

Still an issue.

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue or PR with `/reopen`
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-ci-robot commented 2 years ago

@k8s-triage-robot: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/kubespray/issues/7180#issuecomment-1012518760):

>The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
>This bot triages issues and PRs according to the following rules:
>- After 90d of inactivity, `lifecycle/stale` is applied
>- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
>- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
>You can:
>- Reopen this issue or PR with `/reopen`
>- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
>- Offer to help out with [Issue Triage][1]
>
>Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
>/close
>
>[1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.