kubernetes / kubeadm

Aggregator for issues filed against kubeadm

Multi master and private network #1647

Closed qw1mb0 closed 5 years ago

qw1mb0 commented 5 years ago

Is this a request for help?

Yes

If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.

What keywords did you search in kubeadm issues before filing this one?

kubeadm private network, multi master specific network interface

Versions

kubeadm version (use kubeadm version):

kubeadm version: &version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:37:41Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}

Environment:

What happened?

I have 3 nodes with default gateway through the public network and a second network interface for the internal network.

I want to install Kubernetes 1.15 so that etcd and the whole control plane communicate with each other over the private network. I have 3 masters, and this is what I tried.

On the first master I created this config:

apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.15.0
controlPlaneEndpoint: "apiserver:6444"
networking:
  podSubnet: "10.244.0.0/16"
  serviceSubnet: "192.168.0.0/16"
controllerManager:
  extraArgs:
    bind-address: 10.135.71.30
    address: 10.135.71.30
scheduler:
  extraArgs:
    address: 10.135.71.30
apiServer:
  extraArgs:
    advertise-address: 10.135.71.30
    bind-address: 10.135.71.30
  certSANs:
  - "10.135.71.30"
  - "10.135.131.33"
  - "10.135.169.182"
  - "apiserver"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDNS:
- 192.168.0.10
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.135.71.30
  bindPort: 6443

Then I ran init:

kubeadm init --config kubeadm-config.yaml --upload-certs

Init log:

[init] Using Kubernetes version: v1.15.0
[preflight] Running pre-flight checks
        [WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kube-master-01 localhost] and IPs [10.135.71.30 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kube-master-01 localhost] and IPs [10.135.71.30 127.0.0.1 ::1]
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kube-master-01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local apiserver apiserver] and IPs [192.168.0.1 10.135.71.30 10.135.71.30 10.135.131.33 10.135.169.182]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 17.503559 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.15" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
05e67d8a660f265a37350eee78d31a3a3deb417a62d9ba2fec4810a6c728fac2
[mark-control-plane] Marking the node kube-master-01 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kube-master-01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: jrbg26.4pxwtyxtiia17j1f
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join apiserver:6444 --token jrbg26.4pxwtyxtiia17j1f \
    --discovery-token-ca-cert-hash sha256:562900a8b0a420798b6501b248318214fdd3060374d6849e8573d779d054f27a \
    --experimental-control-plane --certificate-key 05e67d8a660f265a37350eee78d31a3a3deb417a62d9ba2fec4810a6c728fac2

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use 
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join apiserver:6444 --token jrbg26.4pxwtyxtiia17j1f \
    --discovery-token-ca-cert-hash sha256:562900a8b0a420798b6501b248318214fdd3060374d6849e8573d779d054f27a 

I can see that everything is listening on the intended network interface:

root@kube-master-01:~# ss -tlpn | grep LISTEN | grep 'kube-proxy\|kubelet\|kube-scheduler\|kube-apiserver\|etcd\|kube-controller\|haproxy'
LISTEN   0         128               127.0.0.1:10249            0.0.0.0:*        users:(("kube-proxy",pid=10426,fd=11))                                         
LISTEN   0         128            10.135.71.30:10250            0.0.0.0:*        users:(("kubelet",pid=9678,fd=30))                                             
LISTEN   0         128            10.135.71.30:10251            0.0.0.0:*        users:(("kube-scheduler",pid=10121,fd=3))                                      
LISTEN   0         128            10.135.71.30:6443             0.0.0.0:*        users:(("kube-apiserver",pid=10113,fd=3))                                      
LISTEN   0         128               127.0.0.1:2379             0.0.0.0:*        users:(("etcd",pid=10046,fd=6))                                                
LISTEN   0         128            10.135.71.30:2379             0.0.0.0:*        users:(("etcd",pid=10046,fd=5))                                                
LISTEN   0         128            10.135.71.30:10252            0.0.0.0:*        users:(("kube-controller",pid=10048,fd=3))                                     
LISTEN   0         128            10.135.71.30:2380             0.0.0.0:*        users:(("etcd",pid=10046,fd=3))                                                
LISTEN   0         128            10.135.71.30:6444             0.0.0.0:*        users:(("haproxy",pid=1007,fd=7))                                              
LISTEN   0         128            10.135.71.30:10257            0.0.0.0:*        users:(("kube-controller",pid=10048,fd=5))                                     
LISTEN   0         128               127.0.0.1:10259            0.0.0.0:*        users:(("kube-scheduler",pid=10121,fd=5))                                      
LISTEN   0         128               127.0.0.1:44083            0.0.0.0:*        users:(("kubelet",pid=9678,fd=7))                                              
LISTEN   0         128               127.0.0.1:10248            0.0.0.0:*        users:(("kubelet",pid=9678,fd=27))                                             
LISTEN   0         128                       *:10256                  *:*        users:(("kube-proxy",pid=10426,fd=10))  

Then I tried to join the second master with this command:

  kubeadm join apiserver:6444 --token jrbg26.4pxwtyxtiia17j1f \
    --discovery-token-ca-cert-hash sha256:562900a8b0a420798b6501b248318214fdd3060374d6849e8573d779d054f27a \
    --experimental-control-plane --certificate-key 05e67d8a660f265a37350eee78d31a3a3deb417a62d9ba2fec4810a6c728fac2 --apiserver-advertise-address 10.135.131.33

kubeadm join log output:

Flag --experimental-control-plane has been deprecated, use --control-plane instead
[preflight] Running pre-flight checks
        [WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kube-master-02 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local apiserver apiserver] and IPs [192.168.0.1 10.135.131.33 10.135.71.30 10.135.131.33 10.135.169.182]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kube-master-02 localhost] and IPs [10.135.131.33 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kube-master-02 localhost] and IPs [10.135.131.33 127.0.0.1 ::1]
[certs] Generating "front-proxy-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node kube-master-02 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kube-master-02 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

I can see that everything is listening on the intended network interface:

root@kube-master-02:~#  ss -tlpn | grep LISTEN | grep 'kube-proxy\|kubelet\|kube-scheduler\|kube-apiserver\|etcd\|kube-controller\|haproxy'
LISTEN   0         128               127.0.0.1:10249            0.0.0.0:*        users:(("kube-proxy",pid=5774,fd=12))                                          
LISTEN   0         128           10.135.131.33:10250            0.0.0.0:*        users:(("kubelet",pid=5350,fd=25))                                             
LISTEN   0         128               127.0.0.1:2379             0.0.0.0:*        users:(("etcd",pid=6486,fd=6))                                                 
LISTEN   0         128           10.135.131.33:2379             0.0.0.0:*        users:(("etcd",pid=6486,fd=5))                                                 
LISTEN   0         128           10.135.131.33:2380             0.0.0.0:*        users:(("etcd",pid=6486,fd=3))                                                 
LISTEN   0         128           10.135.131.33:6444             0.0.0.0:*        users:(("haproxy",pid=1046,fd=7))                                              
LISTEN   0         128               127.0.0.1:35799            0.0.0.0:*        users:(("kubelet",pid=5350,fd=10))                                             
LISTEN   0         128               127.0.0.1:10248            0.0.0.0:*        users:(("kubelet",pid=5350,fd=26))                                             
LISTEN   0         128                       *:10256                  *:*        users:(("kube-proxy",pid=5774,fd=10))  

Everything appears to work, but only the first master is in the endpoints list:

# kubectl get ep
NAME         ENDPOINTS           AGE
kubernetes   10.135.71.30:6443   2m16s

In the kube-apiserver logs on kube-master-02 I see this error:

error: failed to create listener: failed to listen on 10.135.71.30:6443: listen tcp 10.135.71.30:6443: bind: cannot assign requested address

If, before joining the second master, I edit the ConfigMap:

kubectl -n kube-system edit configmap kubeadm-config

And replace IP-address for:

and then join the second master:

  kubeadm join apiserver:6444 --token c5l7yy.orzgeibdni1iuin6 \
    --discovery-token-ca-cert-hash sha256:9890d3943d7883de7d62883b1390a78fa687d499a72cd0b814a28fa7ce931e14 \
    --experimental-control-plane --certificate-key 1e39759da744b533d322f660a110b4a75d5944effc6b1aa07080ff8b0226c059 --apiserver-advertise-address 10.135.131.33

Everything starts to work correctly. Is this the right way to do it?

What you expected to happen?

neolit123 commented 5 years ago

kubeadm is supposed to use the field ControlPlaneEndpoint in ClusterConfiguration, which should be a load balancer in front of your API servers. so you need to set up the topology correctly first. you then join secondary control plane nodes to this LB ip:port.

by hardcoding a bind address in the ClusterConfiguration you will get

error: failed to create listener: failed to listen on 10.135.71.30:6443: listen tcp 10.135.71.30:6443: bind: cannot assign requested address

on the secondary CP nodes, because they will try to bind to the same address as the already existing control plane node.

And replace IP-address for: ClusterConfiguration.apiServer.extraArgs.advertise-address

this is still not how it should work, but what value do you replace it with before joining the second CP node?

SataQiu commented 5 years ago

If you want to specify the bind-address of the newly added control-plane node, you can try kubeadm join --config <KUBEADM-CONFIG FILE>. @qw1mb0

Maybe we should ignore the bind-address information in the kubeadm-config ConfigMap for newly joined control-plane nodes? @neolit123 Like this:

// fetchInitConfiguration reads the cluster configuration from the kubeadm-config ConfigMap
func fetchInitConfiguration(tlsBootstrapCfg *clientcmdapi.Config) (*kubeadmapi.InitConfiguration, error) {
    // creates a client to access the cluster using the bootstrap token identity
    tlsClient, err := kubeconfigutil.ToClientSet(tlsBootstrapCfg)
    if err != nil {
        return nil, errors.Wrap(err, "unable to access the cluster")
    }

    // Fetches the init configuration
    initConfiguration, err := configutil.FetchInitConfigurationFromCluster(tlsClient, os.Stdout, "preflight", true)
    if err != nil {
        return nil, errors.Wrap(err, "unable to fetch the kubeadm-config ConfigMap")
    }

    // ignore the bind-address info in `kubeadm-config` ConfigMap for new joined control-plane node
    delete(initConfiguration.ClusterConfiguration.APIServer.ExtraArgs, "advertise-address")
    delete(initConfiguration.ClusterConfiguration.APIServer.ExtraArgs, "bind-address")
    delete(initConfiguration.ClusterConfiguration.ControllerManager.ExtraArgs, "address")
    delete(initConfiguration.ClusterConfiguration.ControllerManager.ExtraArgs, "bind-address")
    delete(initConfiguration.ClusterConfiguration.Scheduler.ExtraArgs, "address")
    delete(initConfiguration.ClusterConfiguration.Scheduler.ExtraArgs, "bind-address")

    return initConfiguration, nil
}

What do you think about it?
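The filtering idea in the snippet above can also be tried in isolation. This is only a self-contained sketch under stated assumptions: plain `map[string]string` values stand in for the kubeadm `ExtraArgs` fields, and the names `nodeSpecificFlags` and `withoutNodeSpecific` are hypothetical, not part of kubeadm. Unlike the proposal, it returns a filtered copy instead of deleting from the fetched configuration.

```go
package main

import "fmt"

// nodeSpecificFlags lists the flags from this thread whose values are tied
// to a single node's IP address (hypothetical name, for illustration only).
var nodeSpecificFlags = []string{"advertise-address", "bind-address", "address"}

// withoutNodeSpecific returns a copy of args with the node-specific flags
// removed. The input map is left unchanged.
func withoutNodeSpecific(args map[string]string) map[string]string {
	out := make(map[string]string, len(args))
	for k, v := range args {
		out[k] = v
	}
	for _, f := range nodeSpecificFlags {
		// delete is a no-op when the key is absent.
		delete(out, f)
	}
	return out
}

func main() {
	// Values taken from the ClusterConfiguration in this issue, plus one
	// flag that is not node-specific and should survive the filter.
	apiServer := map[string]string{
		"advertise-address":  "10.135.71.30",
		"bind-address":       "10.135.71.30",
		"authorization-mode": "Node,RBAC",
	}
	fmt.Println("filtered:", withoutNodeSpecific(apiServer))
	fmt.Println("original still has", len(apiServer), "entries")
}
```

Working on a copy keeps the stored cluster-wide configuration intact and only changes what the joining node would consume. Whether that fits kubeadm's actual design is not settled in this thread.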

SataQiu commented 5 years ago

/assign

qw1mb0 commented 5 years ago

this is still not how it should work, but what value do you replace it to, before joining the second CP node?

I replaced the addresses of the first master with the address of the second master

If you want to specify the bind-address of the newly added control-plane node, you can try kubeadm join --config <KUBEADM-CONFIG FILE>. @qw1mb0

Is there an example config for joining a control-plane node?

Maybe we should ignore the bind-address information in kubeadm-config ConfigMap for new joined control-plane node ? @neolit123

This is a good idea.

neolit123 commented 5 years ago

you still have to use a load-balancer. please have a look here https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/ha-topology/

neolit123 commented 5 years ago

Maybe we should ignore the bind-address information in kubeadm-config ConfigMap for new joined control-plane node ? @neolit123

perhaps, but i don't think we should delete. commented on the PR.

neolit123 commented 5 years ago

closing as related to user setup and not a kubeadm bug.

please see the comment i made here, for further details on controlPlaneEndpoint: https://github.com/kubernetes/kubeadm/issues/1611#issuecomment-517942714