tcurdt closed this issue 5 years ago
i cannot reproduce this problem on an Ubuntu 17.10 x86_64 setup:
sudo kubeadm reset
sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo kubeadm init --skip-phases=control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16
execution continues fine after writing this file:
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
...
kubeadm version: &version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3-beta.0", GitCommit:"c6d339953bd4fd8c021a6b5fb46d7952b30be9f9", GitTreeState:"clean", BuildDate:"2019-02-02T02:09:01Z", GoVersion:"go1.11.1", Compiler:"gc", Platform:"linux/amd64"}
init succeeds:
...
You can now join any number of machines by running the following on each node
as root:
kubeadm join 192.168.0.102:6443 --token gzdxwi.7kvz73t7xjphd8fa --discovery-token-ca-cert-hash sha256:bb2a181b6cebd44482fb14a061acd7c2f3984425be0f5d5a52527ccda70aa0d3
some questions:
please note that we don't have test signal for ARM, so the support is experimental.
Hmm. What's the content of the file for you?
As for the questions:
What's the content of the file for you?
a properly populated kubeconfig file for the scheduler.
I am going through the code but so far I just don't see how it could fail writing after https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go#L93
Hmmmmmmmm. It worked on the other RPi
$ sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
$ sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g' /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g' /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo kubeadm init --v=1 --skip-phases=control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16
I0202 04:26:49.363801 3742 feature_gate.go:206] feature gates: &{map[]}
[init] Using Kubernetes version: v1.13.3
[preflight] Running pre-flight checks
I0202 04:26:49.364763 3742 checks.go:572] validating Kubernetes and kubeadm version
I0202 04:26:49.364880 3742 checks.go:171] validating if the firewall is enabled and active
I0202 04:26:49.409686 3742 checks.go:208] validating availability of port 6443
I0202 04:26:49.410032 3742 checks.go:208] validating availability of port 10251
I0202 04:26:49.410148 3742 checks.go:208] validating availability of port 10252
I0202 04:26:49.410256 3742 checks.go:283] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml
[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
I0202 04:26:49.410444 3742 checks.go:283] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml
[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
I0202 04:26:49.410539 3742 checks.go:283] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml
[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
I0202 04:26:49.410630 3742 checks.go:283] validating the existence of file /etc/kubernetes/manifests/etcd.yaml
I0202 04:26:49.410671 3742 checks.go:430] validating if the connectivity type is via proxy or direct
I0202 04:26:49.410761 3742 checks.go:466] validating http connectivity to first IP address in the CIDR
I0202 04:26:49.410829 3742 checks.go:466] validating http connectivity to first IP address in the CIDR
I0202 04:26:49.410883 3742 checks.go:104] validating the container runtime
I0202 04:26:49.862102 3742 checks.go:130] validating if the service is enabled and active
I0202 04:26:49.934970 3742 checks.go:332] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I0202 04:26:49.935189 3742 checks.go:332] validating the contents of file /proc/sys/net/ipv4/ip_forward
I0202 04:26:49.935285 3742 checks.go:644] validating whether swap is enabled or not
I0202 04:26:49.935387 3742 checks.go:373] validating the presence of executable ip
I0202 04:26:49.935512 3742 checks.go:373] validating the presence of executable iptables
I0202 04:26:49.935611 3742 checks.go:373] validating the presence of executable mount
I0202 04:26:49.935767 3742 checks.go:373] validating the presence of executable nsenter
I0202 04:26:49.935893 3742 checks.go:373] validating the presence of executable ebtables
I0202 04:26:49.936003 3742 checks.go:373] validating the presence of executable ethtool
I0202 04:26:49.936107 3742 checks.go:373] validating the presence of executable socat
I0202 04:26:49.936196 3742 checks.go:373] validating the presence of executable tc
I0202 04:26:49.936332 3742 checks.go:373] validating the presence of executable touch
I0202 04:26:49.936409 3742 checks.go:515] running all checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.04.0-ce. Latest validated version: 18.06
I0202 04:26:50.116640 3742 checks.go:403] checking whether the given node name is reachable using net.LookupHost
I0202 04:26:50.116718 3742 checks.go:613] validating kubelet version
I0202 04:26:50.434161 3742 checks.go:130] validating if the service is enabled and active
I0202 04:26:50.484312 3742 checks.go:208] validating availability of port 10250
I0202 04:26:50.484604 3742 checks.go:208] validating availability of port 2379
I0202 04:26:50.484717 3742 checks.go:208] validating availability of port 2380
I0202 04:26:50.484824 3742 checks.go:245] validating the existence and emptiness of directory /var/lib/etcd
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
I0202 04:26:50.913249 3742 checks.go:839] pulling k8s.gcr.io/kube-apiserver:v1.13.3
I0202 04:27:29.700381 3742 checks.go:839] pulling k8s.gcr.io/kube-controller-manager:v1.13.3
I0202 04:27:53.618297 3742 checks.go:839] pulling k8s.gcr.io/kube-scheduler:v1.13.3
I0202 04:28:10.541033 3742 checks.go:839] pulling k8s.gcr.io/kube-proxy:v1.13.3
I0202 04:28:19.680453 3742 checks.go:839] pulling k8s.gcr.io/pause:3.1
I0202 04:28:22.317787 3742 checks.go:839] pulling k8s.gcr.io/etcd:3.2.24
I0202 04:29:12.484547 3742 checks.go:839] pulling k8s.gcr.io/coredns:1.2.6
I0202 04:29:23.426057 3742 kubelet.go:71] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
I0202 04:29:23.958235 3742 kubelet.go:89] Starting the kubelet
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0202 04:29:24.414430 3742 certs.go:113] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [km01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.178.41]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0202 04:29:39.845477 3742 certs.go:113] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
I0202 04:30:06.572764 3742 certs.go:113] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.41 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.41 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I0202 04:30:40.419908 3742 certs.go:72] creating a new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I0202 04:30:50.639994 3742 kubeconfig.go:92] creating kubeconfig file for admin.conf
[kubeconfig] Writing "admin.conf" kubeconfig file
I0202 04:30:54.436081 3742 kubeconfig.go:92] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0202 04:31:00.520396 3742 kubeconfig.go:92] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0202 04:31:04.669076 3742 kubeconfig.go:92] creating kubeconfig file for scheduler.conf
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
I0202 04:31:07.085092 3742 local.go:60] [etcd] wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
I0202 04:31:07.085194 3742 waitcontrolplane.go:89] [wait-control-plane] Waiting for the API server to be healthy
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 93.513465 seconds
I0202 04:32:40.622238 3742 uploadconfig.go:114] [upload-config] Uploading the kubeadm ClusterConfiguration to a ConfigMap
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
I0202 04:32:40.758253 3742 uploadconfig.go:128] [upload-config] Uploading the kubelet component config to a ConfigMap
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
I0202 04:32:40.848825 3742 uploadconfig.go:133] [upload-config] Preserving the CRISocket information for the control-plane node
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "km01" as an annotation
[mark-control-plane] Marking the node km01 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node km01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: f36l6v.10hbkeh7af1vtane
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
I0202 04:32:42.041777 3742 clusterinfo.go:46] [bootstraptoken] loading admin kubeconfig
I0202 04:32:42.045796 3742 clusterinfo.go:54] [bootstraptoken] copying the cluster from admin.conf to the bootstrap kubeconfig
I0202 04:32:42.047641 3742 clusterinfo.go:66] [bootstraptoken] creating/updating ConfigMap in kube-public namespace
I0202 04:32:42.058852 3742 clusterinfo.go:80] creating the RBAC rules for exposing the cluster-info ConfigMap in the kube-public namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join 192.168.178.41:6443 --token f36l6v.10hbkeh7af1vtane --discovery-token-ca-cert-hash sha256:e38e0f37463915d5d30a846d4cdc0d2ea2b0abefed9c98ce692ab367555ec7ea
I'll try another fresh install on the other one.
Still the same on the other one. Crazy! The order in the log is slightly different.
$ sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
$ sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g' /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g' /etc/kubernetes/manifests/kube-apiserver.yaml
$ sudo kubeadm init --v=1 --skip-phases=control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16
I0202 05:06:45.782055 3351 feature_gate.go:206] feature gates: &{map[]}
[init] Using Kubernetes version: v1.13.3
[preflight] Running pre-flight checks
I0202 05:06:45.784710 3351 checks.go:572] validating Kubernetes and kubeadm version
I0202 05:06:45.785010 3351 checks.go:171] validating if the firewall is enabled and active
I0202 05:06:45.872786 3351 checks.go:208] validating availability of port 6443
I0202 05:06:45.873764 3351 checks.go:208] validating availability of port 10251
I0202 05:06:45.874075 3351 checks.go:208] validating availability of port 10252
I0202 05:06:45.874415 3351 checks.go:283] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml
[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
I0202 05:06:45.874950 3351 checks.go:283] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml
[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
I0202 05:06:45.875295 3351 checks.go:283] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml
[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
I0202 05:06:45.875571 3351 checks.go:283] validating the existence of file /etc/kubernetes/manifests/etcd.yaml
I0202 05:06:45.875709 3351 checks.go:430] validating if the connectivity type is via proxy or direct
I0202 05:06:45.875902 3351 checks.go:466] validating http connectivity to first IP address in the CIDR
I0202 05:06:45.876126 3351 checks.go:466] validating http connectivity to first IP address in the CIDR
I0202 05:06:45.876304 3351 checks.go:104] validating the container runtime
I0202 05:06:46.673342 3351 checks.go:130] validating if the service is enabled and active
I0202 05:06:46.811415 3351 checks.go:332] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I0202 05:06:46.811814 3351 checks.go:332] validating the contents of file /proc/sys/net/ipv4/ip_forward
I0202 05:06:46.811992 3351 checks.go:644] validating whether swap is enabled or not
I0202 05:06:46.812215 3351 checks.go:373] validating the presence of executable ip
I0202 05:06:46.812439 3351 checks.go:373] validating the presence of executable iptables
I0202 05:06:46.812607 3351 checks.go:373] validating the presence of executable mount
I0202 05:06:46.812907 3351 checks.go:373] validating the presence of executable nsenter
I0202 05:06:46.813138 3351 checks.go:373] validating the presence of executable ebtables
I0202 05:06:46.813358 3351 checks.go:373] validating the presence of executable ethtool
I0202 05:06:46.813630 3351 checks.go:373] validating the presence of executable socat
I0202 05:06:46.813826 3351 checks.go:373] validating the presence of executable tc
I0202 05:06:46.814069 3351 checks.go:373] validating the presence of executable touch
I0202 05:06:46.814219 3351 checks.go:515] running all checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.04.0-ce. Latest validated version: 18.06
I0202 05:06:47.048462 3351 checks.go:403] checking whether the given node name is reachable using net.LookupHost
I0202 05:06:47.048592 3351 checks.go:613] validating kubelet version
I0202 05:06:47.595796 3351 checks.go:130] validating if the service is enabled and active
I0202 05:06:47.651204 3351 checks.go:208] validating availability of port 10250
I0202 05:06:47.651472 3351 checks.go:208] validating availability of port 2379
I0202 05:06:47.651615 3351 checks.go:208] validating availability of port 2380
I0202 05:06:47.651727 3351 checks.go:245] validating the existence and emptiness of directory /var/lib/etcd
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
I0202 05:06:48.016166 3351 checks.go:839] pulling k8s.gcr.io/kube-apiserver:v1.13.3
I0202 05:07:31.930706 3351 checks.go:839] pulling k8s.gcr.io/kube-controller-manager:v1.13.3
I0202 05:07:53.557135 3351 checks.go:839] pulling k8s.gcr.io/kube-scheduler:v1.13.3
I0202 05:08:02.157734 3351 checks.go:839] pulling k8s.gcr.io/kube-proxy:v1.13.3
I0202 05:08:12.517983 3351 checks.go:839] pulling k8s.gcr.io/pause:3.1
I0202 05:08:15.114442 3351 checks.go:839] pulling k8s.gcr.io/etcd:3.2.24
I0202 05:09:22.376566 3351 checks.go:839] pulling k8s.gcr.io/coredns:1.2.6
I0202 05:09:32.810381 3351 kubelet.go:71] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
I0202 05:09:33.227122 3351 kubelet.go:89] Starting the kubelet
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0202 05:09:33.884529 3351 certs.go:113] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [km01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.178.43]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0202 05:10:16.286602 3351 certs.go:113] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
I0202 05:10:51.727753 3351 certs.go:113] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.43 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [km01 localhost] and IPs [192.168.178.43 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
I0202 05:11:37.842439 3351 certs.go:72] creating a new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I0202 05:11:53.841429 3351 kubeconfig.go:92] creating kubeconfig file for admin.conf
[kubeconfig] Writing "admin.conf" kubeconfig file
I0202 05:11:59.272573 3351 kubeconfig.go:92] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0202 05:12:04.958869 3351 kubeconfig.go:92] creating kubeconfig file for controller-manager.conf
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xaab708]
goroutine 1 [running]:
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.validateKubeConfig(0xfb953a, 0xf, 0xfc3e7a, 0x17, 0x24943f0, 0x68b, 0x7bc)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:236 +0x120
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.createKubeConfigFileIfNotExists(0xfb953a, 0xf, 0xfc3e7a, 0x17, 0x24943f0, 0x0, 0x26d0000)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:257 +0x90
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.createKubeConfigFiles(0xfb953a, 0xf, 0x27f38c0, 0x2a99c60, 0x1, 0x1, 0x0, 0x0)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:120 +0xf4
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.CreateKubeConfigFile(0xfc3e7a, 0x17, 0xfb953a, 0xf, 0x27f38c0, 0x7211a101, 0xb9bfcc)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:93 +0xe8
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases.runKubeConfigFile.func1(0xf76bc8, 0x25f4820, 0x0, 0x0)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/kubeconfig.go:155 +0x168
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1(0x26a6800, 0x0, 0x0)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:235 +0x160
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll(0x26b1270, 0x2a99d68, 0x25f4820, 0x0)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:416 +0x5c
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run(0x26b1270, 0x24, 0x2933db4)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:208 +0xc8
k8s.io/kubernetes/cmd/kubeadm/app/cmd.NewCmdInit.func1(0x2689b80, 0x249c5d0, 0x0, 0x5)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:141 +0xfc
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0x2689b80, 0x249c5a0, 0x5, 0x6, 0x2689b80, 0x249c5a0)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:760 +0x20c
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x2688140, 0x2689b80, 0x2688780, 0x281e3b0)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:846 +0x210
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(0x2688140, 0x240c0d8, 0x117dec0)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:794 +0x1c
k8s.io/kubernetes/cmd/kubeadm/app.Run(0x2494000, 0x0)
/workspace/anago-v1.13.3-beta.0.37+721bfa751924da/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:48 +0x1b0
main.main()
_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:29 +0x20
I tried this on CentOS; execution continued after [kubeconfig] and init succeeded.
@RA489 Thanks for trying - but this seems highly sensitive to the environment it runs on. It failed on an RPi3B+ but succeeded on an RPi3B.
It even worked on the RPi3B+ when I manually executed the init in the individual phases (see issue #1380).
@tcurdt @neolit123 I was not able to reproduce it in my environment.
After looking at the traces I can see the difference. It breaks here https://github.com/kubernetes/kubernetes/blob/v1.13.3/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go#L235, which means that it was able to load the configuration file.
However, on my machine it always exits here https://github.com/kubernetes/kubernetes/blob/v1.13.3/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go#L224, which means that the configuration file doesn't exist.
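To make the two code paths easier to compare, here is a simplified sketch of that branch written against the public client-go clientcmd helpers. The function name, the error message and the final context check are illustrative stand-ins, not the actual kubeadm implementation:

package kubeconfigsketch

import (
	"fmt"
	"os"
	"path/filepath"

	"k8s.io/client-go/tools/clientcmd"
	clientcmdapi "k8s.io/client-go/tools/clientcmd/api"
)

// createIfNotExists is an illustrative stand-in for kubeadm's
// createKubeConfigFileIfNotExists: a missing file is written out, while an
// already existing file (even a zero-byte one) is loaded and inspected.
func createIfNotExists(outDir, filename string, expected *clientcmdapi.Config) error {
	path := filepath.Join(outDir, filename)

	if _, err := os.Stat(path); os.IsNotExist(err) {
		// Path taken on machines where the bug does not reproduce:
		// there is no file yet, so a fresh kubeconfig is simply written.
		return clientcmd.WriteToFile(*expected, path)
	}

	// Path taken when a file is already present: it gets loaded and then
	// inspected, and an empty config can blow up here if its contexts and
	// clusters are dereferenced without nil checks.
	existing, err := clientcmd.LoadFromFile(path)
	if err != nil {
		return err
	}
	if _, ok := existing.Contexts[existing.CurrentContext]; !ok {
		return fmt.Errorf("%s exists but contains no usable current context", path)
	}
	return nil
}

In other words, the panicking path is only reached when a (possibly empty) .conf file already exists at that point, which matches the difference between the two traces.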
@tcurdt if you can still reproduce this issue, can you do the following:
sudo kubeadm reset
sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo ls /etc/kubernetes/
and show the output here
sudo cat /etc/kubernetes/*.conf
and show the output here
P.S. This is a very interesting issue. It would be great to find its root cause and fix it.
@bart0sh i suspect memory corruption on ARM - but this could be isolated to the RPI CPU. we've seen similar problems caused by the go compiler for ARM in older versions - e.g. illegal instructions.
i still have a plan to send one minor patch related to this.
@neolit123 it could be so, but it doesn't explain the fact that it was able to load the config file, does it?
it loads a zero-byte config, which is a valid operation, but interacting with that config from there causes a panic.
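For illustration, a minimal standalone sketch of that behaviour, using the client-go clientcmd package directly rather than kubeadm itself:

package main

import (
	"fmt"

	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Loading zero bytes is not an error: clientcmd hands back an empty
	// Config whose Clusters/Contexts/AuthInfos maps have no entries and
	// whose CurrentContext is "".
	config, err := clientcmd.Load(nil)
	fmt.Println(err) // <nil>

	// Looking up the (missing) current context returns a nil *Context, so
	// the field access below panics with a nil pointer dereference.
	ctx := config.Contexts[config.CurrentContext]
	fmt.Println(ctx.Cluster)
}

Running this prints <nil> for the load error and then panics on the last line, the same nil pointer dereference that shows up in the stack traces.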
@neolit123 where does the zero-byte config come from? I don't see it happening in my setup.
reproduced with empty config:
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
>>> runKubeConfigFile 2 scheduler.conf
>>> kubeConfigFileName: scheduler.conf
>>> createKubeConfigFileIfNotExists: /etc/kubernetes/scheduler.conf
>>>> validateKubeConfig: /etc/kubernetes/scheduler.conf &{ {false map[]} map[] map[] map[] map[]}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x108702b]
goroutine 1 [running]:
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.validateKubeConfig(0x1705755, 0xf, 0x1705215, 0xe, 0xc0000387e0, 0x0, 0x41e)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:240 +0x2cb
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.createKubeConfigFileIfNotExists(0x1705755, 0xf, 0x1705215, 0xe, 0xc0000387e0, 0x0, 0xc0005399b0)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:261 +0x18f
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.createKubeConfigFiles(0x1705755, 0xf, 0xc0001aefc0, 0xc0005b3830, 0x1, 0x1, 0x0, 0x0)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:121 +0x1f1
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.CreateKubeConfigFile(0x1705215, 0xe, 0x1705755, 0xf, 0xc0001aefc0, 0x0, 0x0)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:93 +0x130
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases.runKubeConfigFile.func1(0x16bef40, 0xc0003a6510, 0x0, 0x0)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/kubeconfig.go:157 +0x24f
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1(0xc000474f00, 0x0, 0x0)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:235 +0x1e9
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll(0xc00057e510, 0xc0005b3a90, 0xc0003a6510, 0x0)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:416 +0x6e
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run(0xc00057e510, 0x1, 0x1)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:208 +0x107
k8s.io/kubernetes/cmd/kubeadm/app/cmd.NewCmdInit.func1(0xc000382000, 0xc00037db80, 0x0, 0x4)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:142 +0x1c3
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0xc000382000, 0xc00037db00, 0x4, 0x4, 0xc000382000, 0xc00037db00)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:760 +0x2cc
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc000382500, 0xc000382000, 0xc0003aa000, 0xc0000c4540)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:846 +0x2fd
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(0xc000382500, 0xc00000c010, 0x18c9f40)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:794 +0x2b
k8s.io/kubernetes/cmd/kubeadm/app.Run(0xc000038240, 0x18b)
/home/ed/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:48 +0x202
main.main()
_output/local/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:29 +0x33
@neolit123 Do you have any idea why there is an empty config in @tcurdt's setup?
@bart0sh
Do you have any idea why there is an empty config in @tcurdt's setup?
from my earlier comment:
i suspect memory corruption on ARM - but this could be isolated to the RPI CPU.
we need to reproduce and debug this on a Raspberry Pi. if you don't have one, just leave this issue for now. AFAIK, ARM desktop machines are not affected by this bug, or at least we haven't seen reports.
@neolit123 Would it make sense to fix it by checking if the config.Contexts map has the expectedCtx/currentCtx before using it?
we can actually investigate it further with @tcurdt's help and find out where those empty or broken files come from.
Would it make sense to fix it by checking if config.Contexts map has the expectedCtx/currentCtx before using it?
yes, my idea was to apply a similar change across kubeadm (AFAIK we assume a valid config like that in more than one place). but as you can understand this will not fix the problem, only the panic.
please feel free to send a patch for the above if you'd like.
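A minimal sketch of the kind of guard being discussed; it is illustrative only (the real fix would live inside validateKubeConfig and the related helpers), and expectedCtx is just a placeholder for whatever context name the caller expects:

package kubeconfigsketch

import (
	"fmt"

	clientcmdapi "k8s.io/client-go/tools/clientcmd/api"
)

// clusterForContext refuses to dereference entries that are missing from the
// kubeconfig maps and returns a readable error instead of panicking.
func clusterForContext(config *clientcmdapi.Config, expectedCtx string) (*clientcmdapi.Cluster, error) {
	ctx, ok := config.Contexts[expectedCtx]
	if !ok || ctx == nil {
		return nil, fmt.Errorf("kubeconfig has no context named %q", expectedCtx)
	}
	cluster, ok := config.Clusters[ctx.Cluster]
	if !ok || cluster == nil {
		return nil, fmt.Errorf("kubeconfig has no cluster named %q", ctx.Cluster)
	}
	return cluster, nil
}

With a guard like this an empty or truncated file turns into an error message instead of a panic, although, as noted, it would not explain where the zero-byte file comes from in the first place.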
@bart0sh Have a look here https://github.com/kubernetes/kubeadm/issues/1380#issuecomment-459970197
...and also read the progress leading up to it.
Unfortunately I don't have the full output of sudo ls /etc/kubernetes/ and sudo cat /etc/kubernetes/*.conf. I would have to re-do that. But the interesting part from the initial issue was that /etc/kubernetes/scheduler.conf had a size of 0.
Of course there is no way to rule out ARM memory corruption - but given that it is very reproducible (at least on my RPi) and that it works when all phases are run separately, I'd rather bet on some kind of race condition. I just wouldn't expect memory corruption to be that reproducible. But I don't know the Go compiler/runtime well enough to make an educated guess.
The time I can spend on this is limited, but I am happy to help dig into this further.
@tcurdt can you please run kubeadm through something like delve or your debugger of choice: https://github.com/go-delve/delve
also please see if your RPi distro has valgrind and run the binary through that.
@tcurdt Yes, the issue is that after kubeadm reset and kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16, zero-size .conf files appear in /etc/kubernetes. In my setup I don't see them there. So the question is: when are they created? kubeadm reset should remove all of them, I believe. Can you confirm that they're created by kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16?
Works for me:
sudo kubeadm init phase control-plane all --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.86.47
sudo sed -i 's/failureThreshold: 8/failureThreshold: 20/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/initialDelaySeconds: [0-9]\+/initialDelaySeconds: 720/' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo kubeadm init --skip-phases=control-plane --token-ttl=0 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.86.47 --ignore-preflight-errors=all --dry-run
sudo cp -dpR /tmp/kubeadm-init-dryrun707341788/controller-manager.conf /etc/kubernetes/.
sudo cp -dpR /tmp/kubeadm-init-dryrun707341788/scheduler.conf /etc/kubernetes/.
sudo cp -dpR /tmp/kubeadm-init-dryrun707341788/ca.key /etc/kubernetes/pki/.
sudo cp -dpR /tmp/kubeadm-init-dryrun707341788/ca.crt /etc/kubernetes/pki/.
sudo kubeadm init --skip-phases=control-plane --token-ttl=0 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.86.47 --ignore-preflight-errors=all
@gbailey46 can you please also add the exact environment you ran this on? Otherwise "works for me" is not exactly helpful.
Rpi3B
pi@pi3:~ $ sudo cat /etc/os-release
PRETTY_NAME="Raspbian GNU/Linux 9 (stretch)"
NAME="Raspbian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
pi@pi3:~ $ uname -a
Linux pi3 4.14.79-v7+ #1159 SMP Sun Nov 4 17:50:20 GMT 2018 armv7l GNU/Linux
pi@pi3:~ $
@gbailey46 does it consistently work for you - e.g. multiple consecutive times?
@tcurdt what differences do you have with the above setup?
Well, it worked for me on an RPi3B, too. It did not work on an RPi3B+.
Yes, it works repeatedly. I realised that the controller-manager.conf and scheduler.conf files were 0 bytes and that init would use an existing file (and CA) if one already existed, hence potentially skipping the file create/write that seems to segfault. On a whim I tried --dry-run and it completed without a segfault. So I used the relevant CA and .conf files from the --dry-run output.
so it technically exhibits the same SIGSEGV behavior.
i also see the CPUs for the two boards are the same; https://www.datenreise.de/en/raspberry-pi-3b-and-3b-in-comparison/
can someone test on a non-ARM Cortex-A53 board (if there even is such an RPi)?
also, we still need someone to help debug the root of the problem.
I was out of action the past few days, but it's on my todo list to have another, closer look.
@gbailey46
I realised that the controller-manager.conf and scheduler.conf file were 0 bytes
Can you tell when they were created? Was it after running kubeadm init phase control-plane all?
They are created when you execute:
sudo kubeadm init --skip-phases=control-plane --token-ttl=0 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.86.47 --ignore-preflight-errors=all
@gbailey46 thanks. That's very interesting. Looks like a race condition to me. I don't see where in the code that could happen; I'll look again.
@neolit123 I was just trying to give it another shot, but apparently delve does not support ARM and gdb also seems to give problems. I did get valgrind installed though.
Anyone willing to pair on this via IRC/discord/whatever?
hi, we are prepping for the 1.14 release and i won't have time anytime soon. but please do post updates.
@neolit123 too bad. Someone else? Some other suggestion for a debugger?
In theory this should just work:
sudo kubeadm reset
sudo kubeadm init --pod-network-cidr 10.244.0.0/16
So shall I run this through valgrind:
sudo kubeadm reset
sudo valgrind --leak-check=yes kubeadm init --pod-network-cidr 10.244.0.0/16
or is the primary objective to find out when the file gets created with size 0? In that case, should I run through all the phases and list the contents?
if the debuggers are not very useful i would start adding fmt.Print(...) calls in a lot of places, until i find where/why that config ends up being zero.
@tcurdt
Someone else?
I'd be happy to help you with this. I'm Ed@kubernetes.slack.com
Big thanks to @bart0sh. Unfortunately we could no longer reproduce it, neither with the current version (maybe I should have done a full re-install instead of just a reset) nor with the latest master. I hate it when bugs just "disappear", but well - closing for now. Thanks for all the help.
Is this a BUG REPORT or FEATURE REQUEST?
BUG REPORT
Versions
kubeadm version (use kubeadm version):
Environment:
Kubernetes version (use kubectl version):
Cloud provider or hardware configuration: RPi3B+
OS (e.g. from /etc/os-release):
Kernel (e.g. uname -a):
Others:
What happened?
I was trying to work around the race conditions mentioned in #413 and #1380, executing the kubeadm init in phases. Instead it crashed on the second call to init.
What you expected to happen?
I should see the join information.
How to reproduce it (as minimally and precisely as possible)?
Fresh install of hypriotos-rpi-v1.9.0, then:
Anything else we need to know?
This is the output with the panic information:
It seems like some nil checks are missing: https://github.com/kubernetes/kubernetes/blob/v1.13.3/cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go#L236
But the kubeconfig file looks OK to me.