alban opened this issue 6 years ago
After a second attempt, it works.
I get this timeout just as @alban described, except it's reproducible every time.
$ kube-spawn start
Warning: kube-proxy could crash due to insufficient nf_conntrack hashsize.
setting nf_conntrack hashsize to 131072...
making iptables FORWARD chain defaults to ACCEPT...
new poolSize to be : 5490739200
Starting 3 nodes in cluster default ...
Waiting for machine kube-spawn-default-worker-naz6fc to start up ...
Waiting for machine kube-spawn-default-master-yz3twq to start up ...
Waiting for machine kube-spawn-default-worker-u5fu6n to start up ...
Failed to start machine kube-spawn-default-master-yz3twq: timeout waiting for "kube-spawn-default-master-yz3twq" to start
Failed to start machine kube-spawn-default-worker-naz6fc: timeout waiting for "kube-spawn-default-worker-naz6fc" to start
Failed to start cluster: starting the cluster didn't succeed
Note:
For debugging, is there anywhere this tool writes its logs? (See the journal sketch after the environment details below.)
/dev/loop2 btrfs 40G 1.7G 39G 5% /var/lib/machines
OR
/dev/sda4 btrfs 56G 1.7G 54G 4% /var/lib/machines
- `systemd-container-238-10.git438ac26.fc28.x86_64`
- `qemu-img-2.11.2-4.fc28.x86_64`
- machinectl limit to 40G with loopback mount (as evident in the df output above too):
PoolPath=/var/lib/machines PoolUsage=1866190848 PoolLimit=42949672960
- OS: `Linux 4.18.17-200.fc28.x86_64 GNU/Linux`
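On the logging question: a minimal sketch for digging into machine startup, assuming kube-spawn drives the nodes through systemd-machined/systemd-nspawn (which the machinectl pool values above suggest):

# List machines known to machined and check one of them
sudo machinectl list
sudo machinectl status kube-spawn-default-master-yz3twq
# machined's own log, which records start/stop events and failures
sudo journalctl -u systemd-machined -e
# Journal of a machine itself (only works once the machine is running)
sudo journalctl -M kube-spawn-default-master-yz3twq -e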
Ok, never mind. All I had to do was create /var/lib/kube-spawn/clusters (from the create step; earlier it was just an empty trail of subdirectories) and it works. Jeez.
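If the fix really is just making sure that directory exists before start, it amounts to (a sketch; flags assumed, path taken from the comment above):

sudo mkdir -p /var/lib/kube-spawn/clusters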
Seems to be related to #325.
Sure, except I didn't destroy it first. I got the timeout from the start,
as per https://github.com/kinvolk/kube-spawn/issues/282#issuecomment-437786972 (so to speak, right after creating the cluster),
then resolved the issue with https://github.com/kinvolk/kube-spawn/issues/282#issuecomment-437790311.
Apologies if the order in step 2 of the resolution comment created any confusion.
Also, I can't reproduce it now. :/
To Reproduce:
ssh -i ~/.ssh/$KEY fedora@$IP
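The variables used in these steps are placeholders. For a self-contained run they could be set like this; KEY, IP, and KUBE_SPAWN_VERSION are illustrative values only, while KUBERNETES_VERSION matches the v1.11.0 seen in the kubeadm log further down:

export KEY=id_rsa                 # hypothetical: your SSH key name
export IP=203.0.113.10            # hypothetical: the Fedora box's address
export KUBERNETES_VERSION=v1.11.0
export KUBE_SPAWN_VERSION=master  # hypothetical: tag or branch to build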
Workarounds
sudo setenforce 0
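To double-check SELinux is actually in permissive mode before retrying (plain Fedora tooling, nothing kube-spawn-specific):

getenforce   # should print "Permissive" after the setenforce above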
Install dependencies
sudo dnf install -y btrfs-progs git go iptables libselinux-utils polkit qemu-img systemd-container make docker
mkdir go
export GOPATH=$HOME/go
curl -fsSL -O https://github.com/containernetworking/plugins/releases/download/v0.6.0/cni-plugins-amd64-v0.6.0.tgz
sudo mkdir -p /opt/cni/bin
sudo tar -C /opt/cni/bin -xvf cni-plugins-amd64-v0.6.0.tgz
sudo curl -Lo /usr/local/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/${KUBERNETES_VERSION}/bin/linux/amd64/kubectl
sudo chmod +x /usr/local/bin/kubectl
Compile and install
mkdir -p $GOPATH/src/github.com/kinvolk
cd $GOPATH/src/github.com/kinvolk
git clone https://github.com/kinvolk/kube-spawn.git
cd kube-spawn/
git checkout $KUBE_SPAWN_VERSION
make DOCKERIZED=n
sudo make install
First attempt to use kube-spawn
cd
sudo -E kube-spawn create --kubernetes-version $KUBERNETES_VERSION
sudo -E kube-spawn start --nodes=3
sudo -E kube-spawn destroy
Workaround for "no space left on device": https://github.com/kinvolk/kube-spawn/issues/281
sudo umount /var/lib/machines
sudo qemu-img resize -f raw /var/lib/machines.raw $((10*1024*1024*1024))
sudo mount -t btrfs -o loop /var/lib/machines.raw /var/lib/machines
sudo btrfs filesystem resize max /var/lib/machines
sudo btrfs quota disable /var/lib/machines
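A quick way to confirm the pool actually grew after remounting (assuming the same loopback layout as in the df output near the top):

df -h /var/lib/machines
sudo btrfs filesystem show /var/lib/machines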
Start kube-spawn
cd
sudo -E kube-spawn create --kubernetes-version $KUBERNETES_VERSION
sudo -E kube-spawn start --nodes=3
Download of https://alpha.release.flatcar-linux.net/amd64-usr/current/flatcar_developer_container.bin.bz2 complete.
Created new local image 'flatcar'.
Operation completed successfully.
Exiting.
nf_conntrack module is not loaded: stat /sys/module/nf_conntrack/parameters/hashsize: no such file or directory
Warning: nf_conntrack module is not loaded.
loading nf_conntrack module...
making iptables FORWARD chain defaults to ACCEPT...
setting iptables rule to allow CNI traffic...
Starting 3 nodes in cluster default ...
Waiting for machine kube-spawn-default-worker-fjxan9 to start up ...
Waiting for machine kube-spawn-default-master-5y7clq to start up ...
Waiting for machine kube-spawn-default-worker-2ujr2f to start up ...
Started kube-spawn-default-worker-2ujr2f
Bootstrapping kube-spawn-default-worker-2ujr2f ...
Started kube-spawn-default-master-5y7clq
Bootstrapping kube-spawn-default-master-5y7clq ...
Cluster "default" started
Failed to start machine kube-spawn-default-worker-fjxan9: timeout waiting for "kube-spawn-default-worker-fjxan9" to start
Note: kubeadm init can take several minutes.
master-5y7clq
I0630 14:22:29.999557 380 feature_gate.go:230] feature gates: &{map[]}
[init] using Kubernetes version: v1.11.0
[preflight] running pre-flight checks
[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
[WARNING FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[WARNING FileExisting-crictl]: crictl not found in system path
I0630 14:22:30.050775 380 kernel_validator.go:81] Validating kernel version
I0630 14:22:30.051083 380 kernel_validator.go:96] Validating kernel config
[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.05.0-ce. Max validated version: 17.03
[WARNING Hostname]: hostname "kube-spawn-default-master-5y7clq" could not be reached
[WARNING Hostname]: hostname "kube-spawn-default-master-5y7clq" lookup kube-spawn-default-master-5y7clq on 8.8.8.8:53: no such host
[preflight/images] Pulling images required for setting up a Kubernetes cluster
[preflight/images] This might take a minute or two, depending on the speed of your internet connection
[preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[preflight] Activating the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [kube-spawn-default-master-5y7clq kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.22.0.3]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [kube-spawn-default-master-5y7clq localhost] and IPs [127.0.0.1 ::1]
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [kube-spawn-default-master-5y7clq localhost] and IPs [10.22.0.3 127.0.0.1 ::1]
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] this might take a minute or longer if the control plane images have to be pulled
[apiclient] All control plane components are healthy after 42.001677 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.11" in namespace kube-system with the configuration for the kubelets in the cluster
[markmaster] Marking the node kube-spawn-default-master-5y7clq as master by adding the label "node-role.kubernetes.io/master=''"
[markmaster] Marking the node kube-spawn-default-master-5y7clq as master by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "kube-spawn-default-master-5y7clq" as an annotation
[bootstraptoken] using token: 1o71nu.v7s48wncryhbdmm7
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node as root:

  kubeadm join 10.22.0.3:6443 --token 1o71nu.v7s48wncryhbdmm7 --discovery-token-ca-cert-hash sha256:c8ac2337adc7ed01725bed7d78605661dc759257fce213838f1cb89486fe263c

I0630 14:23:47.569329 1140 feature_gate.go:230] feature gates: &{map[]}
aaaaaa.bbbbbbbbbbbbbbbb
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.extensions/weave-net created
worker-2ujr2f
[preflight] running pre-flight checks
[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh] or no builtin kernel ipvs support: map[ip_vs:{} ip_vs_rr:{} ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{}]
you can solve this problem with following methods:

More debug info:
The third machine does not exist anymore?
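One way to check (a sketch with plain machinectl/journalctl, nothing kube-spawn-specific): a machine that timed out during startup may be gone from the running list but can still leave traces in machined's journal and in the image pool.

sudo machinectl list                      # running machines only
sudo machinectl list-images               # images under /var/lib/machines
sudo journalctl -u systemd-machined -e    # start/stop/failure events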