jimmycuadra closed this issue 7 years ago.
Exact same thing happening to me on Ubuntu 16.04.02, both GCE and local VMWare installations, Docker version 1.12.6, kernel 4.8.0-44-generic 47~16.04.1-Ubuntu SMP.
The kubelet log shows a warning about missing /etc/cni/net.d before the error that we see in jimmycuadra's report:
Mar 29 04:43:25 instance-1 kubelet[6800]: W0329 04:43:25.763117 6800 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
Mar 29 04:43:25 instance-1 kubelet[6800]: E0329 04:43:25.763515 6800 kubelet.go:2067] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Same issue on Ubuntu AWS VM. Docker 1.12.5
root@ip-10-43-0-20:~# kubeadm version
kubeadm version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T16:24:30Z", GoVersion:"go1.7.5"
root@ip-10-43-0-20:~# uname -a
Linux ip-10-43-0-20 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@ip-10-43-0-20:~# kubeadm init --config cfg.yaml
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.6.0
[init] Using Authorization mode: RBAC
[init] WARNING: For cloudprovider integrations to work --cloud-provider must be set for all kubelets in the cluster. (/etc/systemd/system/kubelet.service.d/10-kubeadm.conf should be edited for this purpose)
[preflight] Running pre-flight checks
[preflight] Starting the kubelet service
[certificates] Generated CA certificate and key.
[certificates] Generated API server certificate and key.
[certificates] API Server serving cert is signed for DNS names [ip-10-43-0-20 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.43.0.20]
[certificates] Generated API server kubelet client certificate and key.
[certificates] Generated service account token signing key and public key.
[certificates] Generated front-proxy CA certificate and key.
[certificates] Generated front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 16.531681 seconds
[apiclient] Waiting for at least one node to register and become ready
[apiclient] First node has registered, but is not ready yet
[apiclient] First node has registered, but is not ready yet
[apiclient] First node has registered, but is not ready yet
++ the same issue (Ubuntu 16.04.1)
Same thing here on Ubuntu 16.04
On CentOS 7, I downgraded the kubelet to 1.5.4. That solved it for me. It seems like the ready check works differently in the 1.6.0 kubelet.
Same issue on CentOS 7 on bare metal x64 machine, since upgrading to k8s 1.6.0
Same issue on Ubuntu 16.04
Same issue on Ubuntu 16.04; manually downgrading the kubelet package solved the issue:
# apt install kubelet=1.5.6-00
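If you downgrade this way, it may also be worth holding the package so a later apt upgrade does not bump it back to 1.6.0 (a sketch, assuming an apt-based install):
# apt-mark hold kubelet
# apt-mark unhold kubelet    (once a fixed release is out)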
@ctrlaltdel it didn't work for me.
I suspect this is a kubelet issue. It shouldn't mark the node as not ready when CNI is unconfigured; only pods that require CNI should be marked as not ready.
@jbeda Do you know when this issue will be resolved?
@kristiandrucker -- no -- still figuring out what is going on. Need to root cause it first.
@jbeda OK, but after the issue is resolved, then what? Rebuild the kubelet from source?
@kristiandrucker This'll have to go out in a point release of k8s if it is a kubelet issue.
I suspect that https://github.com/kubernetes/kubernetes/pull/43474 is the root cause. Going to file a bug and follow up with the network people.
@dcbw You around?
Looks like the issue is that a DaemonSet is not scheduled to nodes that have the NetworkReady:false condition, because the checks for scheduling pods are not fine-grained enough. We need to fix that; a pod that is hostNetwork:true should be scheduled on a node that is NetworkReady:false, but a hostNetwork:false pod should not.
As a workaround, does adding the scheduler.alpha.kubernetes.io/critical-pod
annotation on your DaemonSet make things work again?
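For reference, a minimal sketch of where that annotation would go in a DaemonSet manifest (the names here are illustrative, not taken from any particular add-on; hostNetwork is shown because host-network pods are the ones that should still schedule):
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: example-net-plugin
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        app: example-net-plugin
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
    spec:
      hostNetwork: true
      containers:
      - name: example-net-plugin
        image: example.io/net-plugin:latest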
@janetkuo @lukaszo can you triage the DS behavior?
There is also an ongoing discussion in #sig-network on slack, btw.
Same issue CentOS 7 x64
@prapdm this appears to be independent of what distro you are running.
CentOS Linux release 7.3.1611 (Core)
I've tried it on one node with Ubuntu 16.04. It hangs with the "not ready yet" message. I also manually created the flannel DaemonSet, but in my case it scheduled one pod without any problem. The daemon pod itself went into CrashLoopBackOff with this error: E0329 22:57:03.065651 1 main.go:127] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-z3xgn': the server does not allow access to the requested resource (get pods kube-flannel-ds-z3xgn)
I will try on CentOS as well, but I don't think the DaemonSet is to blame here; kubeadm hangs regardless.
That is an RBAC permission error.
@jimmycuadra I've just noticed that you are running it on a Raspberry Pi, which has an ARM processor.
For the flannel DaemonSet you have:
beta.kubernetes.io/arch: amd64
but your node is labeled with:
beta.kubernetes.io/arch=arm
So the DaemonSet cannot launch a pod on this node; just change the node selector (see the sketch below) and it will work. You will still get the RBAC permission error, but maybe @mikedanese can tell you how to fix that, because I don't know.
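For example, one way to switch the selector on an already-created DaemonSet (a sketch, assuming it is named kube-flannel-ds in kube-system):
kubectl -n kube-system patch daemonset kube-flannel-ds --type merge -p '{"spec":{"template":{"spec":{"nodeSelector":{"beta.kubernetes.io/arch":"arm"}}}}}'
Alternatively, edit the manifest (e.g. sed "s/amd64/arm/g") before creating it, as shown in a later comment.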
Ah, thanks @lukaszo! I wasn't following the RPi-specific guide this time (which I used for k8s 1.5) and forgot that step. I would've discovered it when the daemon set errored, but as it turns out I didn't get that far. :}
I'm also seeing this problem when I follow the instructions as described here: https://blog.hypriot.com/post/setup-kubernetes-raspberry-pi-cluster/
Managed to get it working after installing the right flannel network pod. I think @jimmycuadra might get it working with @lukaszo's comment.
When the message [apiclient] First node has registered, but is not ready yet starts flooding, the Kubernetes API server is already running, so you can:
curl -sSL https://rawgit.com/coreos/flannel/master/Documentation/kube-flannel.yml | kubectl create -f -
For the raspberry pi install:
curl -sSL https://rawgit.com/coreos/flannel/master/Documentation/kube-flannel.yml | sed "s/amd64/arm/g" | kubectl create -f -
Then it will finish:
[apiclient] First node has registered, but is not ready yet
[apiclient] First node has registered, but is not ready yet
[apiclient] First node has registered, but is not ready yet
[apiclient] First node has registered, but is not ready yet
[apiclient] First node has registered, but is not ready yet
[apiclient] First node has registered, but is not ready yet
[apiclient] First node has registered, but is not ready yet
[apiclient] First node has registered, but is not ready yet
[apiclient] First node has registered, but is not ready yet
[apiclient] First node has registered, but is not ready yet
[apiclient] First node is ready after 245.050597 seconds
[apiclient] Test deployment succeeded
[token] Using token: 4dc99e............
[apiconfig] Created RBAC rules
[addons] Created essential addon: kube-proxy
[addons] Created essential addon: kube-dns
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run (as a regular user):
sudo cp /etc/kubernetes/admin.conf $HOME/
sudo chown $(id -u):$(id -g) $HOME/admin.conf
export KUBECONFIG=$HOME/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
http://kubernetes.io/docs/admin/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join --token 4dc99e........... 192.168.1.200:6443
I had the same issue and I fixed it this way (you should be root):
In kubeadm 1.6.0 you should remove the environment variable $KUBELET_NETWORK_ARGS from the systemd file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, then reload the daemons:
systemctl daemon-reload
kubeadm init
This takes a little while... After success, download the network add-on you want to use: http://kubernetes.io/docs/admin/addons/
Calico seems to be the best one, but I'm not sure; it's still under test for me.
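A sketch of that edit (it assumes $KUBELET_NETWORK_ARGS is referenced on the ExecStart line of the drop-in; inspect the file before changing anything):
sed -i 's/ \$KUBELET_NETWORK_ARGS//' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
systemctl daemon-reload
systemctl restart kubelet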
@thelastworm I just tried it, and it didn't work. Ubuntu 16.04.2 LTS, kubeadm 1.6.0. I did the following steps:
kubeadm reset (to clean up the previous attempt)
kubeadm init --token=<VALUE> --apiserver-advertise-address=<IP>
[EDITED] It worked after @srinat999 pointed out the need to run systemctl daemon-reload before kubeadm init.
@jcorral's solution worked for me with one change to the flannel deployment, since the insecure API port is no longer created by kubeadm:
curl -sSL https://rawgit.com/coreos/flannel/master/Documentation/kube-flannel.yml | \
kubectl --kubeconfig /etc/kubernetes/admin.conf create -f -
@MaximF You have to do systemctl daemon-reload
after changing the conf file. Worked for me.
@jcorral Your solution works for me. Thanks.
@MaximF I just added the daemon reload command line.
kubeadm init completes successfully, but when I check the version, I get the following error:
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T16:36:33Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"} The connection to the server localhost:8080 was refused - did you specify the right host or port?
@haribole You should set the KUBECONFIG env var
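For example, pointing it at the file kubeadm writes:
export KUBECONFIG=/etc/kubernetes/admin.conf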
Has anyone gotten Flannel to run after the workarounds related to CNI? I can get past the "not ready" issue, but when I run Flannel, I get an error that looks like this:
Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-g5cbj': the server does not allow access to the requested resource (get pods kube-flannel-ds-g5cbj)
Pods status shows "CrashLoopBackOff"
You need to add rbac roles to authorize flannel to read from the API.
You need to add rbac roles to authorize flannel to read from the API.
In case anyone else is wondering what this means, it looks like you need to create kube-flannel-rbac.yml
before you create flannel:
kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml
kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
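Once both are created, the flannel pods should leave CrashLoopBackOff; something like the following can be used to check (assuming kubectl is already pointed at the cluster):
kubectl get pods --namespace=kube-system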
I think that because the root issue is solved and the related ticket is closed, we should close this one as well :)
Just for information: It is working for me now with the updated packages under Ubuntu 16.04.
1.6.1 works for me! Thanks to everyone that helped get this fix out!
I successfully setup my Kubernetes cluster on centos-release-7-3.1611.el7.centos.x86_64 by taking the following steps (I assume Docker is already installed):
1) (from /etc/yum.repos.d/kubernetes.repo) baseurl=http://yum.kubernetes.io/repos/kubernetes-el7-x86_64-unstable
=> To use the unstable repository for the latest Kubernetes 1.6.1
2) yum install -y kubelet kubeadm kubectl kubernetes-cni
3) (/etc/systemd/system/kubelet.service.d/10-kubeadm.conf) add "--cgroup-driver=systemd" at the end of the last line (see the sketch after this list).
=> This is because Docker uses systemd as its cgroup driver while the kubelet defaults to cgroupfs.
4) systemctl enable kubelet && systemctl start kubelet
5) kubeadm init --pod-network-cidr 10.244.0.0/16
=> If you used to add --api-advertise-addresses, you need to use --apiserver-advertise-address instead.
6) cp /etc/kubernetes/admin.conf $HOME/
sudo chown $(id -u):$(id -g) $HOME/admin.conf
export KUBECONFIG=$HOME/admin.conf
=> Without this step, you might get an error with kubectl get
=> I didn't do it with 1.5.2
7) kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml
=> 1.6.0 introduces role-based access control, so you should add a ClusterRole and a ClusterRoleBinding before creating the Flannel DaemonSet
8) kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
=> Create a Flannel daemonset
9) (on every slave node) kubeadm join --token (your token) (ip):(port)
=> as shown in the result of kubeadm init
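Regarding step 3, a sketch of what the edited drop-in might look like (the exact set of $KUBELET_* variables varies by package version, so adapt it to the file you actually have):
# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (illustrative excerpt)
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_EXTRA_ARGS --cgroup-driver=systemd
Run systemctl daemon-reload (and restart the kubelet if it is already running) so the change takes effect before kubeadm init.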
All the above steps are a result of combining suggestions from various issues around Kubernetes-1.6.0, especially kubeadm.
Hope this will save your time.
@eastcirclek @Sliim You are great
@eastcirclek these were exactly the steps that I had just pieced together by querying several forums as well. A timezone difference, maybe? Thanks everyone, this topic was really helpful.
I have an Ubuntu 16.04 server on AWS and followed the steps, which apparently worked correctly, but then when I try to install Calico as the network plugin, I get the following error: The connection to the server localhost:8080 was refused - did you specify the right host or port?
Is the k8s team working on a patch?
Thanks
@overip I don't think any patch is required for that... You just need to specify the right kubeconfig file when using kubectl. kubeadm should have written it to /etc/kubernetes/admin.conf.
@jimmycuadra could you please explain the steps to do that?
@overip The output of kubeadm init has the instructions:
To start using your cluster, you need to run (as a regular user):
sudo cp /etc/kubernetes/admin.conf $HOME/
sudo chown $(id -u):$(id -g) $HOME/admin.conf
export KUBECONFIG=$HOME/admin.conf
Personally, I prefer to copy the file to $HOME/.kube/config, which is where kubectl will look for it by default. Then you don't need to set the KUBECONFIG environment variable.
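For example, a sketch of that variant:
mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config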
If you are planning to use kubectl from your local machine, you can use scp
(or even just copy paste the contents) to write it to ~/.kube/config
on your own computer.
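Something like this works from the local machine (replace the placeholder address; it assumes SSH access to the master as root):
scp root@<master-address>:/etc/kubernetes/admin.conf ~/.kube/config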
Search for "admin.conf" in this GitHub issue for more details. It's been mentioned a few times.
@eastcirclek - followed the steps, but for some reason the nodes are not able to install flannel properly. (Note: on master everything is smooth.)
Apr 13 22:31:11 node2 kubelet[22893]: I0413 22:31:11.666206 22893 kuberuntime_manager.go:458] Container {Name:install-cni Image:quay.io/coreos/flannel:v0.7.0-amd64 Command:[/bin/sh -c set -e -x; cp -f /etc/kube-flannel/cni-conf.json /etc/cni/net.d/10-flannel.conf; while true; do sleep 3600; done] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:cni ReadOnly:false MountPath:/etc/cni/net.d SubPath:} {Name:flannel-cfg ReadOnly:false MountPath:/etc/kube-flannel/ SubPath:} {Name:flannel-token-g65nf ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath:}] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Apr 13 22:31:11 node2 kubelet[22893]: I0413 22:31:11.666280 22893 kuberuntime_manager.go:742] checking backoff for container "install-cni" in pod "kube-flannel-ds-3smf7_kube-system(2e6ad0f9-207f-11e7-8f34-0050569120ff)"
Apr 13 22:31:12 node2 kubelet[22893]: I0413 22:31:12.846325 22893 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/configmap/2e6ad0f9-207f-11e7-8f34-0050569120ff-flannel-cfg" (spec.Name: "flannel-cfg") pod "2e6ad0f9-207f-11e7-8f34-0050569120ff" (UID: "2e6ad0f9-207f-11e7-8f34-0050569120ff").
Apr 13 22:31:12 node2 kubelet[22893]: I0413 22:31:12.846373 22893 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/2e6ad0f9-207f-11e7-8f34-0050569120ff-flannel-token-g65nf" (spec.Name: "flannel-token-g65nf") pod "2e6ad0f9-207f-11e7-8f34-0050569120ff" (UID: "2e6ad0f9-207f-11e7-8f34-0050569120ff").
Just sharing my workaround. First of all, $KUBELET_NETWORK_ARGS is required, otherwise CNI is not enabled/configured, so removing and then restoring $KUBELET_NETWORK_ARGS seems overly complicated. When kubeadm init shows "[apiclient] First node has registered, but is not ready yet", the k8s cluster is actually already able to serve requests. At that point, the user can simply move on to steps 3/4 of https://kubernetes.io/docs/getting-started-guides/kubeadm/ as follows.
To start using your cluster, you need to run (as a regular user):
sudo cp /etc/kubernetes/admin.conf $HOME/
sudo chown $(id -u):$(id -g) $HOME/admin.conf
export KUBECONFIG=$HOME/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at: http://kubernetes.io/docs/admin/addons/
When a user installs the pod network, please make sure the pod network's service account is granted enough permissions. Taking flannel as an example, I just bind the cluster-admin role to flannel's service account as follows. It may not be ideal, and you could define a specific role for the flannel service account. BTW, when a user deploys other add-on services such as the dashboard, the related service accounts also need to be granted enough permissions.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: flannel:daemonset
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
After the pod network is ready, kubeadm init will show that the node is ready, and the user can continue with the instructions.
Taking flannel as an example, I just bind the cluster-admin role to flannel's service account as follows. It may not be ideal, and you could define a specific role for the flannel service account.
There is https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml already
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): kubeadm
Is this a BUG REPORT or FEATURE REQUEST? (choose one): bug report
Kubernetes version (use kubectl version): 1.6.0
Environment:
Kernel (uname -a): 4.4.50-hypriotos-v7+
What happened:
Following the kubeadm getting started guide exactly:
That last message, "First node has registered, but is not ready yet" repeats infinitely, and kubeadm never finishes. I connected to the master server in another session to see if all the Docker containers were running as expected and they are:
I copied the admin kubeconfig to my local machine and used kubectl (1.6.0) to see what was going on with the node kubeadm was claiming was registered:
This uncovered the reason the kubelet was not ready:
"runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config"
In my experiments with kubeadm 1.5, CNI was not needed to bring up the master node, so this is surprising. Even the getting started guide suggests that kubeadm init should finish successfully before you move on to deploying a CNI plugin.
Anyway, I deployed flannel using kubectl from my local machine:
Where the contents of the file were:
But it never scheduled:
I tried to join one of the other servers anyway, just to see what would happen. I used kubeadm token create to manually create a token that I could use from another machine. On the other machine:
And the final message repeated forever.
What you expected to happen:
kubeadm init should complete and produce a bootstrap token.