Closed: sdake closed this issue 7 years ago.
No network driver (calico, flannel, weave tried) works.
i'll update now. @nadeemanwar this needs to be P1: jenkins should auto-update the submodule whenever a new commit lands on halcyon-kubernetes. i'd like to get this working ASAP, especially since kolla is releasing instructions on their repo soon (Monday, as @sdake mentioned in IRC).
@sdake Unfortunately I cannot reproduce :( could you paste the output from ansible, and run `(set -x; kubectl version; ansible --version; vagrant --version)` so I can try to get to the bottom of what's going on?
well, @intlabs ...it is true that the submodule needs updating. let me update it manually now.
but then again...this can always be updated manually :)
@intlabs will do - note that another person, bmace, has the same exact issue, so it isn't a one-time thing. I also have a tarball I can give you of an environment that works (that is a few weeks old, IIRC).
submodule updated, in case that may be the issue @sdake. try to clone or update the submodule again?
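(Editorial aside for anyone reading later: `git submodule update` checks out exactly the commit the superproject records, which is usually what you want when reproducing a bug, whereas `git submodule foreach git pull origin master` moves the submodule to the upstream tip, which may differ from what the repo was tested against. A runnable sketch of that difference using throwaway local repos as stand-ins, not the halcyon repos themselves:)

```shell
#!/bin/sh
# Demo (stand-in repos, not halcyon): "git submodule update" keeps the
# commit pinned by the superproject even after upstream moves on.
set -e
work=$(mktemp -d)

# Fake upstream playing the role of halcyon-kubernetes.
git init --quiet "$work/sub"
git -C "$work/sub" -c user.email=t@t -c user.name=t \
    commit --quiet --allow-empty -m "pinned commit"
pinned=$(git -C "$work/sub" rev-parse HEAD)

# Superproject (the halcyon-vagrant-kubernetes stand-in) records that commit.
git init --quiet "$work/super"
git -C "$work/super" -c protocol.file.allow=always \
    submodule add "$work/sub" sub >/dev/null 2>&1
git -C "$work/super" -c user.email=t@t -c user.name=t \
    commit --quiet -m "add submodule"

# Upstream advances; the superproject's pin does not move with it.
git -C "$work/sub" -c user.email=t@t -c user.name=t \
    commit --quiet --allow-empty -m "newer commit"

# "update" restores the recorded (pinned) commit in the working tree;
# "foreach git pull origin master" would instead chase the new tip.
git -C "$work/super" -c protocol.file.allow=always submodule update --init
echo "pinned=$pinned"
echo "actual=$(git -C "$work/super/sub" rev-parse HEAD)"
```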
@v1k0d3n Thanks for updating submodule. You will note in the original report that I had updated the submodule myself, but may have done it incorrectly. I will test this evening with the revised documentation (there was a documentation bug in kolla-kubernetes - a typo) as well as the new submodule. Note that bmace had not used the documentation to C&P so we don't think it was the typo.
ah, i see now...you pulled from master; cool. could you report what hypervisor you're using too? we're going to start using templates for issues in github soon, but until that point...we need to ask versions, hypervisor info, os, etc. getting there.
Sure happy to help.
Host OS: CentOS 7.2
Hypervisor: libvirt (latest yum update version; I can edit with the exact version when I have access to my lab)
Virtual machine OS: CentOS (not sure which version)
Memory: 32 GB
CPU: Intel Core i7 with hyperthreading
Note I may not actually get to fulfilling all of the requests until the 28th of January.
@sdake it would be awesome if you could run the following from the root of your halcyon-vagrant-kubernetes dir and paste the output:

```
(set -x; kubectl version; ansible --version; vagrant --version; git log -1 --format="%H"; cd halcyon-kubernetes; git log -1 --format="%H")
```

It would also be great if you could post the full config.rb you have; from the snippets I looked through on IRC, I'm wondering if somehow it's still using my old fork, as that would explain the controller-manager image version it seems to be trying to use.
@sdake what's the status of this issue? still outstanding?
@v1k0d3n I seem to have fallen off a cliff - in fact my wife is traveling, so I have no backup for my children. They have had a wacky (abnormal) schedule the last 7 days (she is back late tonight), and hopefully I will be able to test things.
I did hear from Borne Mace that he is suffering a different issue than the one I reported, also related to networking. I had hoped he would report it in the halcyon-vagrant-kubernetes issue tracker; I'll ping him in IRC and ask him to provide his feedback (which may be a different issue). His current reported state is that he was able to launch busybox, so this issue may be invalid and an artifact of the kolla development environment documentation being incorrect. Since I opened this issue, I want to validate that personally before closing it out.
@v1k0d3n This has been reported as resolved in conversations on IRC with @sdake. Unfortunately, we never quite got to the exact root cause, so I think this may have been a transient issue with a distro package? I've also got confirmation that networking is working for Borne Mace, so closing for now.
@sdake, all, I saw the exact same error (below) yesterday.
```
Warning FailedSync Error syncing pod, skipping: failed to "SetupNetwork" for "33400817-9413-4636-82f3-34c35d054a95_default" with SetupNetworkError: "Failed to setup network for pod "33400817-9413-4636-82f3-34c35d054a95_default(f5608278-e44b-11e6-9cf2-5254002f7f54)" using network plugins "cni": cni config unintialized; Skipping pod"
```
After waiting 30-40 minutes, I observed that k8s self-healed automatically, and I was able to run the busybox pod successfully without this error.
Here is my analysis: I think we need to give k8s some time to stabilize all of its system objects (pods, deployments, and replicasets).
I think k8s is stabilized when we see:

1) `Running` under the STATUS column for all the k8s pods. See below.

```
# kubectl --namespace=kube-system get pods
NAME                                    READY     STATUS    RESTARTS   AGE
dummy-2088944543-vjhmg                  1/1       Running   0          1d
etcd-172.16.35.11                       1/1       Running   0          1d
kube-apiserver-172.16.35.11             1/1       Running   4          1d
kube-controller-manager-172.16.35.11    1/1       Running   0          1d
kube-discovery-1769846148-vmg4d         1/1       Running   0          1d
kube-dns-2924299975-wpbgf               4/4       Running   0          1d
kube-proxy-7hn7j                        1/1       Running   1          1d
kube-proxy-gxf9w                        1/1       Running   0          1d
kube-proxy-vcckr                        1/1       Running   0          1d
kube-proxy-w6jhx                        1/1       Running   0          1d
kube-scheduler-172.16.35.11             1/1       Running   1          1d
kubernetes-dashboard-3203831700-1fg2q   1/1       Running   0          1d
tiller-deploy-2885612843-9jjhv          1/1       Running   0          1d
weave-net-4rxlv                         2/2       Running   0          1d
weave-net-fmwm4                         2/2       Running   1          1d
weave-net-hjw8t                         2/2       Running   1          1d
weave-net-p6fz6                         2/2       Running   125        1d
```

I saw the error when some of the pods above had `ContainerCreating` under the STATUS column. I had to wait a few minutes until all the pods showed `Running` under the STATUS column.
2) `1` under the AVAILABLE column for all the k8s deployments. See below.

```
# kubectl --namespace=kube-system get deployments
NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-discovery         1         1         1            1           1d
kube-dns               1         1         1            1           1d
kubernetes-dashboard   1         1         1            1           1d
tiller-deploy          1         1         1            1           1d
```
3) `1` under the READY column for all the replicasets. See below.

```
# kubectl --namespace=kube-system get replicasets
NAME                              DESIRED   CURRENT   READY   AGE
dummy-2088944543                  1         1         1       1d
kube-discovery-1769846148         1         1         1       1d
kube-dns-2924299975               1         1         1       1d
kubernetes-dashboard-3203831700   1         1         1       1d
tiller-deploy-2885612843          1         1         1       1d
```
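(Editorial aside: the first of the three checks above is easy to script. A minimal sketch, with a helper name made up for illustration, that counts pods whose STATUS column is not `Running`; you could poll it before launching test pods:)

```shell
# count_not_running: reads "kubectl ... get pods --no-headers"-style output
# on stdin and prints how many pods are NOT in the Running state.
# (Helper name is hypothetical; field 3 is STATUS in the default
# "kubectl get pods" output format shown above.)
count_not_running() {
  awk '$3 != "Running" { n++ } END { print n + 0 }'
}

# Example poll loop (assumes kubectl is available on the host):
#   while [ "$(kubectl --namespace=kube-system get pods --no-headers \
#              | count_not_running)" -gt 0 ]; do sleep 10; done
```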
Cloned the repo at commit d6a500265e5d43e34f40e14c210584052610afca.
The submodule was not up to date, so I updated it to obtain the latest version of halcyon-kubernetes by running: `git submodule foreach git pull origin master`
After following the instructions here: http://docs.openstack.org/developer/kolla-kubernetes/development-environment.html
I ran into the following:
```
[sdake@minime-03 halcyon-vagrant-kubernetes]$ kubectl describe pods
Name:           33400817-9413-4636-82f3-34c35d054a95
Namespace:      default
Node:           172.16.35.14/172.16.35.14
Start Time:     Thu, 26 Jan 2017 23:49:17 -0500
Labels:
Status:         Pending
IP:
Controllers:
Containers:
  33400817-9413-4636-82f3-34c35d054a95:
    Container ID:
    Image:              busybox
    Image ID:
    Port:
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Volume Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-26m8j (ro)
    Environment Variables:
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  default-token-26m8j:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-26m8j
QoS Class:      BestEffort
Tolerations:
Events:
  FirstSeen  LastSeen  Count  From                    SubObjectPath  Type     Reason      Message
  11m        11m       1      {default-scheduler }                   Normal   Scheduled   Successfully assigned 33400817-9413-4636-82f3-34c35d054a95 to 172.16.35.14
  10m        0s        636    {kubelet 172.16.35.14}                 Warning  FailedSync  Error syncing pod, skipping: failed to "SetupNetwork" for "33400817-9413-4636-82f3-34c35d054a95_default" with SetupNetworkError: "Failed to setup network for pod \"33400817-9413-4636-82f3-34c35d054a95_default(f5608278-e44b-11e6-9cf2-5254002f7f54)\" using network plugins \"cni\": cni config unintialized; Skipping pod"
```