kubernetes / kubeadm

Aggregator for issues filed against kubeadm
Apache License 2.0

Kubeadm on arm... #65

Closed mikedanese closed 8 years ago

mikedanese commented 8 years ago

From @derailed on October 26, 2016 17:23

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.):

No

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.):

kubeadm arm

Is this a BUG REPORT or FEATURE REQUEST? (choose one):

Bug

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.3", 
GitCommit:"4957b090e9a4f6a68b4a40375408fdc74a212260", GitTreeState:"clean", 
BuildDate:"2016-10-16T06:36:33Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/arm"}

Environment:

Raspberry Pi 3

HypriotOS v 1.0

Linux m10 4.4.15-hypriotos-v7+ #1 SMP PREEMPT Mon Jul 25 08:46:52 UTC 2016 armv7l GNU/Linux

What happened:

Following the kubeadm installation docs, I installed the prerequisites and proceeded with the init as follows:

kubeadm init --use-kubernetes-version v1.4.1 --pod-network-cidr=10.244.0.0/16

The command is stuck on:

<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready

Looking at docker on this node, I expected to see images being pulled for etcd, the API server, etc., but `docker images` reports nothing. Is kubeadm somehow unable to connect to the local docker daemon, or is the default Hypriot docker configuration not playing well with kubeadm?
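
Generic checks like these (a hedged sketch, not from the original report; assumes systemd and docker as on HypriotOS) can narrow down whether the kubelet ever asked docker to start the control-plane containers:

```shell
# Are the daemons up at all?
systemctl is-active kubelet docker

# Were any control-plane images pulled? If not, grep matches nothing
# and we fall through to the message:
docker images | grep -E 'etcd|apiserver|controller|scheduler' \
  || echo "no control-plane images yet"

# The kubelet log usually names the actual failure:
journalctl -u kubelet --no-pager -n 50
```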

What you expected to happen:

kubeadm init to complete successfully

How to reproduce it (as minimally and precisely as possible):

o Download the Hypriot v1.0 image and flash it to an SD card
o Boot the Raspberry Pi from the SD card
o curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
o cat <<EOF > /etc/apt/sources.list.d/kubernetes.list
  deb http://apt.kubernetes.io/ kubernetes-xenial main
  EOF
o apt-get update
o apt-get install -y kubelet kubeadm kubectl kubernetes-cni
o kubeadm init --use-kubernetes-version v1.4.1 --pod-network-cidr=10.244.0.0/16

Anything else do we need to know:

Looking at /etc/kubernetes/manifests shows:

etcd.json kube-apiserver.json kube-controller-manager.json kube-scheduler.json

Inspecting journalctl on kubelet shows:

Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.467776    2787 kubelet_node_status.go:203] Setting node annotation to enable volume controller attach/detach
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.469033    2787 interface.go:248] Default route transits interface "eth0"
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.470261    2787 interface.go:93] Interface eth0 is up
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.471170    2787 interface.go:138] Interface "eth0" has 2 addresses :[192.168.0.92/24 fe80::ba27:ebff:fe24:986f/64].
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.471484    2787 interface.go:105] Checking addr  192.168.0.92/24.
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.471664    2787 interface.go:114] IP found 192.168.0.92
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.471831    2787 interface.go:144] valid IPv4 address for interface "eth0" found as 192.168.0.92.
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.471960    2787 interface.go:254] Choosing IP 192.168.0.92
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.497415    2787 kubelet_node_status.go:354] Recording NodeHasSufficientDisk event message for node m10
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.497616    2787 kubelet_node_status.go:354] Recording NodeHasSufficientMemory event message for node m10
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.497905    2787 kubelet_node_status.go:354] Recording NodeHasNoDiskPressure event message for node m10
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.498077    2787 kubelet_node_status.go:73] Attempting to register node m10
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.498243    2787 server.go:608] Event(api.ObjectReference{Kind:"Node", Namespace:"", Name:"m10", UID:"m10", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'NodeHasSufficientDisk' Node m10 status is now: NodeHasSufficientDisk
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.498554    2787 server.go:608] Event(api.ObjectReference{Kind:"Node", Namespace:"", Name:"m10", UID:"m10", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'NodeHasSufficientMemory' Node m10 status is now: NodeHasSufficientMemory
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.498823    2787 server.go:608] Event(api.ObjectReference{Kind:"Node", Namespace:"", Name:"m10", UID:"m10", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'NodeHasNoDiskPressure' Node m10 status is now: NodeHasNoDiskPressure
Oct 26 17:19:31 m10 kubelet[2787]: E1026 17:19:31.501189    2787 kubelet_node_status.go:97] Unable to register node "m10" with API server: Post https://192.168.0.92:443/api/v1/nodes: dial tcp 192.168.0.92:443: getsockopt: connection refused
Oct 26 17:19:31 m10 kubelet[2787]: I1026 17:19:31.527656    2787 reflector.go:249] Listing and watching *api.Pod from pkg/kubelet/config/apiserver.go:43
Oct 26 17:19:31 m10 kubelet[2787]: E1026 17:19:31.531088    2787 reflector.go:203] pkg/kubelet/config/apiserver.go:43: Failed to list *api.Pod: Get https://192.168.0.92:443/api/v1/pods?fieldSelector=spec.nodeName%3Dm10&resourceVersion=0: dial tcp 192.168.0.92:443: getsockopt: connection refused
Oct 26 17:19:32 m10 kubelet[2787]: I1026 17:19:32.416745    2787 reflector.go:249] Listing and watching *api.Service from pkg/kubelet/kubelet.go:384
Oct 26 17:19:32 m10 kubelet[2787]: E1026 17:19:32.418995    2787 reflector.go:203] pkg/kubelet/kubelet.go:384: Failed to list *api.Service: Get https://192.168.0.92:443/api/v1/services?resourceVersion=0: dial tcp 192.168.0.92:443: getsockopt: connection refused
Oct 26 17:19:32 m10 kubelet[2787]: I1026 17:19:32.455119    2787 reflector.go:249] Listing and watching *api.Node from pkg/kubelet/kubelet.go:403
Oct 26 17:19:32 m10 kubelet[2787]: E1026 17:19:32.457560    2787 reflector.go:203] pkg/kubelet/kubelet.go:403: Failed to list *api.Node: Get https://192.168.0.92:443/api/v1/nodes?fieldSelector=metadata.name%3Dm10&resourceVersion=0: dial tcp 192.168.0.92:443: getsockopt: connection refused
Oct 26 17:19:32 m10 kubelet[2787]: I1026 17:19:32.531641    2787 reflector.go:249] Listing and watching *api.Pod from pkg/kubelet/config/apiserver.go:43
Oct 26 17:19:32 m10 kubelet[2787]: E1026 17:19:32.534040    2787 reflector.go:203] pkg/kubelet/config/apiserver.go:43: Failed to list *api.Pod: Get https://192.168.0.92:443/api/v1/pods?fieldSelector=spec.nodeName%3Dm10&resourceVersion=0: dial tcp 192.168.0.92:443: getsockopt: connection refused
Oct 26 17:19:33 m10 kubelet[2787]: I1026 17:19:33.419424    2787 reflector.go:249] Listing and watching *api.Service from pkg/kubelet/kubelet.go:384
Oct 26 17:19:33 m10 kubelet[2787]: E1026 17:19:33.421922    2787 reflector.go:203] pkg/kubelet/kubelet.go:384: Failed to list *api.Service: Get https://192.168.0.92:443/api/v1/services?resourceVersion=0: dial tcp 192.168.0.92:443: getsockopt: connection refused

Copied from original issue: kubernetes/kubernetes#35643

mikedanese commented 8 years ago

From @errordeveloper on October 26, 2016 17:28

cc @luxas

mikedanese commented 8 years ago

From @luxas on October 26, 2016 17:31

Did you use Hypriot v1.0.1 as stated in docs?

mikedanese commented 8 years ago

From @nlamirault on October 26, 2016 18:39

@luxas I've got the same issue with Hypriot 1.0.0. Where can I download the 1.0.1 image? I tried to download https://downloads.hypriot.com/hypriotos-rpi-v1.0.1.img.zip, but I got an error. Could you send the link to the documentation explaining how to use kubeadm on Hypriot? I could try a new installation.

mikedanese commented 8 years ago

From @derailed on October 26, 2016 18:49

Nicolas - try https://github.com/hypriot/image-builder-rpi/releases

mikedanese commented 8 years ago

From @derailed on October 26, 2016 19:25

Thanks for pointing this out @luxas! This was indeed pilot error, and I did get further, so we can close this issue. Though the master is now coming up, setting up the pod network using flannel per the docs results in an error.

ARCH=arm curl -sSL https://raw.githubusercontent.com/luxas/flannel/update-daemonset/Documentation/kube-flannel.yml | sed "s/amd64/${ARCH}/g" | kubectl create -f -

Yields

Error from server: error when creating "flannel.yml": DaemonSet in version "v1beta1" cannot be handled as a DaemonSet: [pos 1115]: json: expect char '"' but got char 'n'

I think the nodeSelector in the rendered flannel config is incorrect:

nodeSelector:
  beta.kubernetes.io/arch:

mikedanese commented 8 years ago

From @derailed on October 26, 2016 19:31

Probably a bug in the template; I think the nodeSelector should be:

nodeSelector:
  beta.kubernetes.io/arch: arm
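
The empty selector value above is consistent with the sed substitution receiving an empty replacement. As a local sanity check (a sketch with an inline stand-in for the real manifest), the substitution itself behaves when given a literal arch:

```shell
# Stand-in for the relevant lines of kube-flannel.yml (assumption: the
# real manifest pins the selector to amd64 before substitution).
manifest='nodeSelector:
  beta.kubernetes.io/arch: amd64'

# The substitution the docs perform, with a literal value:
echo "$manifest" | sed "s/amd64/arm/g"
# prints:
# nodeSelector:
#   beta.kubernetes.io/arch: arm
```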

mikedanese commented 8 years ago

From @luxas on October 26, 2016 19:34

Ok, seems like the ARCH=arm thing is working poorly then.

Try just running

curl -sSL https://raw.githubusercontent.com/luxas/flannel/update-daemonset/Documentation/kube-flannel.yml | sed "s/amd64/arm/g" | kubectl create -f -
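
The plain command works where the original one-liner did not because `ARCH=arm curl ...` sets ARCH only in curl's environment; the shell expands `${ARCH}` in the sed argument before that, so the replacement is empty. A minimal demonstration:

```shell
# `VAR=x cmd` scopes VAR to cmd alone; the current shell still sees it
# unset, so ${ARCH} in the sed pattern expands to the empty string.
unset ARCH
ARCH=arm true                   # ARCH visible only inside `true`
echo "s/amd64/${ARCH}/g"        # prints: s/amd64//g

# Assigning in the shell itself (or `export ARCH=arm`) fixes it:
ARCH=arm
echo "s/amd64/${ARCH}/g"        # prints: s/amd64/arm/g
```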

mikedanese commented 8 years ago

From @derailed on October 26, 2016 21:26

Thank you all for the prompt support! Finally got this thing up and running. Watching this cluster live on rpi is a beautiful thing. Totally impressed by your work. kubeadm rocks. Tx!!

mikedanese commented 8 years ago

From @nlamirault on October 27, 2016 11:52

Using v1.4.4, the process runs to the end:

$ sudo kubeadm init --use-kubernetes-version v1.4.4 --api-advertise-addresses=192.168.1.23 --pod-network-cidr=10.244.0.0/16
<master/tokens> generated token: "482d60.042c2504ce81cd32"
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready
<master/apiclient> all control plane components are healthy after 38.573899 seconds
<master/apiclient> waiting for at least one node to register and become ready
<master/apiclient> first node is ready after 1.535058 seconds
<master/discovery> created essential addon: kube-discovery, waiting for it to become ready
<master/discovery> kube-discovery is ready after 814.040634 seconds
<master/addons> created essential addon: kube-proxy
<master/addons> created essential addon: kube-dns

Kubernetes master initialised successfully!

You can now join any number of machines by running the following on each node:

kubeadm join --token 482d60.042c2504ce81cd32 192.168.1.23

But kubectl fails:

$ kubectl get pods --namespace=kube-system
client: etcd cluster is unavailable or misconfigured

The etcd container logs: https://gist.github.com/nlamirault/d84e77e02276d158493f15f249324ba5

mikedanese commented 8 years ago

From @nlamirault on October 27, 2016 12:14

I tried the unstable version of kubeadm with the reset command, then tried another init. The etcd container is up, but I've got these logs:

2016-10-27 12:14:14.104739 E | etcdhttp: got unexpected response error (etcdserver: request timed out)
2016-10-27 12:14:23.319372 E | etcdhttp: got unexpected response error (etcdserver: request timed out)
2016-10-27 12:14:34.064923 E | etcdhttp: got unexpected response error (etcdserver: request timed out)

mikedanese commented 8 years ago

From @derailed on October 27, 2016 15:50

Thanks Lucas for pointing this out. I totally missed it, my bad...

We can close this issue, as this was clearly pilot error. However, I am running into another problem setting up the pod network.

The flannel DaemonSet config does not seem to be valid. I'm guessing it's missing spec.selector, but I'm not sure, since the error message is not very useful.

So

ARCH=arm curl -sSL https://raw.githubusercontent.com/luxas/flannel/update-daemonset/Documentation/kube-flannel.yml | sed "s/amd64/${ARCH}/g" | kubectl create -f -

Yields

Error from server: error when creating "flannel.yml": DaemonSet in version "v1beta1" cannot be handled as a DaemonSet: [pos 1115]: json: expect char '"' but got char 'n'

mikedanese commented 8 years ago

From @nlamirault on October 31, 2016 17:03

OK. I recreated the cluster using a fresh installation of HypriotOS 1.1.0 and the unstable version of kubeadm.

$ sudo kubeadm init --use-kubernetes-version v1.4.4 --api-advertise-addresses=192.168.1.23 --pod-network-cidr=10.244.0.0/16
$ kubectl create -f https://raw.githubusercontent.com/kodbasen/weave-kube-arm/master/weave-daemonset.yaml

I've got errors like this:

etcd cluster is unavailable or misconfigured

Some information:

$ kubectl get deployments --all-namespaces
NAMESPACE     NAME             DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-system   kube-discovery   1         1         1            1           1h
kube-system   kube-dns         1         1         1            0           1h
HypriotOS/armv7: root@jarvis in ~
$ kubectl get services --all-namespaces
NAMESPACE     NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
default       kubernetes   10.96.0.1    <none>        443/TCP         1h
kube-system   kube-dns     10.96.0.10   <none>        53/UDP,53/TCP   1h
HypriotOS/armv7: root@jarvis in ~
$ kubectl get pods --all-namespaces
NAMESPACE     NAME                              READY     STATUS      RESTARTS   AGE
kube-system   dummy-2501624643-psrv9            1/1       Running     2          1h
kube-system   etcd-jarvis                       1/1       Running     273        1h
kube-system   kube-apiserver-jarvis             1/1       Running     1          1h
kube-system   kube-controller-manager-jarvis    1/1       Running     217        1h
kube-system   kube-discovery-2202902116-zkjmy   1/1       Running     1          1h
kube-system   kube-dns-2334855451-zlgnh         0/3       Completed   25         1h
kube-system   kube-proxy-c8rax                  1/1       Running     1          1h
kube-system   kube-scheduler-jarvis             0/1       Error       225        1h
kube-system   weave-net-60pr6                   2/2       Running     9          1h
$ kubectl describe pod etcd-jarvis --namespace=kube-system
Error from server: client: etcd cluster is unavailable or misconfigured
$ kubectl describe pod etcd-jarvis --namespace=kube-system
Name:           etcd-jarvis
Namespace:      kube-system
Node:           jarvis/192.168.1.23
Start Time:     Mon, 31 Oct 2016 16:49:31 +0000
Labels:         component=etcd
                tier=control-plane
Status:         Running
IP:             192.168.1.23
Controllers:    <none>
Containers:
  etcd:
    Container ID:       docker://c31f82fa8da305260c912fd38028eb7699e5fce6ac10137bbcf1fe14a4ae9a40
    Image:              gcr.io/google_containers/etcd-arm:2.2.5
    Image ID:           docker://sha256:23e15ba74b830d4e9c1f09ce899864a5dde6636df6058f72b31e9694b8c511a3
    Port:
    Command:
      etcd
      --listen-client-urls=http://127.0.0.1:2379
      --advertise-client-urls=http://127.0.0.1:2379
      --data-dir=/var/etcd/data
    Requests:
      cpu:              200m
    State:              Running
      Started:          Mon, 31 Oct 2016 17:05:15 +0000
    Last State:         Terminated
      Reason:           Completed
      Exit Code:        0
      Started:          Mon, 31 Oct 2016 17:03:28 +0000
      Finished:         Mon, 31 Oct 2016 17:04:30 +0000
    Ready:              True
    Restart Count:      273
    Liveness:           http-get http://127.0.0.1:2379/health delay=15s timeout=15s period=10s #success=1 #failure=8
    Volume Mounts:
      /etc/kubernetes/ from pki (ro)
      /etc/ssl/certs from certs (rw)
      /var/etcd from etcd (rw)
    Environment Variables:      <none>
Conditions:
  Type          Status
  Initialized   True 
  Ready         True 
  PodScheduled  True 
Volumes:
  certs:
    Type:       HostPath (bare host directory volume)
    Path:       /etc/ssl/certs
  etcd:
    Type:       HostPath (bare host directory volume)
    Path:       /var/lib/etcd
  pki:
    Type:       HostPath (bare host directory volume)
    Path:       /etc/kubernetes
QoS Class:      Burstable
Tolerations:    <none>
No events.

mikedanese commented 8 years ago

From @viroos on November 2, 2016 1:39

I have a similar issue.

After adding a new node (kubeadm join --token ...), etcd is killed: etcd-black-pearl 0/1 Terminating 0 3s

Then it's restarted, but everything goes crazy (all kube-system pods are killed and recreated).

kubectl get nodes returns only the master.

I tried both weave and flannel with the same issue. I use Hypriot 1.1.0 (but I also tried 1.0.1 and had the same or a similar issue). The last try was on Kubernetes 1.4.5, but on 1.4.4 and 1.4.3 I had the same problem.

In my case kubectl works (although I had to wait a little after configuring the CNI network, since for a minute or two I also had the 'etcd cluster is unavailable or misconfigured' error).

This is 100% reproducible (either with a fresh install or using the tear-down procedure described at http://kubernetes.io/docs/getting-started-guides/kubeadm/).

mikedanese commented 8 years ago

From @viroos on November 2, 2016 22:59

Additional info. In /var/log/syslog:

Nov 2 22:56:29 black-pearl kubelet[1668]: W1102 22:56:29.327317 1668 status_manager.go:450] Failed to update status for pod "_()": Operation cannot be fulfilled on pods "etcd-black-pearl": StorageError: invalid object, Code: 4, Key: /registry/pods/kube-system/etcd-black-pearl, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 0x15c6f130, UID in object meta:

mikedanese commented 8 years ago

From @nlamirault on November 3, 2016 7:52

I've got the same error, I think. etcd, the scheduler, and the controller-manager are killed and restarted:

$ sudo kubectl get pods --all-namespaces
NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE
kube-system   dummy-2501624643-psrv9                  1/1       Running   2          2d
kube-system   etcd-jarvis                             1/1       Running   298        2d
kube-system   kube-apiserver-jarvis                   1/1       Running   1          2d
kube-system   kube-controller-manager-jarvis          1/1       Running   390        2d
kube-system   kube-discovery-2202902116-zkjmy         1/1       Running   1          2d
kube-system   kube-dns-2334855451-zlgnh               3/3       Running   85         2d
kube-system   kube-proxy-c8rax                        1/1       Running   1          2d
kube-system   kube-scheduler-jarvis                   1/1       Running   409        2d
kube-system   kubernetes-dashboard-3628165297-692al   1/1       Running   0          2d
kube-system   weave-net-60pr6                         2/2       Running   16         2d

The etcd container logs:

$ docker logs -f 0bc2e8f42947
2016-11-02 18:16:41.581493 I | etcdmain: etcd Version: 2.2.5
2016-11-02 18:16:41.581813 I | etcdmain: Git SHA: bc9ddf2
2016-11-02 18:16:41.581884 I | etcdmain: Go Version: go1.6
2016-11-02 18:16:41.583009 I | etcdmain: Go OS/Arch: linux/arm
2016-11-02 18:16:41.583133 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2016-11-02 18:16:41.583598 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2016-11-02 18:16:41.593874 I | etcdmain: listening for peers on http://localhost:2380
2016-11-02 18:16:41.595057 I | etcdmain: listening for peers on http://localhost:7001
2016-11-02 18:16:41.596288 I | etcdmain: listening for client requests on http://127.0.0.1:2379
2016-11-02 18:16:42.578106 I | etcdserver: recovered store from snapshot at index 380048
2016-11-02 18:16:42.578246 I | etcdserver: name = default
2016-11-02 18:16:42.578316 I | etcdserver: data dir = /var/etcd/data
2016-11-02 18:16:42.579009 I | etcdserver: member dir = /var/etcd/data/member
2016-11-02 18:16:42.579141 I | etcdserver: heartbeat = 100ms
2016-11-02 18:16:42.579230 I | etcdserver: election = 1000ms
2016-11-02 18:16:42.579419 I | etcdserver: snapshot count = 10000
2016-11-02 18:16:42.579862 I | etcdserver: advertise client URLs = http://127.0.0.1:2379
2016-11-02 18:16:42.580132 I | etcdserver: loaded cluster information from store: <nil>
2016-11-02 18:16:46.351154 I | etcdserver: restarting member ce2a822cea30bfca in cluster 7e27652122e8b2ae at commit index 389115
2016-11-02 18:16:46.354603 I | raft: ce2a822cea30bfca became follower at term 32
2016-11-02 18:16:46.354883 I | raft: newRaft ce2a822cea30bfca [peers: [ce2a822cea30bfca], term: 32, commit: 389115, applied: 380048, lastindex: 389115, lastterm: 32]
2016-11-02 18:16:46.411499 I | etcdserver: starting server... [version: 2.2.5, cluster version: 2.2]
2016-11-02 18:16:48.315589 I | raft: ce2a822cea30bfca is starting a new election at term 32
2016-11-02 18:16:48.316314 I | raft: ce2a822cea30bfca became candidate at term 33
2016-11-02 18:16:48.316721 I | raft: ce2a822cea30bfca received vote from ce2a822cea30bfca at term 33
2016-11-02 18:16:48.317363 I | raft: ce2a822cea30bfca became leader at term 33
2016-11-02 18:16:48.317726 I | raft: raft.node: ce2a822cea30bfca elected leader ce2a822cea30bfca at term 33
2016-11-02 18:16:48.325990 I | etcdserver: published {Name:default ClientURLs:[http://127.0.0.1:2379]} to cluster 7e27652122e8b2ae
2016-11-02 18:17:01.082697 N | osutil: received terminated signal, shutting down...
2016-11-02 18:17:07.634768 E | etcdhttp: got unexpected response error (etcdserver: request timed out)
2016-11-02 18:17:07.643182 E | etcdhttp: got unexpected response error (etcdserver: request timed out)
2016-11-02 18:17:07.783677 E | etcdhttp: got unexpected response error (etcdserver: request timed out)
2016-11-02 18:17:09.218169 E | etcdhttp: got unexpected response error (etcdserver: request timed out)
2016-11-02 18:17:09.824865 E | etcdhttp: got unexpected response error (etcdserver: server stopped)
2016-11-02 18:17:09.825649 E | etcdhttp: got unexpected response error (etcdserver: server stopped)

mikedanese commented 8 years ago

From @viroos on November 3, 2016 22:41

I managed to make it work with Hypriot 1.1.0, Kubernetes 1.4.3, and weave. In my case the issue was that the master and the node had the same hostname.

http://larmog.github.io/2016/10/28/installing-kubernetes-on-arm-with-kubeadm/ - this works.
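
The hostname collision described above can be checked before joining. A sketch (the names below are stand-ins, not from this thread):

```shell
# The kubelet registers each machine as a Node named after its hostname,
# so a master and a worker sharing a name fight over one Node object.
master_name="master-pi"      # hypothetical: hostname on the master
node_name="$(hostname)"      # hostname of the machine about to join

if [ "$node_name" = "$master_name" ]; then
  echo "hostname collision: rename this machine before 'kubeadm join'"
else
  echo "hostname '$node_name' is distinct from the master"
fi
```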

mikedanese commented 8 years ago

From @nlamirault on November 4, 2016 7:53

I've also got these logs in /var/log/syslog:

Nov 04 07:43:17 jarvis kubelet[395]: E1104 07:43:17.873889     395 event.go:199] Server rejected event '&api.Event{TypeMeta:unversioned.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:api.ObjectMeta{Name:"etcd-jarvis.1482abde90f339e3", GenerateName:"", Namespace:"kube-system", SelfLink:"", UID:"", ResourceVersion:"270343", Generation:0, CreationTimestamp:unversioned.Time{Time:time.Time{sec:0, nsec:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*unversioned.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]api.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:""}, InvolvedObject:api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"etcd-jarvis", UID:"a19bb61234e539b2b8370ac597b940f0", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{etcd}"}, Reason:"Unhealthy", Message:"Liveness probe failed: HTTP probe failed with statuscode: 503", Source:api.EventSource{Component:"kubelet", Host:"jarvis"}, FirstTimestamp:unversioned.Time{Time:time.Time{sec:63613529400, nsec:0, loc:(*time.Location)(0x3306808)}}, LastTimestamp:unversioned.Time{Time:time.Time{sec:63613842190, nsec:749080856, loc:(*time.Location)(0x3306808)}}, Count:10093, Type:"Warning"}': 'client: etcd cluster is unavailable or misconfigured' (will not retry!)
Nov 04 07:43:17 jarvis kubelet[395]: E1104 07:43:17.934346     395 kubelet_node_status.go:301] Error updating node status, will retry: client: etcd cluster is unavailable or misconfigured
Nov 04 07:53:05 jarvis kubelet[395]: E1104 07:53:05.507143     395 kubelet_node_status.go:301] Error updating node status, will retry: client: etcd cluster is unavailable or misconfigured
Nov 04 07:53:07 jarvis kubelet[395]: E1104 07:53:07.556922     395 kubelet_node_status.go:301] Error updating node status, will retry: Operation cannot be fulfilled on nodes "jarvis": the object has been modified; please apply your changes to the latest version and try again

I'm running k8s with only the master; there are no worker nodes.

mikedanese commented 8 years ago

From @nlamirault on November 4, 2016 16:16

The controller manager has errors like this:

I1104 15:31:25.389881       1 reflector.go:284] pkg/controller/volume/persistentvolume/controller_base.go:448: forcing resync
E1104 15:31:30.994090       1 leaderelection.go:317] err: client: etcd cluster is unavailable or misconfigured
E1104 15:31:31.086650       1 leaderelection.go:317] err: Operation cannot be fulfilled on endpoints "kube-controller-manager": the object has been modified; please apply your changes to the latest version and try again
I1104 15:31:31.090832       1 attach_detach_controller.go:520] processVolumesInUse for node "jarvis"
E1104 15:31:31.087267       1 event.go:258] Could not construct reference to: '&api.Endpoints{TypeMeta:unversioned.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:api.ObjectMeta{Name:"kube-controller-manager", GenerateName:"", Namespace:"kube-system", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:unversioned.Time{Time:time.Time{sec:0, nsec:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*unversioned.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]api.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:""}, Subsets:[]api.EndpointSubset(nil)}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' '%v stopped leading' 'jarvis'
I1104 15:31:32.900385       1 leaderelection.go:232] failed to renew lease kube-system/kube-controller-manager
F1104 15:31:32.913336       1 controllermanager.go:195] leaderelection lost

mikedanese commented 8 years ago

From @brendandburns on November 14, 2016 6:29

I got these errors when I had a lousy SD card. Switching to a higher-performance card fixed things...
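
A slow card starves etcd's fsync-heavy writes until its liveness probe fails. A rough write-speed check (a sketch, assuming GNU dd; sustained writes of ~10 MB/s or less are suspect on these boards):

```shell
# Write 64 MB with an fsync before dd exits, so the card rather than the
# page cache is measured (conv=fsync is a GNU dd extension); dd prints
# the throughput to stderr when it finishes.
dd if=/dev/zero of=/tmp/sdtest bs=1M count=64 conv=fsync
rm -f /tmp/sdtest
```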

mikedanese commented 8 years ago

From @nlamirault on November 16, 2016 12:40

@brendandburns i will try that.

luxas commented 8 years ago

I'm closing this, as everything we could do to provide kubeadm on ARM was done from the beginning (the initial v1.4 release), and it was enhanced a lot in the second revision, so it should be really smooth now.