luxas / kubernetes-on-arm

Kubernetes ported to ARM boards like Raspberry Pi.
MIT License

Working with Hypriot 1.0 and kubernetes-on-arm 0.7.x #118

Open saturnism opened 7 years ago

saturnism commented 7 years ago

A few things I found that needed to change when working w/ Hypriot 1.0:

  1. Host name needs to be updated in /boot/device-init.yaml
  2. docker-flannel overlay was not installed by default
  3. flannel.service needs After=system-docker.service to survive restarts
  4. I still need to run kubelet with --containerized in order to use NFS PVs, together with the docker volume mount -v /:/rootfs:ro
  5. Need to append /boot/cmdline.txt with cgroup_enable=cpuset
  6. Need to enable mtu probing to avoid docker pull problems: append net.ipv4.tcp_mtu_probing=1 to /etc/sysctl.conf (optionally swappiness? see http://a.frtzlr.com/kubernetes-on-raspberry-pi-3-the-missing-troubleshooting-guide/)
  7. PV recycling doesn't work, since by default it uses a non-ARM busybox image to recycle a volume. Still figuring out how to replace the image.
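
Items 5 and 6 are one-line file edits, so they can be made safe to run more than once. A minimal sketch, assuming the stock paths from the list above; the function names `ensure_cmdline_flag` and `ensure_line` are made up for illustration, and both take the file path as an argument so they can be tried on a copy first:

```shell
#!/bin/sh
# Hypothetical helpers, not part of kubernetes-on-arm or HypriotOS.

# Append a kernel parameter to the single line of a cmdline.txt-style file,
# unless it is already present as a whole word. Uses GNU sed's -i option.
ensure_cmdline_flag() {
    file=$1 flag=$2
    if ! tr ' ' '\n' < "$file" | grep -qxF -- "$flag"; then
        # cmdline.txt must stay on one line, so extend line 1 in place
        sed -i "1s|\$| $flag|" "$file"
    fi
}

# Append a line to a file unless an identical line already exists.
ensure_line() {
    file=$1 line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage on the real files (needs root; cmdline.txt changes need a reboot):
#   ensure_cmdline_flag /boot/cmdline.txt cgroup_enable=cpuset
#   ensure_line /etc/sysctl.conf net.ipv4.tcp_mtu_probing=1
```

Running either helper twice leaves the file unchanged the second time, which matters for cmdline.txt because a duplicated or line-broken entry is easy to miss.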
ghost commented 7 years ago

Thanks for this. I'm having issues running a mixed Pi 2/3 cluster: the OS is OK, but hyperkube is the only container running, and there is an issue connecting to the Apimanager.

Can you please explain how to implement item 2?

luxas commented 7 years ago

skipping pod synchronization - [Failed to start ContainerManager system validation failed - Following Cgroup subsystem not mounted: [cpuset] container runtime is down]

@pakeha-kiwi You have to set cgroup_enable=cpuset in /boot/cmdline.txt
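
To confirm the flag took effect after a reboot, the kernel's controller table can be checked. A small sketch; `cpuset_enabled` is a hypothetical helper name, and it assumes the usual four-column layout of /proc/cgroups (subsys_name, hierarchy, num_cgroups, enabled). It only checks the "enabled" column, not whether the controller is mounted:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the cpuset controller is listed as enabled.
# Takes the file to read so it can be pointed at a saved copy;
# defaults to /proc/cgroups when no argument is given.
cpuset_enabled() {
    awk '$1 == "cpuset" && $4 == 1 { found = 1 } END { exit !found }' "${1:-/proc/cgroups}"
}

# Usage:
#   if cpuset_enabled; then
#       echo "cpuset is enabled"
#   else
#       echo "cpuset is missing or disabled; check /boot/cmdline.txt and reboot"
#   fi
```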

erikthorselius commented 7 years ago

How do I fix the "docker-flannel overlay was not installed by default" problem?

It looks like they are installed:

systemd-delta --type=extended
[EXTENDED]   /lib/systemd/system/docker.service → /etc/systemd/system/docker.service.d/overlay.conf
[EXTENDED]   /lib/systemd/system/docker.service → /usr/lib/systemd/system/docker.service.d/docker-flannel.conf

But I can't see any trace of them when the process is running, and the network layer doesn't route correctly.

ps waux |grep fd
root     18232  1.3  4.1 1001548 36484 ?       Ssl  23:03   0:04 /usr/bin/dockerd --storage-driver overlay -H fd://
erikthorselius commented 7 years ago

Looks like the drop-in fails when removing the interface.

systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled)
  Drop-In: /usr/lib/systemd/system/docker.service.d
           └─docker-flannel.conf
        /etc/systemd/system/docker.service.d
           └─overlay.conf
   Active: active (running) since Wed 2016-08-31 23:47:48 EEST; 2min 33s ago
     Docs: https://docs.docker.com
  Process: 3091 ExecStartPre=/bin/sh -c ifconfig docker0 down; brctl delbr docker0 (code=exited, status=1/FAILURE)
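
One thing worth keeping in mind when reading that Process line: with `a; b` inside `sh -c`, the reported exit status is that of the last command only, so status=1 here reflects whatever `brctl delbr docker0` returned and says nothing about the `ifconfig` step. A tiny sketch that demonstrates this with stand-in commands (`run_pair` is a made-up helper, and `true`/`false` stand in for the real commands):

```shell
#!/bin/sh
# Hypothetical helper: run two commands joined by ';' the way the
# ExecStartPre line does, then print the exit status the shell reports.
run_pair() {
    sh -c "$1; $2"
    echo "$?"
}

# Usage:
#   run_pair false true    # first command fails, second succeeds: prints 0
#   run_pair true false    # first command succeeds, second fails: prints 1
```

Running the two real commands by hand, one at a time, shows which of them actually fails and with what message.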
luxas commented 7 years ago

Is brctl installed?

erikthorselius commented 7 years ago

yes

$ brctl
Usage: brctl [commands]
commands:
    addbr       <bridge>        add bridge
    delbr       <bridge>        delete bridge
    addif       <bridge> <device>   add interface to bridge
    delif       <bridge> <device>   delete interface from bridge
    hairpin     <bridge> <port> {on|off}    turn hairpin on/off
    setageing   <bridge> <time>     set ageing time
    setbridgeprio   <bridge> <prio>     set bridge priority
    setfd       <bridge> <time>     set bridge forward delay
    sethello    <bridge> <time>     set hello time
    setmaxage   <bridge> <time>     set max message age
    setpathcost <bridge> <port> <cost>  set path cost
    setportprio <bridge> <port> <prio>  set port priority
    show        [ <bridge> ]        show a list of bridges
    showmacs    <bridge>        show a list of mac addrs
    showstp     <bridge>        show bridge stp info
    stp         <bridge> {on|off}   turn stp on/off
erikthorselius commented 7 years ago

I'm new to kubernetes and flannel, but from what I can see in ifconfig it looks like the usual docker0 interface.

ifconfig
docker0   Link encap:Ethernet  HWaddr 02:42:62:14:64:0f
          inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

but the containers look like they are alive and kicking

docker -H unix:///var/run/system-docker.sock ps
CONTAINER ID        IMAGE                     COMMAND                  CREATED             STATUS              PORTS               NAMES
d33d6ae229b3        kubernetesonarm/etcd      "/usr/local/bin/etcd "   55 minutes ago      Up 55 minutes                           k8s-etcd
fd596c1f48f9        kubernetesonarm/flannel   "/flanneld --etcd-end"   55 minutes ago      Up 55 minutes                           k8s-flannel
ghost commented 7 years ago

thanks @luxas, that got etcd/flannel going so my cluster works now

  1. it seems I don't have the kubectl binary but that was mentioned in another thread I think
  2. if you're using a remote kubectl, what are the creds to interrogate the Apimangler on https://
  3. how do you figure out what port the Dashboard is NATed to?
luxas commented 7 years ago

The kubectl binary can be downloaded with

curl -sSL https://storage.googleapis.com/kubernetes-release/release/v1.2.0/bin/linux/arm/kubectl > /usr/local/bin/kubectl

To use remote kubectl, just set KUBERNETES_MASTER=http://{rpi-ip}:8080 or use -s http://{rpi-ip}:8080

For dashboard, visit http://{master-ip}:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
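
Putting those three answers together, as a hedged sketch: the version, architecture path and port are the ones quoted above, while the helper names and the example IP address are placeholders. Note that a file written by `curl ... >` is not executable until its mode is changed.

```shell
#!/bin/sh
# Hypothetical helpers that only build the URLs mentioned above.

kubectl_url() {
    # $1 = release version (e.g. v1.2.0), $2 = architecture (e.g. arm)
    printf 'https://storage.googleapis.com/kubernetes-release/release/%s/bin/linux/%s/kubectl\n' "$1" "$2"
}

dashboard_url() {
    # $1 = master IP or hostname; 8080 is the plain-HTTP API port used in this thread
    printf 'http://%s:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard\n' "$1"
}

# Usage (as root; 192.168.1.10 is a placeholder address):
#   curl -sSL "$(kubectl_url v1.2.0 arm)" > /usr/local/bin/kubectl
#   chmod +x /usr/local/bin/kubectl
#   export KUBERNETES_MASTER=http://192.168.1.10:8080
#   kubectl get nodes
#   echo "Dashboard: $(dashboard_url 192.168.1.10)"
```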

erikthorselius commented 7 years ago

I solved my problem with flannel by changing the start of docker-flannel.conf to

[Unit]
After=flannel.service
Requires=flannel.service

Don't know why it did not work before but now it works...

erikthorselius commented 7 years ago

After doing it on the rest of the cluster I realized I was wrong.

sudo systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled)
  Drop-In: /usr/lib/systemd/system/docker.service.d
           └─docker-flannel.conf
        /etc/systemd/system/docker.service.d
           └─overlay.conf
   Active: active (running) since Wed 2016-08-31 17:39:56 EEST; 17h ago
     Docs: https://docs.docker.com
  Process: 455 ExecStartPre=/bin/sh -c ifconfig docker0 down; brctl delbr docker0 (code=exited, status=1/FAILURE)
 Main PID: 471 (dockerd)
....
HypriotOS/armv7: pirate@cluster-node02 in ~
$ sudo rm /etc/systemd/system/docker.service.d/overlay.conf
HypriotOS/armv7: pirate@cluster-node02 in ~
$ sudo systemctl daemon-reload
HypriotOS/armv7: pirate@cluster-node02 in ~
$ sudo ifconfig docker0
docker0   Link encap:Ethernet  HWaddr 02:42:ff:18:2d:84
          inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:ffff:fe18:2d84/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:31 errors:0 dropped:0 overruns:0 frame:0
          TX packets:36 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2865 (2.7 KiB)  TX bytes:5462 (5.3 KiB)

HypriotOS/armv7: pirate@cluster-node02 in ~
$ sudo systemctl restart docker
HypriotOS/armv7: pirate@cluster-node02 in ~
$ ifconfig docker0
docker0   Link encap:Ethernet  HWaddr 02:42:93:2b:eb:d7
          inet addr:10.1.86.1  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Now the network is handed over between docker and flannel.
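
A quick way to check the same thing on the other nodes is to see whether docker0 still carries Docker's default 172.17.0.1 address, as it did before the restart above. A minimal sketch, assuming the older `inet addr:` ifconfig output format shown in this thread; the helper names are made up:

```shell
#!/bin/sh
# Hypothetical helpers for reading ifconfig output in the "inet addr:" format.

# Print the first IPv4 address found in ifconfig-style text on stdin.
inet_addr() {
    sed -n 's/.*inet addr:\([0-9.]*\).*/\1/p' | head -n 1
}

# Succeed if the ifconfig text on stdin still shows Docker's default bridge address.
still_docker_default() {
    [ "$(inet_addr)" = "172.17.0.1" ]
}

# Usage:
#   if ifconfig docker0 | still_docker_default; then
#       echo "docker0 still has the Docker default address"
#   else
#       echo "docker0 address: $(ifconfig docker0 | inet_addr)"
#   fi
```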

bialad commented 7 years ago

Will the fixes from this issue be included in v0.8.0?

luxas commented 7 years ago

@bialad v0.8.0 will use my "official" code docker-multinode

You can test that as well if you want to.

bialad commented 7 years ago

@luxas I've actually been running my RPi cluster using docker-multinode so far, but since that repo doesn't follow a release schedule I've run into bugs, because I always pull the latest commit when booting my RPis. I figured I'd use this repo for setting up my core RPi cluster, and docker-multinode as a way to create amd64 worker nodes as VMs on a Windows server. Don't know if that's even possible, but time will tell. ;)

What do you mean by "my official" though? I've viewed this repo as a wrapper for kube-deploy, with stable releases. That's why I'm hoping for Hypriot v1.0 stability in the v0.8.0 release.

ebagdasa commented 7 years ago

I didn't find brctl in the .deb installation. Should I install it manually? The same goes for write.sh.

MathiasRenner commented 7 years ago

For me, after a reboot of a worker node, the routing is broken (all K8s containers that were running before the reboot are up again, which is fine).

Of the list in the first post of this thread, Flannel seems to be the only item that matters for me (I don't mount anything, cgroup_enable=cpuset is set, docker pull works fine, etc.). How do I get items 2 and 3 implemented? ping @saturnism

@erikthorselius Where does the docker-flannel.conf you mention reside? I can't find it in /etc/systemd/system/, and there's no flannel container running.

Here are some logs:

saturnism commented 7 years ago

Ah, I think I know where some of the confusion about flannel is coming from. I was working w/ kubernetes-on-arm 0.7. The flannel configuration was here: https://github.com/luxas/kubernetes-on-arm/blob/release-0.7/sdcard/rootfs/kube-systemd/etc/kubernetes/dropins/docker-flannel.conf

I think it's changed since 0.8