geerlingguy / raspberry-pi-dramble

DEPRECATED - Raspberry Pi Kubernetes cluster that runs HA/HP Drupal 8
http://www.pidramble.com/
MIT License

Use Kubernetes? #100

Closed by geerlingguy 6 years ago

geerlingguy commented 6 years ago

I've been wanting to get some real-world experience with Kubernetes (not just minikube locally, or a hosted solution like GCE or EKS), and what better way than to move my Raspberry Pi Dramble over to using k8s? This could be a fool's errand... or it could work great. I'll have to see, only one way to find out though!

geerlingguy commented 6 years ago

Ooh issue #100. Nice and even.

trinitronx commented 6 years ago

This is possible.

Since then, I haven't had the free time or enough unused Raspberry Pis to get it all working and set up. However, there are many posts online from people getting this working. You may want to check out these projects, in no particular order:

geerlingguy commented 6 years ago

Also: https://gist.github.com/alexellis/fdbc90de7691a1b9edb545c17da2d975

geerlingguy commented 6 years ago

Hmm, to install Docker on Raspbian, you currently have to use the 'convenience script'; see: https://docs.docker.com/install/linux/docker-ce/debian/#install-using-the-convenience-script
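
For reference, the convenience-script install boils down to something like this (per Docker's install docs; the script's behavior can change over time, so treat it as a sketch):

# Download and run Docker's install script (worth inspecting before running):
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh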

geerlingguy commented 6 years ago

Got Kubernetes installed, but kubelet startup is failing with:

May 23 10:44:43 kube1.pidramble.com kubelet[920]: unexpected fault address 0x15689500
May 23 10:44:43 kube1.pidramble.com kubelet[920]: fatal error: fault
May 23 10:44:43 kube1.pidramble.com kubelet[920]: [signal SIGSEGV: segmentation violation code=0x2 addr=0x15689500 pc=0x15689500]
May 23 10:44:43 kube1.pidramble.com kubelet[920]: goroutine 1 [running, locked to thread]:
May 23 10:44:43 kube1.pidramble.com kubelet[920]: runtime.throw(0x2a84a9e, 0x5)
May 23 10:44:43 kube1.pidramble.com kubelet[920]:         /usr/local/go/src/runtime/panic.go:605 +0x70 fp=0x15e2be98 sp=0x15e2be8c pc=0x3efa4
May 23 10:44:43 kube1.pidramble.com kubelet[920]: runtime.sigpanic()
May 23 10:44:43 kube1.pidramble.com kubelet[920]:         /usr/local/go/src/runtime/signal_unix.go:374 +0x1cc fp=0x15e2bebc sp=0x15e2be98 pc=0x5517c
May 23 10:44:43 kube1.pidramble.com kubelet[920]: k8s.io/kubernetes/vendor/github.com/appc/spec/schema/types.SemVer.Empty(...)
May 23 10:44:43 kube1.pidramble.com kubelet[920]:         /workspace/anago-v1.10.3-beta.0.74+2bba0127d85d5a/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/appc/spec/schema/types/semver.go:68
May 23 10:44:43 kube1.pidramble.com kubelet[920]: k8s.io/kubernetes/vendor/github.com/appc/spec/schema/types.NewSemVer(0x15816ec0, 0x20945b4, 0x2a8fbcf, 0xb, 0x15a76870)
May 23 10:44:43 kube1.pidramble.com kubelet[920]:         /workspace/anago-v1.10.3-beta.0.74+2bba0127d85d5a/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/appc/spec/schema/types/semver.go:41 +0x90 fp=0x
May 23 10:44:43 kube1.pidramble.com kubelet[920]: goroutine 5 [chan receive]:
May 23 10:44:43 kube1.pidramble.com kubelet[920]: k8s.io/kubernetes/vendor/github.com/golang/glog.(*loggingT).flushDaemon(0x4551f48)
May 23 10:44:43 kube1.pidramble.com kubelet[920]:         /workspace/anago-v1.10.3-beta.0.74+2bba0127d85d5a/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/golang/glog/glog.go:879 +0x70
May 23 10:44:43 kube1.pidramble.com kubelet[920]: created by k8s.io/kubernetes/vendor/github.com/golang/glog.init.0
May 23 10:44:43 kube1.pidramble.com kubelet[920]:         /workspace/anago-v1.10.3-beta.0.74+2bba0127d85d5a/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/golang/glog/glog.go:410 +0x1a0
May 23 10:44:43 kube1.pidramble.com kubelet[920]: goroutine 69 [syscall]:
May 23 10:44:43 kube1.pidramble.com kubelet[920]: os/signal.signal_recv(0x2bd146c)
May 23 10:44:43 kube1.pidramble.com kubelet[920]:         /usr/local/go/src/runtime/sigqueue.go:131 +0x134
May 23 10:44:43 kube1.pidramble.com kubelet[920]: os/signal.loop()
May 23 10:44:43 kube1.pidramble.com kubelet[920]:         /usr/local/go/src/os/signal/signal_unix.go:22 +0x14
May 23 10:44:43 kube1.pidramble.com kubelet[920]: created by os/signal.init.0
May 23 10:44:43 kube1.pidramble.com kubelet[920]:         /usr/local/go/src/os/signal/signal_unix.go:28 +0x30
May 23 10:44:43 kube1.pidramble.com systemd[1]: kubelet.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
May 23 10:44:43 kube1.pidramble.com systemd[1]: kubelet.service: Unit entered failed state.
May 23 10:44:43 kube1.pidramble.com systemd[1]: kubelet.service: Failed with result 'exit-code'.
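
For anyone following along, that output comes straight from the standard systemd tooling; nothing Kubernetes-specific is needed to pull it:

# Check the unit's state and tail its recent log entries.
sudo systemctl status kubelet
sudo journalctl -u kubelet --no-pager | tail -n 50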
geerlingguy commented 6 years ago

See also: https://blog.hypriot.com/post/setup-kubernetes-raspberry-pi-cluster/

geerlingguy commented 6 years ago

Manually running the following command results in the same thing:

/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --cluster-dns=10.96.0.10 --cluster-domain=cluster.local --authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt --cadvisor-port=0 --rotate-certificates=true --cert-dir=/var/lib/kubelet/pki
geerlingguy commented 6 years ago

Link to line of code where it looks like there's an empty version being passed, causing this error: https://github.com/appc/spec/blob/master/schema/types/semver.go#L68

geerlingguy commented 6 years ago

When manually running kubeadm init ...:

[preflight] Running pre-flight checks.
    [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.05.0-ce. Max validated version: 17.03
    [WARNING KubeletVersion]: couldn't get kubelet version: exit status 2
    [WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl

Got through that by running sudo kubeadm init --token-ttl=0 --apiserver-advertise-address=PI_IP_HERE --ignore-preflight-errors=all

(Following this setup guide, which seems to cover about the same things my roles do anyway... https://gist.github.com/alexellis/fdbc90de7691a1b9edb545c17da2d975).

geerlingguy commented 6 years ago

And later in the init process:

[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp [::1]:10255: getsockopt: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
...
geerlingguy commented 6 years ago

Trying out an older version as per https://gist.github.com/alexellis/fdbc90de7691a1b9edb545c17da2d975#gistcomment-2595025

Might open an upstream issue, as it looks like nobody else has yet...

geerlingguy commented 6 years ago

Here are all the steps I've taken: https://gist.github.com/alexellis/fdbc90de7691a1b9edb545c17da2d975#gistcomment-2598596

Still not able to get a fully functional K8s cluster, though; I can't get the Flannel networking up and running, and there are plenty of errors from kubelet visible via journalctl -f.

geerlingguy commented 6 years ago

Disabled the firewall for now, and the only errors I'm now seeing from kubelet are:

May 23 22:10:08 kube1.pidramble.com kubelet[6843]: W0523 22:10:08.713818    6843 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
May 23 22:10:08 kube1.pidramble.com kubelet[6843]: E0523 22:10:08.714480    6843 kubelet.go:2125] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

And the main node is currently reporting 'NotReady' for its status:

# kubectl get nodes
NAME                  STATUS     ROLES     AGE       VERSION
kube1.pidramble.com   NotReady   <none>    19m       v1.10.2
geerlingguy commented 6 years ago

So a quick fix for that is doing the following (from this comment: https://github.com/kubernetes/kubernetes/issues/43815#issuecomment-290235245):

  1. Edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

  2. Comment the existing KUBELET_NETWORK_ARGS line and add a line right below it:

     Environment="KUBELET_NETWORK_ARGS="
  3. Reload config and restart kubelet:

     sudo systemctl daemon-reload
     sudo systemctl restart kubelet

However, a later summary in that same issue (https://github.com/kubernetes/kubernetes/issues/43815#issuecomment-317501985) says you need to install a pod network... which I did using Flannel, but it seems that might not be working on the Pi like it did in my local Vagrant test rig :(

geerlingguy commented 6 years ago

Testing with a 'hello world' worked!

kubectl run pi --image=perl --restart=OnFailure -- perl -Mbignum=bpi -wle 'print bpi(2)'

Then after a couple minutes (and a lot of Pi CPU usage and iowait! I'm guessing running things off something other than a microSD card would be waaaaay faster):

# kubectl describe jobs/pi
Name:           pi
Namespace:      default
Selector:       controller-uid=8e3ca95c-5f6b-11e8-b8e4-b827ebcd0930
Labels:         run=pi
Annotations:    <none>
Parallelism:    1
Completions:    1
Start Time:     Thu, 24 May 2018 16:00:21 +0000
Pods Statuses:  0 Running / 1 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=8e3ca95c-5f6b-11e8-b8e4-b827ebcd0930
           job-name=pi
           run=pi
  Containers:
   pi:
    Image:      perl
    Port:       <none>
    Host Port:  <none>
    Args:
      perl
      -Mbignum=bpi
      -wle
      print bpi(2)
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  3m    job-controller  Created pod: pi-vzf8k

# kubectl logs pi-vzf8k
3.1
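
To clean up the test afterwards, deleting the Job also removes its pod (kubectl run --restart=OnFailure creates a Job named pi):

kubectl delete job pi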
geerlingguy commented 6 years ago

Also getting a lot of:

May 24 16:09:42 kube1.pidramble.com kernel: Under-voltage detected! (0x00050005)

So I might want to switch to a dedicated 2.4A power supply (right now I'm using my multi-port PowerAdd supply... which has normally been pretty stable, but might not be able to deliver the full current needed to run the Pi 3 B+ under heavy CPU and I/O load!).

geerlingguy commented 6 years ago

Things seem stable after a restart as well; though every time I reboot it takes a long time (~5 minutes) before the master node reports Ready status, and while that's happening all the kubelet requests (as well as things like kubectl get nodes) fail with:

Unable to connect to the server: net/http: TLS handshake timeout

See related: https://github.com/Azure/AKS/issues/112 (though that could be something entirely different... it seems to happen after major changes or upgrades, so basically it seems Kubernetes' plumbing has to rejigger all the TLS stuff on any reboot or upgrade, maybe).
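
In the meantime, a crude way to ride out that window after a reboot is to just poll until the API server answers (a throwaway loop, not something in the playbook):

# Keep retrying until the API server responds, then show node status.
until kubectl get nodes >/dev/null 2>&1; do
  echo "API server not ready yet; waiting..."
  sleep 10
done
kubectl get nodes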

geerlingguy commented 6 years ago

Yay!

[Screenshot: Drupal 8 running on the Pi Kubernetes cluster (drupal-8-pi-kubernetes)]

Single node cluster for now (one master)... I'll do a few quick benchmarks, then later today or this week I'll look into adding a few more of my nodes. Right now I just wanted to get it all reproducible and runnable!

geerlingguy commented 6 years ago

Dug up an upstream issue: https://github.com/ansible/ansible/issues/40684

geerlingguy commented 6 years ago

Using benchmarks from this page: http://www.pidramble.com/wiki/benchmarks/drupal

(Compare this to the Single Pi benchmarks, which I just ran with the old Drupal Pi stack a couple months ago—149 and 15.32 req/s, respectively.)

So not bad... it's almost 50% overhead to run on a single node with Kubernetes master on that node. I want to see what happens if I have a 2nd node, 3rd node, etc, along with some horizontal pod autoscaling. I'll be working on that next, but going to pause for a bit since I have a 100% functional setup at this point (see the kubernetes branch in this repo, or PR #102, for current progress).

Update: Tested with 5 node cluster, Drupal pod on kube2, MySQL pod on kube5. Note that I have MySQL's PV affinity set to stick to kube5, but technically I could scale Drupal using a replicaset... however, the official library drupal container is not currently configured for running a live site correctly. I'll need to work on building a proper Drupal image + codebase as part of the overall playbook first...

Here are the results with kube1 being master (no pods), kube2 running Drupal, kube5 running MySQL:

Weird that the anonymous benchmark result was so much lower. I wonder if Flannel networking could be adding a bit of latency somehow? I'm hitting the server IP directly and accessing via the NodePort, so it's not even communicating across hosts...
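
For anyone wanting to reproduce a similar load test, a plain Apache Bench run against the Drupal front page gets you in the ballpark (the hostname, NodePort, request count, and concurrency here are illustrative, not the exact invocation from the wiki page):

# Anonymous traffic: hammer the front page through the NodePort.
ab -n 500 -c 10 http://kube2.pidramble.com:31746/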

geerlingguy commented 6 years ago

Ha, now I'm getting errors because one of Raspbian's mirrors is down:

TASK [Ensure dependencies are installed.] ******************************************************************************
failed: [10.0.100.44] (item=[u'sudo', u'openssh-server']) => changed=false 
  cmd: apt-get install python-apt -y -q
  item:
  - sudo
  - openssh-server
  msg: |-
    E: Failed to fetch http://mirror.glennmcgurrin.com/raspbian/pool/main/g/gnupg2/dirmngr_2.1.18-8~deb9u1_armhf.deb  Something wicked happened resolving 'mirror.glennmcgurrin.com:http' (-5 - No address associated with hostname)
    E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
  rc: 100
  stderr: |-
    E: Failed to fetch http://mirror.glennmcgurrin.com/raspbian/pool/main/g/gnupg2/dirmngr_2.1.18-8~deb9u1_armhf.deb  Something wicked happened resolving 'mirror.glennmcgurrin.com:http' (-5 - No address associated with hostname)
    E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
  stderr_lines:
  - 'E: Failed to fetch http://mirror.glennmcgurrin.com/raspbian/pool/main/g/gnupg2/dirmngr_2.1.18-8~deb9u1_armhf.deb  Something wicked happened resolving ''mirror.glennmcgurrin.com:http'' (-5 - No address associated with hostname)'
  - 'E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?'
  stdout: |-
    Reading package lists...
    Building dependency tree...
    Reading state information...
    The following additional packages will be installed:
      dirmngr
    Suggested packages:
      dbus-user-session pinentry-gnome3 tor python-apt-dbg python-apt-doc
    The following NEW packages will be installed:
      dirmngr python-apt
    0 upgraded, 2 newly installed, 0 to remove and 8 not upgraded.
    Need to get 703 kB of archives.
    After this operation, 1526 kB of additional disk space will be used.
    Err:1 http://mirror.glennmcgurrin.com/raspbian stretch/main armhf dirmngr armhf 2.1.18-8~deb9u1
      Something wicked happened resolving 'mirror.glennmcgurrin.com:http' (-5 - No address associated with hostname)
    Get:2 http://mirror.glennmcgurrin.com/raspbian stretch/main armhf python-apt armhf 1.1.0~beta5 [157 kB]
    Fetched 157 kB in 1s (91.5 kB/s)
  stdout_lines: <omitted>

Had to sudo nano /etc/apt/sources.list, paste in a different mirror from the Raspbian mirrors list—I chose deb http://reflection.oss.ou.edu/raspbian/raspbian/ stretch main contrib non-free rpi—then run sudo apt-get update on the Pi to pick up the new mirror.
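
The same fix can be scripted instead of edited by hand; a sed along these lines works, assuming the stock raspbian.raspberrypi.org entry is what's in sources.list (the replacement mirror is arbitrary):

# Swap the default Raspbian mirror for a specific one, then refresh the package lists.
sudo sed -i 's|http://raspbian.raspberrypi.org/raspbian/|http://reflection.oss.ou.edu/raspbian/raspbian/|' /etc/apt/sources.list
sudo apt-get update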

geerlingguy commented 6 years ago

Aha! To get Flannel working... I had to:

curl -sSL "https://github.com/coreos/flannel/blob/master/Documentation/kube-flannel.yml?raw=true" | sed "s/amd64/arm/g" | kubectl create -f -

(Basically, download the flannel.yml file, and replace all occurrences of amd64 with arm, then apply that.)

See: https://github.com/coreos/flannel/issues/663#issuecomment-299593569
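
After applying that, a quick sanity check is making sure the Flannel DaemonSet actually schedules a pod on every node:

kubectl get daemonset --namespace kube-system
kubectl get pods --namespace kube-system -o wide | grep flannel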

geerlingguy commented 6 years ago

Adding a task for now:

    # TODO: See https://github.com/coreos/flannel/issues/663
    - name: Apply Flannel CNI networking to the cluster.
      shell: >
        curl -sSL "https://github.com/coreos/flannel/blob/master/Documentation/kube-flannel.yml?raw=true" |
        sed "s/amd64/arm/g" |
        kubectl apply -f -
      register: flannel_result
      changed_when: "'created' in flannel_result.stdout"
      run_once: True
geerlingguy commented 6 years ago

So, some things to continue working on:

  1. PV for Drupal codebase.
  2. Set up Drupal codebase (maybe using composer project?) in the PV (so it doesn't go away when the Pod is cycled).
  3. Make sure we have Drush as part of the Drupal codebase (so it could technically be used with kubectl exec; see the sketch below).
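
For item 3, the rough shape I have in mind is just exec-ing into whichever Drupal pod is running; the label selector and the vendor/bin/drush path below are assumptions about how the codebase will eventually be laid out, not something that exists yet:

# Hypothetical: find the first Drupal pod and run Drush inside it.
DRUPAL_POD=$(kubectl get pods -l app=drupal8,tier=frontend -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it "$DRUPAL_POD" -- vendor/bin/drush status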
geerlingguy commented 6 years ago

Another project to refer people to, by the wonderful @chris-short: https://rak8s.io / https://github.com/rak8s/rak8s

geerlingguy commented 6 years ago

Well that was easy enough...

root@kube1:/home/pi# kubectl get nodes
NAME                  STATUS    ROLES     AGE       VERSION
kube1.pidramble.com   Ready     master    21h       v1.10.2
kube2.pidramble.com   Ready     <none>    10m       v1.10.2

And to test that it's working, I ran the perl pi job again:

root@kube1:/home/pi# kubectl get pods -o wide
NAME                             READY     STATUS              RESTARTS   AGE       IP           NODE
drupal8-5cbd76cb5b-k8kn8         1/1       Running             1          21h       10.244.0.5   kube1.pidramble.com
drupal8-mysql-788d8dd84b-75kdt   1/1       Running             1          21h       10.244.0.6   kube1.pidramble.com
pi-lp26j                         0/1       ContainerCreating   0          8s        <none>       kube2.pidramble.com

I also killed the kube2 node and restarted it... worked well:

root@kube1:/mnt/nfs# kubectl get nodes
NAME                  STATUS     ROLES     AGE       VERSION
kube1.pidramble.com   Ready      master    22h       v1.10.2
kube2.pidramble.com   NotReady   <none>    23m       v1.10.2
root@kube1:/mnt/nfs# kubectl get nodes
NAME                  STATUS    ROLES     AGE       VERSION
kube1.pidramble.com   Ready     master    22h       v1.10.2
kube2.pidramble.com   Ready     <none>    25m       v1.10.2

Note that the Pi kept re-enabling swap even after I stopped the swap service and turned swap off multiple times. It seems the Pi is determined to keep some swap space available (99m in this case) after a reboot no matter what!

geerlingguy commented 6 years ago

So, first hurdle when pods get spread out...

root@kube1:~# kubectl get pods -o wide
NAME                             READY     STATUS    RESTARTS   AGE       IP           NODE
drupal8-5cbd76cb5b-lz7dx         1/1       Running   0          18m       10.244.1.3   kube2.pidramble.com
drupal8-mysql-788d8dd84b-hrfln   1/1       Running   0          18m       10.244.0.8   kube1.pidramble.com

When I try installing Drupal, I get the error from the Drupal installer:

SQLSTATE[HY000] [2002] php_network_getaddresses: getaddrinfo failed: Temporary failure in name resolution.

I thought the hostname drupal8-mysql would still work even if the pod is on a different node. Also, it's slightly annoying that NodePorts are kind of dynamic: you have to use the node's external IP along with the NodePort, and there's no magical routing from any other node to the node where the service's NodePort resides... I'm going to have to do a deep dive into some Kubernetes/Flannel networking and DNS docs!
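
Debugging-wise, the first things I'd check are whether the Drupal pod can resolve the service name at all, and whether kube-dns itself is healthy (the pod name is copied from the listing above; the k8s-app=kube-dns label is the usual one for the DNS pods, but worth double-checking on this cluster):

# Can the Drupal pod resolve the MySQL service name?
kubectl exec -it drupal8-5cbd76cb5b-lz7dx -- getent hosts drupal8-mysql
# Is the cluster DNS add-on actually running?
kubectl get pods --namespace kube-system -l k8s-app=kube-dns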

geerlingguy commented 6 years ago

Testing on the full stack now... Just ran out to Micro Center to grab 5 Pi model 3 B+'s to replace the model 3's in my existing cluster. We'll see how it goes—the playbook's running now!

geerlingguy commented 6 years ago

Woot, first try!

root@kube1:/home/pi# kubectl get nodes
NAME                  STATUS    ROLES     AGE       VERSION
kube1.pidramble.com   Ready     master    28m       v1.10.2
kube2.pidramble.com   Ready     <none>    28m       v1.10.2
kube3.pidramble.com   Ready     <none>    28m       v1.10.2
kube4.pidramble.com   Ready     <none>    28m       v1.10.2
kube5.pidramble.com   Ready     <none>    28m       v1.10.2

root@kube1:/home/pi# kubectl get services
NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
drupal8         NodePort    10.105.7.230   <none>        80:31746/TCP   27m
drupal8-mysql   ClusterIP   None           <none>        3306/TCP       27m
kubernetes      ClusterIP   10.96.0.1      <none>        443/TCP        29m

NAME                             READY     STATUS    RESTARTS   AGE       IP           NODE
drupal8-5cbd76cb5b-g5bw9         1/1       Running   0          27m       10.244.3.2   kube5.pidramble.com
drupal8-mysql-788d8dd84b-xs975   0/1       Pending   0          27m       <none>       <none>

Looks like MySQL isn't too happy, unfortunately. But I've also switched the master to not run pods for now, so it preserves performance for Kubernetes/kubelet.

Some more on the MySQL issue:

root@kube1:/home/pi# kubectl describe pod drupal8-mysql-788d8dd84b-xs975
Name:           drupal8-mysql-788d8dd84b-xs975
Namespace:      default
Node:           <none>
Labels:         app=drupal8
                pod-template-hash=3448488406
                tier=mysql
Annotations:    <none>
Status:         Pending
IP:             
Controlled By:  ReplicaSet/drupal8-mysql-788d8dd84b
Containers:
  mysql:
    Image:      hypriot/rpi-mysql:5.5
    Port:       3306/TCP
    Host Port:  0/TCP
    Environment:
      MYSQL_DATABASE:       drupal
      MYSQL_USER:           drupal
      MYSQL_PASSWORD:       <set to the key 'password' in secret 'drupal8-mysql-pass'>       Optional: false
      MYSQL_ROOT_PASSWORD:  <set to the key 'password' in secret 'drupal8-mysql-root-pass'>  Optional: false
    Mounts:
      /var/lib/mysql from mysql-persistent-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-s9mvh (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  mysql-persistent-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  drupal8-mysql
    ReadOnly:   false
  default-token-s9mvh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-s9mvh
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  28m                default-scheduler  pod has unbound PersistentVolumeClaims (repeated 3 times)
  Warning  FailedScheduling  28m (x3 over 28m)  default-scheduler  0/5 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 1 node(s) were not ready, 3 node(s) had volume node affinity conflict.
  Warning  FailedScheduling  28m (x3 over 28m)  default-scheduler  0/5 nodes are available: 5 node(s) were not ready.
  Warning  FailedScheduling  3m (x88 over 28m)  default-scheduler  0/5 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 4 node(s) had volume node affinity conflict.

See that last FailedScheduling event: the 'volume node affinity conflict' part looks a little... interesting. Maybe I need to find a way to integrate the NFS storage with K8s instead of hacking around using local-storage, because I'll definitely get burned in the latter case.

Or maybe add some sort of affinity towards kube5 or something for MySQL? Would that do the trick? Isn't Kubernetes just supposed to be magic? ;)

geerlingguy commented 6 years ago

It was the node affinity, for sure; updated to point it at kube5.pidramble.com and that worked.

Also looking into NFS-based volumes, and it looks like if I:

Then I can use PVCs to mount things in containers for persistence (e.g. if I need multiple Drupal site pods hitting one shared files dir, or multiple Drupal sites (multisite or otherwise) hitting different shared files dirs). See more: Using Persistent Volumes on bare metal.
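
A minimal sketch of what that could look like, assuming an NFS export is already being shared from somewhere on the network (the server IP, export path, and sizes below are placeholders):

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: drupal8-files
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.100.1
    path: /mnt/nfs/drupal8-files
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: drupal8-files
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
EOF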

geerlingguy commented 6 years ago

Also annoying... every reboot I have to disable swap again before kubelet will start happily... and in desperation, I am now asking on the RPi Stack Exchange site: 'How to permanently disable swap on Raspbian Stretch Lite?'
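
For reference, the usual answer for stock Raspbian Stretch Lite (where dphys-swapfile manages swap) is roughly:

sudo dphys-swapfile swapoff
sudo dphys-swapfile uninstall
sudo systemctl disable dphys-swapfile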

geerlingguy commented 6 years ago

Got the swap thing sorted; I needed to use Ansible's shell module instead of command because I had &&s in my commands (forgot about that, oops!).

I think I might bless the 'official' K8s branch so I can file multiple issues instead of this one giant one. Closing this out, as the PoC is working and I'm pretty pleased with how it turned out. Next step is adding a few issues to work on things like an ingress controller and a proper Drupal installation.