vigevenoj / k8sharkbait

Spins up a kubernetes cluster with applications on Linodes from scratch

What is this?

The single-sentence description: start a kubernetes cluster with a dedicated, certificate-authenticated etcd cluster, IPsec between nodes, and glusterfs on Linode by running a python script and a few ansible playbooks. That is probably more interesting to people than what I'm running on the end result.

The ansible directory contains the up.py script and playbooks used to bootstrap the environment.

The manifests directory contains configuration to run several applications using kubernetes, and to access them from the web. The applications are:

  1. Single-serving site "whatcolorischristinashair.com", the original purpose for this whole idea
  2. Mosquitto MQTT broker
  3. owntracks2db, for storing Owntracks location updates into a database

Prerequisites

Really, all you need are:

  1. A Linode API token
  2. A computer with python 3 and ansible
  3. A private certificate authority, or the ability to generate some certificates since the preflight will give you virtual machines to run it on

Preflight

This is applicable to people who want to run a Kubernetes cluster on a group of virtual machines at Linode.

Cluster creation

You'll need a Linode account and a valid API token in addition to some other details in a configuration file that you should not check into source control. An example config.yaml:

linode:
  token: 
  # Use this to group all of the VMs used for your cluster
  group: fancy-birdbath
  region: us-east-1a
  # g5-standard-1 is 2048mb ram/1vcpu, $10/month
  # g5-nanode-1 is 1024mb ram/1vcpu, $5/month
  type: g5-standard-1
  distro: linode/ubuntu17.04
  root_pass: 
  root_ssh_key: 

# Put your kubernetes cluster information here, we'll use it later
cluster:
  # This is the number of nodes in the cluster, total (including master)
  nodes: 3
  # The cluster requires a domain that has DNS configured via Linode
  domain: example.com

With your config in place, run python up.py to generate your cluster nodes and ansible inventory as specified.

Next, you need to use a certificate authority to generate certificates (and keys...) for each member of the cluster. Add those to ansible/roles/ipsec/files, along with the certificate of the certificate authority that signed them. I'm glossing over this step because I run a CA for personal projects, but you can use something like https://github.com/radiac/caman to do this step.
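If you don't have a CA handy, a rough sketch of this step with plain openssl looks something like the following; the CA name, node names, and file names are illustrative, so check the ipsec role's notes for the names it actually expects:

# create a throwaway CA (skip this if you already run one)
openssl genrsa -out ca.key 4096
openssl req -x509 -new -key ca.key -sha256 -days 3650 -subj "/CN=k8sharkbait-ca" -out ca.crt

# one key/certificate pair per cluster member
for node in node1 node2 node3; do          # replace with your node names
  openssl genrsa -out "${node}.key" 2048
  openssl req -new -key "${node}.key" -subj "/CN=${node}" -out "${node}.csr"
  openssl x509 -req -in "${node}.csr" -CA ca.crt -CAkey ca.key -CAcreateserial -days 825 -out "${node}.crt"
done

# copy the CA certificate and the per-node material into the ipsec role
cp ca.crt node*.key node*.crt ansible/roles/ipsec/files/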

Cluster bootstrap

Now run the playbooks:

  1. playbooks/minimal-bootstrap.yml: This must be run once, prior to any other playbooks, in order to ensure that every host has a unique machine id
  2. site.yml: This configures a baseline for each host, configures certificate-based IPsec transport between cluster members, builds an etcd cluster with certificate authentication on three hosts, and configures kubelet on the hosts which will be kubernetes nodes.
  3. kubeadm.yml: This bootstraps the kubernetes cluster using kubeadm. Following this playbook, kubectl get nodes should return all hosts as nodes in the Ready state.
  4. storage.yml: This playbook labels the kubernetes nodes as storage nodes, creates a daemonset for glusterfs, and creates a Heketi cluster for volume management using the topology generated by the up.py script run initially.

Note that the playbooks require the controller (local machine) to have the python 'netaddr' package installed.
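Putting that together, a typical run looks something like this; the inventory path is a placeholder for whatever up.py generated:

# controller-side dependency for the playbooks
pip install netaddr

# run the playbooks in order against the generated inventory
ansible-playbook -i inventory playbooks/minimal-bootstrap.yml
ansible-playbook -i inventory site.yml
ansible-playbook -i inventory kubeadm.yml
ansible-playbook -i inventory storage.yml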

Inter-node traffic encryption

The ansible playbook configures certificate-based IPsec encapsulation of traffic between the nodes on their internal (private) addresses via the ipsec role. Certificates need to be generated ahead of time, and the chain of trust up to a root needs to be copied onto the nodes in addition to each node's certificate and key. There are some notes in the role with more details.

Storage

Persistent storage is managed via 20gb volumes attached to nodes as unformatted block devices. This is handled via some internal tooling during the preflight. It generates the necessary topology.json required for Heketi to use the volumes for glusterfs.
Once the cluster is bootstrapped and Kubernetes is running, the storage playbook runs the gk-deploy script from gluster-kubernetes and heketi-cli from Heketi.

Once storage is online, create a StorageClass to fulfill prerequisite #3. storageclass.yaml will work for this, but it requires the IP address of the Heketi service from kubernetes.
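A minimal sketch of such a StorageClass, assuming Heketi answers on port 8080 at its service's cluster IP; the class name and address are placeholders, and the repo's storageclass.yaml should be preferred:

# find the cluster IP of the Heketi service, then substitute it below
kubectl get svc

kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-storage
provisioner: kubernetes.io/glusterfs
parameters:
  # replace with the cluster IP of the Heketi service
  resturl: "http://10.0.0.1:8080"
EOF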

Status Check

Service and Application Deployment

Deploy postgresql with helm:

helm install --name basic-database stable/postgresql

Deploy Traefik

First, update the ConfigMap in traefik-configmap.yaml to include the domains you plan on managing via Traefik, since TLS certificates for those domains are obtained via Let's Encrypt.
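The relevant part of that ConfigMap is the traefik.toml it carries. A rough sketch follows, using Traefik 1.x syntax; the ConfigMap name, e-mail address, domain, and storage path are placeholders, and the real file in traefik-configmap.yaml may differ:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: traefik-conf          # match the name used by traefik-configmap.yaml
data:
  traefik.toml: |
    defaultEntryPoints = ["http", "https"]
    [entryPoints]
      [entryPoints.http]
      address = ":80"
      [entryPoints.https]
      address = ":443"
        [entryPoints.https.tls]
    [acme]
    email = "you@example.com"
    storage = "/acme/acme.json"
    entryPoint = "https"
    [acme.httpChallenge]
    entryPoint = "http"
    # one [[acme.domains]] block per domain Traefik should manage
    [[acme.domains]]
    main = "example.com"
EOF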
Next, ensure that a persistent volume claim exists for Traefik to store its certificate information by applying traefik-pvc.yaml:
kubectl apply -f traefik-pvc.yaml
Then apply the ConfigMap you updated previously:
kubectl apply -f traefik-configmap.yaml
Finally, ensure that Traefik has the necessary roles assigned:
kubectl apply -f traefik-rbac.yaml

Create a deployment for the controller: kubectl apply -f traefik-ingress-controller_deployment.yaml

Deploy the Traefik UI service

kubectl apply -f traefik-ui_service.yaml

Deploy whatcolorischristinashair

This is both a joke and the reason that this project exists. For additional details, see whatcolorischristinashair. The deployment and service are managed via haircolor.yaml, and a Traefik ingress is managed via haircolor-ingress.yaml:

kubectl apply -f haircolor.yaml
kubectl apply -f haircolor-ingress.yaml

Updating side-loaded images that are not publicly available

For images that are built privately and then side-loaded into the cluster, the following steps need to happen:

  1. Build the docker image with docker build
  2. Save the image locally with docker save
  3. scp the image to the cluster nodes
  4. Load the image into each node's local Docker image store with docker load
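As a concrete sketch of those steps; the image name and node hostnames are placeholders:

docker build -t owntracks2db:latest .
docker save owntracks2db:latest -o owntracks2db.tar

for node in node1 node2 node3; do          # your cluster nodes
  scp owntracks2db.tar root@"${node}":/tmp/
  ssh root@"${node}" docker load -i /tmp/owntracks2db.tar
done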

Future work

in no particular order

References

Notes

Adding basic auth for an ingress
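One way to do this with the Traefik 1.x ingress controller deployed above; the user, secret, and ingress names are illustrative:

# create an htpasswd file and store it in a secret
htpasswd -c ./auth someuser
kubectl create secret generic haircolor-basic-auth --from-file auth

# then add these annotations to the ingress (e.g. in haircolor-ingress.yaml):
#   ingress.kubernetes.io/auth-type: "basic"
#   ingress.kubernetes.io/auth-secret: "haircolor-basic-auth"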

Troubleshooting

Copying data from a gluster-backed volume
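One approach, assuming the data is reachable through a running pod that mounts the gluster-backed volume; the namespace, pod name, and paths are placeholders:

# copy the mounted path out of the pod to the local machine
kubectl cp default/basic-database-postgresql-0:/var/lib/postgresql ./gluster-backup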

All gluster volumes have gone read-only

This is usually caused by an unplanned reboot of the physical hardware underneath the host VM. Note that the gluster logs may only indicate some sort of networking problem (connection refused to the remote bricks making up the rest of the volume), but the actual problem is that the process that should be listening is not running on the remote hosts.

Certificates have expired

Don't let this happen. See https://github.com/kubernetes/kubeadm/issues/581#issuecomment-421477139 for a partial resolution. Rebooting the nodes after deploying new certificates to a 1.10 cluster may result in the nodes losing their keys, requiring either re-generation of keys or rebuilding the cluster. Check the kubelet logs for something like:

Jun 26 05:32:45 robot-ghost-poop-2 kubelet[24333]: E0626 05:32:45.781780 24333 bootstrap.go:179] Unable to read existing bootstrap client config: invalid configuration: [unable to read client-cert /var/lib/kubelet/pki/kubelet-client.crt for default-auth due to open /var/lib/kubelet/pki/kubelet-client.crt: no such file or directory, unable to read client-key /var/lib/kubelet/pki/kubelet-client.key for default-auth due to open /var/lib/kubelet/pki/kubelet-client.key: no such file or directory]
Jun 26 05:32:45 robot-ghost-poop-2 kubelet[24333]: F0626 05:32:45.821314 24333 server.go:233] failed to run Kubelet: cannot create certificate signing request: Unauthorized

If that appears, use kubeadm token create on the master node to create a token, then kubeadm join on the affected node using that token and whatever flags are required to successfully join the cluster (likely --ignore-preflight-errors=all to ignore the existing kubernetes pki ca.crt, kubelet.conf, and bootstrap-kubelet.conf, the missing crictl for checking the container runtime, and swap being enabled; and --discovery-token-unsafe-skip-ca-verification because the cluster is no longer using the previous certificate authority).
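In shell terms, the recovery looks roughly like this; the API server address is a placeholder:

# on the master:
kubeadm token create

# on the affected node, using the token printed above
kubeadm join 192.0.2.10:6443 --token "${TOKEN}" \
  --ignore-preflight-errors=all \
  --discovery-token-unsafe-skip-ca-verification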

Relevant bug reports