Single-sentence description? Start a Kubernetes cluster with a dedicated certificate-auth-enabled etcd cluster, IPsec between nodes, and glusterfs storage on Linode by running a Python script and a few Ansible playbooks. That is probably more interesting to people than what I'm actually running on the end result.
The ansible directory contains the up.py script and playbooks used to bootstrap the environment.
The manifests directory contains configuration to run several applications on Kubernetes and to access them from the web; the applications are described below.
This is applicable to people who want to run a Kubernetes cluster on a group of virtual machines at Linode. Really, all you need are a Linode account and a valid API token, plus some other details in a configuration file that you should not check into source control. An example config.yaml:
linode:
  token:
  # Use this to group all of the VMs used for your cluster
  group: fancy-birdbath
  region: us-east-1a
  # g5-standard-1 is 2048mb ram/1vcpu, $10/month
  # g5-nanode-1 is 1024mb ram/1vcpu, $5/month
  type: g5-standard-1
  distro: linode/ubuntu17.04
  root_pass:
  root_ssh_key:
# Put your kubernetes cluster information here, we'll use it later
cluster:
  # This is the number of nodes in the cluster, total (including master)
  nodes: 3
  # The cluster requires a domain that has DNS configured via Linode
  domain: example.com
With your config in place, run python up.py to generate your cluster nodes and Ansible inventory as specified.
Next, you need to use a certificate authority to generate certificates (and keys) for each member of the cluster. Add those to ansible/roles/ipsec/files, along with the certificate of the certificate authority that signed them. I'm glossing over this step because I run a CA for personal projects, but you can use something like https://github.com/radiac/caman for it.
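I'm not prescribing a particular tool, but as a rough sketch, per-node certificates could be produced with openssl against an existing CA; the file names and subject below are illustrative only:

# assumes ca.crt and ca.key already exist; repeat per node
NODE=node1.example.com
openssl genrsa -out "${NODE}.key" 2048
openssl req -new -key "${NODE}.key" -subj "/CN=${NODE}" -out "${NODE}.csr"
openssl x509 -req -in "${NODE}.csr" -CA ca.crt -CAkey ca.key -CAcreateserial -days 365 -out "${NODE}.crt"
cp ca.crt "${NODE}.crt" "${NODE}.key" ansible/roles/ipsec/files/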
Now run the playbooks:
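The inventory path and playbook name below are placeholders rather than the repository's actual file names; check the ansible directory for the inventory up.py generated and the playbooks it contains:

ansible-playbook -i <generated inventory> <playbook>.yml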
Once the playbooks complete, kubectl get nodes should return all hosts as nodes in the Ready state. Note that the playbooks require the controller (the local machine) to have the Python netaddr package installed.
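The netaddr dependency can be installed on the controller with pip:

pip install netaddr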
The ansible playbook configures certificate-based IPsec encapsulation of traffic between the nodes on their internal (private) addresses via the ipsec role. Certificates need to be generated ahead of time, and the chain of trust up to a root needs to be copied onto the nodes in addition to each node's certificate and key. There are some notes in the role with more details.
Persistent storage is managed via 20GB volumes attached to the nodes as unformatted block devices. This is handled by some internal tooling during the preflight, which generates the topology.json required for Heketi to use the volumes for glusterfs.
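For reference, a minimal Heketi topology for a single node might look like the following; the hostname, address, and device path are placeholders, and the preflight tooling generates the real file:

{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": ["node1.example.com"],
              "storage": ["192.168.128.1"]
            },
            "zone": 1
          },
          "devices": ["/dev/sdc"]
        }
      ]
    }
  ]
}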
Once the cluster is bootstrapped and Kubernetes is running, the storage playbook runs the gk-deploy script from gluster-kubernetes and heketi-cli from Heketi.
Once storage is online, create a StorageClass to fulfill prerequisite #3. storageclass.yaml will work for this, but it requires the IP address of the Heketi service from Kubernetes.
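To get that IP (assuming gk-deploy created a service named heketi in the default namespace; adjust the name and namespace if yours differ):

kubectl get svc heketi -o jsonpath='{.spec.clusterIP}{"\n"}'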
With the StorageClass in place, charts that need persistent storage can be installed, for example:
helm install --name basic-database stable/postgresql
To set up the Traefik ingress controller: first, update the ConfigMap in traefik-configmap.yaml to include the domains you plan on managing via Traefik, as SSL certificates are obtained via Let's Encrypt.
Next, ensure that a persistent volume claim exists for Traefik to store its certificate information by applying the traefik-pvc.yaml:
kubectl apply -f traefik-pvc.yaml
Then apply the ConfigMap you updated previously:
kubectl apply -f traefik-configmap.yaml
Next, ensure that Traefik has the necessary roles assigned:
kubectl apply -f traefik-rbac.yaml
Create a deployment for the controller:
kubectl apply -f traefik-ingress-controller_deployment.yaml
Finally, create the service for the Traefik UI:
kubectl apply -f traefik-ui_service.yaml
This is both a joke and the reason that this project exists. For additional details, see whatcolorischristinashair. The deployment and service are managed via haircolor.yaml, and a Traefik ingress is managed via haircolor-ingress.yaml. Apply both:
kubectl apply -f haircolor.yaml
kubectl apply -f haircolor-ingress.yaml
For images that are built privately and then side-loaded into the cluster (rather than pulled from a registry), the following steps need to happen, in order: docker build the image, docker save it to an archive, copy the archive to each node, and docker load it on every node that may run the pod.
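A sketch of the full round trip, with the image name and node hostname as placeholders:

docker build -t myapp:latest .
docker save -o myapp.tar myapp:latest
scp myapp.tar root@node1.example.com:/tmp/
ssh root@node1.example.com docker load -i /tmp/myapp.tar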
To protect an ingress with HTTP basic auth, generate an htpasswd file and create a secret from it:
htpasswd -Bc ingressname.auth.secret username
kubectl create secret generic ingressname-basic-auth --from-file ingressname.auth.secret
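The ingress then needs to reference that secret. Assuming Traefik 1.x annotation keys (verify against the Traefik version in use), that can be done with:

kubectl annotate ingress ingressname ingress.kubernetes.io/auth-type=basic ingress.kubernetes.io/auth-secret=ingressname-basic-auth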
To locate the data backing a PersistentVolumeClaim on disk:
- Find the gluster volume path for the PVC:
  kubectl get pv $(kubectl get pvc pvcname -o jsonpath='{.spec.volumeName}{"\n"}') -o jsonpath='{.spec.glusterfs.path}{"\n"}'
- Use the heketi-cli API or the gluster command (via kubectl exec on a glusterfs pod) to determine the name of the brick where the data resides and which host that brick is on.
- Use lvdisplay to find the path of the LV where the brick is.
- Mount that LV somewhere, eg, /mount/mybrick.

Gluster volumes becoming unavailable is usually caused by an unplanned reboot of the physical hardware under the underlying host VM. Note that the gluster logs may only indicate some sort of networking problem (connection refused to the remote bricks making up the rest of the volume), but the actual problem is that the process that should be listening is not running on the remote hosts. Check whether the gluster processes are running on each node:
ps aux | grep gluster
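Running gluster volume status inside one of the glusterfs pods will also show which bricks are offline (the pod name here is a placeholder):

kubectl exec glusterfs-abcde -- gluster volume status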
Rebooting the nodes after deploying new certificates to a 1.10 cluster may result in the nodes losing their keys, requiring either re-generation of keys or rebuilding the cluster. Don't let this happen. See https://github.com/kubernetes/kubeadm/issues/581#issuecomment-421477139 for a partial resolution. Check the kubelet logs for errors like:
Jun 26 05:32:45 robot-ghost-poop-2 kubelet[24333]: E0626 05:32:45.781780 24333 bootstrap.go:179] Unable to read existing bootstrap client config: invalid configuration: [unable to read client-cert /var/lib/kubelet/pki/kubelet-client.crt for default-auth due to open /var/lib/kubelet/pki/kubelet-client.crt: no such file or directory, unable to read client-key /var/lib/kubelet/pki/kubelet-client.key for default-auth due to open /var/lib/kubelet/pki/kubelet-client.key: no such file or directory]
Jun 26 05:32:45 robot-ghost-poop-2 kubelet[24333]: F0626 05:32:45.821314 24333 server.go:233] failed to run Kubelet: cannot create certificate signing request: Unauthorized
If these errors appear, use kubeadm token create on the master node to create a token, then run kubeadm join on the affected node with that token and whatever flags are required to successfully join the cluster: likely --ignore-preflight-errors=all (to ignore the existing kubernetes pki ca.crt, kubelet.conf, and bootstrap-kubelet.conf, the lack of crictl to check the container runtime, and swap being enabled) and --discovery-token-unsafe-skip-ca-verification (because the cluster is no longer using the previous certificate authority).
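As a sketch, with the API server address and token as placeholders:

# on the master
kubeadm token create
# on the affected node, using the token printed above
kubeadm join 192.0.2.10:6443 --token <token> --ignore-preflight-errors=all --discovery-token-unsafe-skip-ca-verification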
If Heketi comes up in read-only mode, unmount the stale gluster mount on the affected node with fusermount -uz [path to mount], scale the Heketi deployment down with kubectl scale --replicas=0 deploy/heketi, wait a few minutes, scale it back up with kubectl scale --replicas=1 deploy/heketi, and then watch its logs to confirm Heketi started up in read/write mode. See https://github.com/gluster/gluster-kubernetes/issues/257 for some additional details.
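Following the logs can be done with the deployment name used in the scale commands above:

kubectl logs -f deploy/heketi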