This repository contains the files for my University Semester 3 personal project, which explores how Kubernetes can be used to scale AI services.

The repository is organized into the following folders:
### local

The `local` folder contains scripts, configurations, and resources needed to run AI workloads on your local machine.
### gcp

The `gcp` folder contains Terraform configurations and Kubernetes manifests for deploying AI workloads to Google Cloud Platform (GCP).
Each of these folders contains numbered subfolders containing different attempts. Each attempt has a `readme.md` file explaining the contents of that directory.
## Minikube

Minikube is local Kubernetes, focusing on making it easy to learn and develop for Kubernetes.
To run a Docker container in Minikube, follow these steps:
Set the CPU and memory for Minikube:

```shell
minikube config set cpus 8
minikube config set memory 16384
```
Start Minikube with GPU support and the Docker driver:

```shell
minikube start --driver=docker --gpus=all
```
Build the Docker image if it has not been built already:

```shell
docker build -t hf-gpu .
```
Load the Docker image into Minikube:

```shell
minikube image load hf-gpu
```
Mount a local directory containing the models into Minikube:

```shell
minikube mount C:\code\S3P\models\HF-Phi:/models
```

This is best done in a new terminal window, as the command must stay active for the duration of the mount.
> [!NOTE]
> Additional config is required in the Kubernetes `deployment.yaml` file to mount this to the pod.
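A minimal sketch of that extra configuration, assuming the container should see the models at `/models`; the container and volume names below are illustrative, not taken from this repository:

```yaml
# Fragment of deployment.yaml: expose the directory from `minikube mount` to the pod
spec:
  template:
    spec:
      containers:
        - name: hf-gpu                # illustrative container name
          image: hf-gpu
          volumeMounts:
            - name: model-volume      # illustrative volume name
              mountPath: /models      # path the application sees inside the container
      volumes:
        - name: model-volume
          hostPath:
            path: /models             # the node path created by `minikube mount`
            type: Directory
```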
Apply the Kubernetes deployment and service configuration:

```shell
kubectl apply -f deploy.yaml
kubectl apply -f service.yaml
```
Use Minikube to access the service:

```shell
minikube service ai-service
```
To access the logs of the running pod:

```shell
kubectl logs <pod-name>
```

Replace `<pod-name>` with the name of your pod. You can get the pod name by running:

```shell
kubectl get pods
```
When you are done, you can delete the Minikube cluster:

```shell
minikube delete
```
By following these steps, you can run your Docker container in Minikube with GPU support and access it through a Kubernetes service.
## Terraform

Terraform is an infrastructure-as-code tool that enables you to safely and predictably provision and manage infrastructure in any cloud.

Initialise Terraform:

```shell
terraform init
```
Configure Terraform:

Set up your Google Cloud credentials, e.g. with `gcloud auth application-default login`. Ensure the `project` and `region` variables in the `provider "google"` block are set correctly.

Once the cluster has been created, fetch its credentials so `kubectl` can talk to it:

```shell
gcloud container clusters get-credentials gke-hf-phi --region europe-west3
```
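As a sketch, a `provider "google"` block with those variables set might look like the following; the project ID is a placeholder, not the one used in this repository:

```hcl
provider "google" {
  project = "my-gcp-project"   # placeholder project ID
  region  = "europe-west3"     # region used elsewhere in this project
}
```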
Plan and apply the infrastructure:

```shell
terraform plan
terraform apply
```

The typical Terraform workflow, end to end:

```shell
terraform init      # initialise the working directory and download providers
terraform plan      # preview the changes that will be made
terraform apply     # create or update the infrastructure
terraform destroy   # tear everything down when you are finished
```
## kubectl

`kubectl` is a command-line tool used to communicate with Kubernetes clusters. It allows you to manage and inspect cluster resources, such as pods, services, deployments, and more.

To interact with your GKE cluster, you'll need to authenticate your `kubectl` client. This can be done using `gcloud` or by setting up service account authentication.
```shell
gcloud container clusters get-credentials gke-hf-phi --region europe-west3
```

This command configures your `kubectl` client with the correct credentials to interact with your cluster.

### kubectl Commands

```shell
# Inspect resources
kubectl get pods
kubectl get services
kubectl get deployments
kubectl get nodes

# Show detailed information about a resource
kubectl describe pod <pod-name>
kubectl describe service <service-name>
kubectl describe deployment <deployment-name>
kubectl describe node <node-name>

# Create and delete resources
kubectl create deployment my-deployment --image=my-image
kubectl delete pod <pod-name>
kubectl delete service <service-name>
kubectl delete deployment <deployment-name>

# Apply a manifest file
kubectl apply -f my-manifest.yaml
```
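If the deployment labels its pods, fetching a pod name and its logs can be combined into a single command; the `app=ai-service` label here is an assumption, not necessarily the label used in this repository, and the command requires a running cluster:

```shell
# Fetch logs from the first pod matching the (assumed) app=ai-service label
kubectl logs "$(kubectl get pods -l app=ai-service -o jsonpath='{.items[0].metadata.name}')"
```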
Useful `gcloud` commands for discovering available machine types and accelerators:

```shell
# List machine types available in a zone
gcloud compute machine-types list --zones=europe-west4-a

# List GPU accelerator types available in a zone
gcloud compute accelerator-types list --filter="zone:( europe-west4-b )"

# List TPU accelerator types available in a zone
gcloud compute tpus accelerator-types list --zone=europe-west4-a
```