This repository contains the files for my University Semester 3 personal project, which explores how Kubernetes can be used to scale AI services.

The repository is organized into the following folders:
### local

The `local` folder contains scripts, configurations, and resources needed to run AI workloads on your local machine.
### gcp

The `gcp` folder contains Terraform configurations and Kubernetes manifests for deploying AI workloads to Google Cloud Platform (GCP).
Each of these folders contains numbered subfolders containing different attempts. Each attempt has a `readme.md` file explaining the contents of that directory.
## Minikube

Minikube is local Kubernetes, focusing on making it easy to learn and develop for Kubernetes.
To run a Docker container in Minikube, follow these steps:
Set the CPU and memory for Minikube:

```shell
minikube config set cpus 8
minikube config set memory 16384
```
Start Minikube with GPU support and the Docker driver:

```shell
minikube start --driver=docker --gpus=all
```
Build the Docker image if it has not been built already:

```shell
docker build -t hf-gpu .
```
Load the Docker image into Minikube:

```shell
minikube image load hf-gpu
```
Mount a local directory containing the models into Minikube:

```shell
minikube mount C:\code\S3P\models\HF-Phi:/models
```

This is best done in a new terminal window, as the command must stay active for the duration of the mount.
> [!NOTE]
> Additional config is required in the Kubernetes `deployment.yaml` file to mount this to the pod.
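A minimal sketch of that extra configuration, assuming the container should see the models at `/models`; the container and volume names below are illustrative, not taken from this repository:

```yaml
# Fragment of deployment.yaml: expose the directory from `minikube mount` to the pod
spec:
  template:
    spec:
      containers:
        - name: hf-gpu                # illustrative container name
          image: hf-gpu
          volumeMounts:
            - name: model-volume      # illustrative volume name
              mountPath: /models      # path the application sees inside the container
      volumes:
        - name: model-volume
          hostPath:
            path: /models             # the node path created by `minikube mount`
            type: Directory
```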
Apply the Kubernetes deployment and service configuration:

```shell
kubectl apply -f deploy.yaml
kubectl apply -f service.yaml
```
Use Minikube to access the service:

```shell
minikube service ai-service
```
To access the logs of the running pod:

```shell
kubectl logs <pod-name>
```

Replace `<pod-name>` with the name of your pod. You can get the pod name by running:

```shell
kubectl get pods
```
When you are done, you can delete the Minikube cluster:

```shell
minikube delete
```
By following these steps, you can run your Docker container in Minikube with GPU support and access it through a Kubernetes service.
## Terraform

Terraform is an infrastructure-as-code tool that enables you to safely and predictably provision and manage infrastructure in any cloud.

Initialise Terraform:

```shell
terraform init
```
Configure Terraform:

Set up your Google Cloud credentials, e.g. with `gcloud auth application-default login`. Ensure the `project` and `region` variables in the `provider "google"` block are set correctly.

Once the cluster has been created, fetch its credentials so `kubectl` can talk to it:

```shell
gcloud container clusters get-credentials gke-hf-phi --region europe-west3
```
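As a sketch, a `provider "google"` block with those variables set might look like the following; the project ID is a placeholder, not the one used in this repository:

```hcl
provider "google" {
  project = "my-gcp-project"   # placeholder project ID
  region  = "europe-west3"     # region used elsewhere in this project
}
```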
Plan and apply the infrastructure:

```shell
terraform plan
terraform apply
```

The typical Terraform workflow, end to end:

```shell
terraform init      # initialise the working directory and download providers
terraform plan      # preview the changes that will be made
terraform apply     # create or update the infrastructure
terraform destroy   # tear everything down when you are finished
```
## kubectl

`kubectl` is a command-line tool used to communicate with Kubernetes clusters. It allows you to manage and inspect cluster resources, such as pods, services, deployments, and more.

To interact with your GKE cluster, you'll need to authenticate your `kubectl` client. This can be done using `gcloud` or by setting up service account authentication.
```shell
gcloud container clusters get-credentials gke-hf-phi --region europe-west3
```

This command configures your `kubectl` client with the correct credentials to interact with your cluster.

### kubectl Commands

```shell
# Inspect resources
kubectl get pods
kubectl get services
kubectl get deployments
kubectl get nodes

# Show detailed information about a resource
kubectl describe pod <pod-name>
kubectl describe service <service-name>
kubectl describe deployment <deployment-name>
kubectl describe node <node-name>

# Create and delete resources
kubectl create deployment my-deployment --image=my-image
kubectl delete pod <pod-name>
kubectl delete service <service-name>
kubectl delete deployment <deployment-name>

# Apply a manifest file
kubectl apply -f my-manifest.yaml
```
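If the deployment labels its pods, fetching a pod name and its logs can be combined into a single command; the `app=ai-service` label here is an assumption, not necessarily the label used in this repository, and the command requires a running cluster:

```shell
# Fetch logs from the first pod matching the (assumed) app=ai-service label
kubectl logs "$(kubectl get pods -l app=ai-service -o jsonpath='{.items[0].metadata.name}')"
```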
Useful `gcloud` commands for discovering available machine types and accelerators:

```shell
# List machine types available in a zone
gcloud compute machine-types list --zones=europe-west4-a

# List GPU accelerator types available in a zone
gcloud compute accelerator-types list --filter="zone:( europe-west4-b )"

# List TPU accelerator types available in a zone
gcloud compute tpus accelerator-types list --zone=europe-west4-a
```