Suggestion: use Kubernetes with GKE Autopilot instead of VMs to run book examples on a cloud GPU

This repo provides instructions for setting up a GCP VM instance with a GPU to run the examples. I'd suggest taking this a step further and using GKE Autopilot for GPU workloads instead of VMs. Some benefits:

GKE Autopilot's pay-per-use pricing keeps costs down: you pay for workloads while they run. Deploying is a single kubectl apply, and deleting the pod when it is idle is just as easy.
A LoadBalancer service exposes Jupyter Lab directly, so no port forwarding is needed.
GKE Autopilot handles node maintenance and upgrades, so there is no routine system upkeep.
Kubernetes is a scalable, industry-standard platform, so readers gain practical experience they would not get from a docker compose setup on a VM.

This is how I deployed the examples to GKE Autopilot:

Build and push the Docker image:

IMAGE=<your_image> # or skip this step and use bulankou/gdl2:20230715, which I built
docker build -f ./docker/Dockerfile.gpu -t "$IMAGE" .
docker push "$IMAGE"

Create a GKE Autopilot cluster with all default settings.
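
One way to create the cluster from the command line, as a minimal sketch. The cluster name gdl2 and the region us-central1 are placeholders I picked for illustration, not values from this repo; check that the region you choose offers T4 GPUs. The function wrapper is only there so nothing runs until you call it.

```shell
# Placeholder values (assumptions): pick your own name and a region with T4 GPUs.
CLUSTER_NAME="gdl2"
REGION="us-central1"

create_cluster() {
  # Create an Autopilot cluster with default settings.
  gcloud container clusters create-auto "$CLUSTER_NAME" --region "$REGION"
  # Fetch credentials so that kubectl talks to the new cluster.
  gcloud container clusters get-credentials "$CLUSTER_NAME" --region "$REGION"
}

# Once gcloud is installed and authenticated against your project:
# create_cluster
```

The same cluster can also be created from the Cloud Console by accepting the defaults.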
Apply the following K8s manifest (kubectl apply -f <yaml>), updating <IMAGE> first. Note the cloud.google.com/gke-accelerator: "nvidia-tesla-t4" node selector, which makes sure the pod lands on a node with a T4 GPU, and the autopilot.gke.io/host-port-assignment annotation, which enables host ports on Autopilot.

apiVersion: v1
kind: Pod
metadata:
  name: app
  annotations:
    autopilot.gke.io/host-port-assignment: '{"min":6006,"max":8888}'
  labels:
    service: app
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: "nvidia-tesla-t4"
  containers:
    - command: ["/bin/sh", "-c"]
      args: ["jupyter lab --ip 0.0.0.0 --port=8888 --no-browser --allow-root"]
      image: <IMAGE>
      name: app
      ports:
        - containerPort: 8888
          hostPort: 8888
        - containerPort: 6006
          hostPort: 6006
      resources:
        limits:
          nvidia.com/gpu: 1
        requests:
          cpu: "18"
          memory: "18Gi"
      tty: true
  restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: app
spec:
  type: LoadBalancer
  ports:
    - name: "8888"
      port: 8888
      targetPort: 8888
    - name: "6006"
      port: 6006
      targetPort: 6006
  selector:
    service: app
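
After the manifest is applied, the remaining steps might look like the sketch below: wait for the pod, look up the load balancer's external IP, read the Jupyter login URL from the pod log, and delete the pod when idle. The pod and service names (app) come from the manifest above; the helper function names are my own, and the token lookup assumes the image uses Jupyter's default token authentication.

```shell
# Resource names match the manifest (pod "app", service "app").
# The helper names are illustrative, not part of the repo.

wait_for_pod() {
  # Provisioning a GPU node on Autopilot can take several minutes.
  kubectl wait --for=condition=Ready pod/app --timeout=15m
}

external_ip() {
  # Prints the load balancer IP; empty until one has been assigned.
  kubectl get service app -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

jupyter_token_url() {
  # Jupyter prints its login URL (including the token) to the log on startup.
  kubectl logs app | grep -m 1 'token='
}

stop_pod() {
  # Deleting the pod releases the GPU. The Service and its load balancer
  # remain until removed separately with: kubectl delete service app
  kubectl delete pod app
}
```

Jupyter Lab should then be reachable at http://EXTERNAL_IP:8888, using the token from the log, with port 6006 exposed through the same address.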

davidADSP / Generative_Deep_Learning_2nd_Edition
