malaysia-ai / jupyter-gpu

Jupyter Notebook with GPU and Code Server!

Jupyter Notebook with GPU

Cloud environment

The current manifests are only applicable to Azure Kubernetes Service and AWS EKS.

Why Kubernetes? Spot auto respawn!

Why domain is Because currently Malaysia-AI sponsored by !

Server access

The server is protected by GitHub OAuth.

  1. Request access at
  2. Once approved by, or, will give access to the server.

Training server

At, this server is used to train models and preprocess datasets.

Currently we use

1 GPU Server


  1. 24 vCPU.
  2. 220 GB RAM.
  3. 1 A100 GPU 80GB VRAM.
  4. Spot based.

2 GPUs Server


  1. 48 vCPU.
  2. 440 GB RAM.
  3. 2 A100 GPUs 80GB VRAM.
  4. Spot based.

4 GPUs Server


  1. 96 vCPU.
  2. 880 GB RAM.
  3. 4 A100 GPUs 80GB VRAM.
  4. Spot based.

Serve server

At, this server is used to serve models through an API and a chatbot interface.

  1. 24 vCPU.
  2. 220 GB RAM.
  3. 1 A100 GPU 80GB VRAM.
  4. Spot based.

You want more than this?

You can! If you have a good idea, such as full-parameter fine-tuning of a multimodal vision + speech + text model, we can spawn more than one node with 4x A100s each. After that you can use Torch Distributed or a Ray cluster.
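As a rough sketch of what a multi-node launch with Torch Distributed could look like, the helper below builds one `torchrun` command per node. The node count, the `node-0` master address, port 29500, and `train.py` are illustrative assumptions, not values taken from this cluster.

```python
import shlex


def torchrun_commands(nnodes, gpus_per_node, master_addr, script,
                      master_port=29500):
    """Return one torchrun command line per node (node_rank 0..nnodes-1)."""
    commands = []
    for node_rank in range(nnodes):
        args = [
            "torchrun",
            f"--nnodes={nnodes}",
            f"--nproc_per_node={gpus_per_node}",
            f"--node_rank={node_rank}",
            f"--master_addr={master_addr}",
            f"--master_port={master_port}",
            script,
        ]
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # Hypothetical setup: 2 nodes with 4 A100s each, "node-0" as the master.
    for cmd in torchrun_commands(2, 4, "node-0", "train.py"):
        print(cmd)
```

Each printed line is meant to be run on the matching node; every node uses the same master address and port, and only `--node_rank` differs.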

You do not want a GPU, just a lot of CPU and RAM?

You can! We know that deduplication, distributed crawling, and other distributed jobs use a lot of CPU and RAM.

Auto restart script

Because the instances are spot based, they can be killed at any time (anywhere between 1 and 6 days), so we have to prepare a script that respawns automatically,

pm2 start "python3 /dir/"
pm2 save
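pm2 only restarts the process, so the script itself has to pick up where it left off. Below is a minimal sketch of that idea, assuming a hypothetical JSON checkpoint file and a placeholder work loop; a real training script would save model and optimizer state rather than a bare step counter.

```python
import json
import os


def load_step(path):
    """Return the last saved step, or 0 when no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["step"]


def save_step(path, step):
    """Write the checkpoint atomically so a kill mid-write cannot corrupt it."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step}, f)
    os.replace(tmp, path)  # atomic rename on the same filesystem


def run(path, total_steps, save_every=10):
    """Resume from the checkpoint and run until total_steps is reached."""
    step = load_step(path)
    while step < total_steps:
        step += 1  # placeholder for one unit of real work
        if step % save_every == 0 or step == total_steps:
            save_step(path, step)
    return step
```

Calling `run("checkpoint.json", total_steps=100)` after a respawn continues from the last saved step instead of starting from zero. The checkpoint should live on storage that survives the pod restart.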

Manual restart pod

Sometimes the GPU is not detected for some reason, so we have to force-restart the pod. To do that from inside the pod,

kill -15 1

This sends SIGTERM to PID 1, which kills Jupyter Notebook and forces Kubernetes to restart the pod.

About GPU not able to detect, can
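To decide whether a restart is needed, one option is to check from Python whether the driver can see any GPU. This is a sketch that shells out to `nvidia-smi -L`; it assumes the NVIDIA driver tools are installed in the pod, and the function name `list_gpus` is made up for illustration.

```python
import shutil
import subprocess


def list_gpus():
    """Return the GPU lines printed by `nvidia-smi -L`, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools not installed or not on PATH
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []  # nvidia-smi is present but cannot talk to the driver
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus:
        print(f"{len(gpus)} GPU(s) detected")
    else:
        print("No GPU detected; a pod restart may be needed")
```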

Jupyter proxy

If you run any web server inside the Jupyter server,

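from fastapi import FastAPI
import uvicorn

app = FastAPI()

# Register the handler on the root path; without a route decorator
# FastAPI never exposes the function.
@app.get('/')
async def get():
    return 'hello'

if __name__ == "__main__":
    # Top-level await works in a Jupyter cell, which already runs an event loop.
    config = uvicorn.Config(app)
    server = uvicorn.Server(config)
    await server.serve()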

You can access the webserver at{port}/
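Before going through the proxy, it can help to confirm that the server is actually listening inside the pod. A small sketch, assuming uvicorn's default bind of 127.0.0.1:8000, which applies when `uvicorn.Config` is given no host or port; `is_listening` is a hypothetical helper name.

```python
import socket


def is_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(is_listening(8000))
```

If this prints False, the proxy URL will not work either, so check the cell that started the server first.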

VS Code

Go to


How to create virtual environment

  1. Open a new terminal using Jupyter Terminal or VS Code.
  2. Run these commands,
sudo apt install python3.10-venv -y
python3 -m venv ~/my-env
~/my-env/bin/pip3 install wheel
~/my-env/bin/pip3 install ipykernel
~/my-env/bin/python3 -m ipykernel install --user --name=my-env
  3. Feel free to change my-env to any name.
  4. Go to Jupyter again, you should see your new virtual env in the kernel list.
  5. To install libraries, run this in a terminal or a Jupyter cell (prefix it with ! in a cell),
~/my-env/bin/pip3 install library
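To confirm that a notebook is really running on the new kernel, a quick check from a cell can help. This sketch uses only the standard library; `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv."""
    # In a venv, sys.prefix points at the env while sys.base_prefix
    # points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
```

With the my-env kernel selected, the interpreter path should point inside the my-env directory.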


  1. Respect each other, do not kill someone else's processes.
  2. Do not abuse the server for personal gain, e.g. mining something.