astronomy-commons / genesis-jupyterhub-automator

When you need to quickly deploy a JupyterHub instance for tutorials, workshops, classes, and more.
MIT License

GPU instance support on AWS #17

Open mjuric opened 4 years ago

mjuric commented 4 years ago

Make sure we add GPU instance support for AWS deployments. This is a tracker issue for the various pieces of this problem, based on our experience with the astroML demo prep.

Todo:

- [ ] Start the GPU nodes with a recommended AMI
- [ ] Patch the k8s deployment so EKS recognizes the GPU nodes (problem may have gone away by now)
- [ ] Deploy nvidia-device-plugin into the k8s cluster (helm chart)
- [ ] Start containers with the `NVIDIA_DRIVER_CAPABILITIES: "all"` environment variable
- [ ] Write a small script/utility to verify everything has been set up correctly and is working
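For the verification item, a minimal sketch of what such a check could look like, assuming `kubectl` is on the PATH and already pointed at the cluster. It relies on the NVIDIA device plugin advertising GPUs as the `nvidia.com/gpu` resource on each node; the function names `gpu_capacity` and `main` are made up for illustration and are not part of this repository.

```python
import json
import subprocess


def gpu_capacity(node_list: dict) -> dict:
    """Map each node name to its number of allocatable nvidia.com/gpu devices.

    `node_list` is the parsed output of `kubectl get nodes -o json`.
    Nodes that do not advertise the resource are reported with 0.
    """
    result = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        allocatable = node.get("status", {}).get("allocatable", {})
        result[name] = int(allocatable.get("nvidia.com/gpu", "0"))
    return result


def main() -> int:
    """Query the cluster and return 0 if at least one node exposes a GPU."""
    out = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    capacity = gpu_capacity(json.loads(out))
    for name, count in sorted(capacity.items()):
        print(f"{name}: {count} GPU(s)")
    return 0 if any(capacity.values()) else 1
```

Calling `sys.exit(main())` from a script would give a non-zero exit status when no node reports a GPU, which usually means the device plugin is not running or the nodes were not started from a GPU AMI. This only checks what the scheduler sees; it does not confirm that a container can actually use the device.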

Add anything that's missing.

mjuric commented 4 years ago

@bsipocz @stevenstetzler Please add any missing issues and/or solution information here so we don't forget them.

mjuric commented 4 years ago

GPU AMIs: https://docs.aws.amazon.com/eks/latest/userguide/gpu-ami.html

Nvidia device plugin: https://github.com/NVIDIA/k8s-device-plugin
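Once the device plugin is running, a pod asks for a GPU through the `nvidia.com/gpu` resource limit. The sketch below builds a throwaway test pod manifest that does this and sets `NVIDIA_DRIVER_CAPABILITIES` to `all`, as listed in the todo. The helper name `gpu_test_pod`, the pod name, and the choice of running `nvidia-smi` as the command are assumptions for illustration; the image is left as a parameter because no specific image is named in this issue.

```python
import json


def gpu_test_pod(name: str, image: str, gpus: int = 1) -> dict:
    """Build a one-shot pod manifest that requests `gpus` NVIDIA GPUs.

    `image` must be a CUDA-capable image chosen by the caller.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "gpu-check",
                    "image": image,
                    # Assumed smoke test: list the GPUs visible in the container.
                    "command": ["nvidia-smi"],
                    "env": [
                        {"name": "NVIDIA_DRIVER_CAPABILITIES", "value": "all"}
                    ],
                    # Extended resources such as GPUs are requested via limits.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


def to_json(manifest: dict) -> str:
    """Serialize the manifest; kubectl accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)
```

Writing the output of `to_json(gpu_test_pod("gpu-check", "<your-cuda-image>"))` to a file and passing it to `kubectl apply -f` should schedule the pod onto a GPU node if everything above is in place; a pod stuck in `Pending` would suggest that no node is advertising `nvidia.com/gpu`.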