spotty-cloud / spotty

Training deep learning models on AWS and GCP instances
https://spotty.cloud
MIT License
491 stars, 43 forks

pytorch 1.9 with GCP GPUs? #104

Closed turian closed 3 years ago

turian commented 3 years ago

How can I install pytorch 1.9 on GCP GPUs?

It appears the only machine image they provide is CUDA 11.0 (!): https://console.cloud.google.com/compute/images?project=hear2021-evaluation

But pytorch 1.9.0 binaries are built only against CUDA 11.1 and 10.2. CUDA 11.0 builds are available only up to pytorch 1.7.1.

1) Can I use CUDA 10.2 in the Docker image even though the host system has CUDA 11.0, or will that cause problems? I have seen conflicting advice on this.
2) I could try to build my own Google machine image. This seems very painful, though.
3) I could try to build pytorch 1.9 from source in my Docker image against CUDA 11.0. I haven’t found good Dockerfiles explaining how to do this.

Any other suggestions?

apls777 commented 3 years ago

@turian You don't need to have CUDA installed on the host OS at all; it should be part of the Docker image. So it doesn't matter which version is currently installed on the GCP image, because the Docker environment doesn't have access to it anyway.

Just build your Docker image based on pytorch/pytorch:1.9.0-cuda11.1-cudnn8-runtime or nvidia/cuda:11.1.1-cudnn8-runtime-ubuntu18.04, for example.
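To confirm which CUDA version is actually in play, one option is to run a small check inside the container. This is only a sketch and not part of Spotty: `describe_cuda_env` is a hypothetical helper name, and it assumes the script is run with the container's Python interpreter. It reports the CUDA version the installed pytorch build targets, which comes from the image rather than from the host.

```python
import importlib.util


def describe_cuda_env():
    """Summarize the pytorch build found in the current environment.

    Hypothetical helper for illustration: returns the pytorch version,
    the CUDA version that build was compiled against, and whether a GPU
    is usable from this process.
    """
    # Fall back to an "empty" report if pytorch is not installed here.
    if importlib.util.find_spec("torch") is None:
        return {"torch": None, "built_for_cuda": None, "gpu_available": False}

    import torch

    return {
        "torch": torch.__version__,
        # CUDA version of the pytorch build; None for CPU-only builds.
        "built_for_cuda": torch.version.cuda,
        # True only if a GPU is visible to this process.
        "gpu_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_cuda_env().items():
        print(f"{key}: {value}")
```

If `built_for_cuda` matches the base image tag but `gpu_available` is false, the likely place to look is how the GPU is exposed to the container rather than the CUDA version in the image.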

turian commented 3 years ago

@apls777 Interesting! I was not aware of that at all.