kubeflow / pytorch-operator

PyTorch on Kubernetes
Apache License 2.0
306 stars, 143 forks

PyTorch Docker image pytorch/pytorch:1.2-cuda10.0-cudnn7-runtime does not have CUDA, so the GPU cannot be used #245

Open MATRIX4284 opened 4 years ago

MATRIX4284 commented 4 years ago

The PyTorch Docker image pytorch/pytorch:1.0-cuda10.0-cudnn7-runtime used in the examples/mnist Dockerfile cannot use the GPU for the mnist.py example; it always reports cuda as False.

This is because CUDA is not installed properly in the image.

To check, open a shell in the worker pod:

kubectl exec -it pytorch-dist-mnist-gloo-worker-0 -n default -- /bin/bash

Inside the pod, running which cuda gives no results.
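A more direct check than which cuda is to ask PyTorch itself, since CUDA is not normally exposed as an executable with that name and the lookup may come back empty even on a working image. The following is a minimal sketch, not part of the repository: the helper name cuda_report is hypothetical, and it assumes a Python interpreter is available inside the pod. It only uses torch.cuda.is_available(), torch.cuda.device_count(), and torch.version.cuda.

```python
import importlib.util


def cuda_report():
    """Return a dict describing whether PyTorch can see a CUDA device.

    Hypothetical helper for illustration; it is not part of pytorch-operator.
    """
    # Avoid an ImportError if torch is missing from the image.
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "device_count": 0}

    import torch

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": available,
        "device_count": torch.cuda.device_count() if available else 0,
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If built_with_cuda shows a version but cuda_available is False, the image itself may be fine and the problem may lie in how the GPU is exposed to the container; that reading is an assumption and would need to be confirmed against the cluster setup.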

We have to use an appropriate Docker image that is capable of running the mnist example using the GPU.

MATRIX4284 commented 4 years ago

PR #248 resolves this.