informatics-lab / aml-jupyterhub

Code work and experiments to integrate Azure Machine Learning with JupyterHub.

Investigate how to use GPU on CI #36

Closed nbarlowATI closed 4 years ago

nbarlowATI commented 4 years ago

Deploying a GPU-enabled VM size (e.g. Standard_NC6) doesn't seem to be enough to use the GPU - it's not clear that the CUDA drivers are installed:

>>> import torch
>>> torch.cuda.is_available()
False

The documentation suggests that GPU support should work out-of-the-box though, and there may be some hints in: https://azure.microsoft.com/en-gb/blog/azure-machine-learning-service-now-supports-nvidia-s-rapids/ but this appears to use a compute cluster as the compute target.

To be investigated.
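One way to narrow this down (a hedged diagnostic sketch, not part of the repo) is to check whether the NVIDIA driver tooling is visible on the VM at all, independently of torch. This uses the presence of `nvidia-smi` on the PATH as a proxy for an installed driver, which is an assumption - a driver could in principle be present without the CLI.

```python
import shutil
import subprocess


def cuda_driver_visible():
    """Proxy check: is the NVIDIA driver CLI (nvidia-smi) on the PATH?"""
    return shutil.which("nvidia-smi") is not None


if cuda_driver_visible():
    # Print the driver/GPU summary so we can compare against the VM size.
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    print(result.stdout)
else:
    print("nvidia-smi not found - drivers are likely not installed on this CI")
```

If this reports no driver on a supposedly GPU-enabled CI, the problem is below torch, in the image or VM provisioning.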

tam203 commented 4 years ago

@nbarlowATI - I suggest you raise a support request through the Azure portal: https://portal.azure.com/#home -> Machine learning, choose the workspace, then select 'New support request'. This feels to me like a flaw/bug compared to what's advertised.

nbarlowATI commented 4 years ago

This was a false alarm caused by my mistake - checking in the AzureML Studio, it turns out the CI I was running on was a Standard_DS1, which had been created previously with the same name due to another bug (fixed in commit 0c63e1a132068bec4e6e10a0fd4102da3275d7e6). After deleting that CI and rerunning, I can confirm that

import torch
torch.cuda.is_available()

returns True when running on a "GPU" VM (which is a "Standard_NC6")
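For future reference, a slightly fuller check than `is_available()` can report which device torch actually sees. This is a hedged sketch (not from the repo) that degrades gracefully when torch is absent or CUDA is unavailable, so it can be run on any CI; the device-name comment is an assumption about the NC6 hardware.

```python
def describe_gpu():
    """Return a short string describing what torch can see on this machine."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "CUDA not available"
    # On a Standard_NC6 this is expected to be a Tesla K80 (assumption).
    return torch.cuda.get_device_name(0)


print(describe_gpu())
```

Running this on the recreated CI should print the GPU's name rather than "CUDA not available".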