UNM-CARC / QuickBytes

Development of short tutorials for UNM's Center for Advanced Research Computing

Update Install deep learning packages.md #114

Closed · ktagen-sudo closed this 2 years ago

ktagen-sudo commented 2 years ago

With the previous tutorial, PyTorch+GPU fails silently: a simple check such as `torch.cuda.is_available()` reports the GPU, and it will even "allocate" a tensor on the device with a command like `y = torch.tensor([1, 4, 9]).to(device)`. However, more advanced commands such as `.forward()` or matrix operations raise an error similar to `RuntimeError: CUDA error: no kernel image is available for execution on the device`. The pip command in this PR fixes the issue; it installed PyTorch into `/users/kfotso/.conda/envs/compat_gpu/lib/python3.7/site-packages/`.
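For reference, a minimal sketch (not from the original PR) of how the silent failure shows up: the device queries and the tensor allocation succeed, but the first operation that actually launches a CUDA kernel, here a matrix multiply, fails on a GPU the binary was not compiled for.

```python
# Sketch of the silent-failure pattern, assuming a CUDA build of PyTorch
# whose compiled kernels do not match this GPU's compute capability.
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(torch.cuda.is_available())          # True -- misleadingly reassuring
y = torch.tensor([1, 4, 9]).to(device)    # allocation also succeeds

# A matrix multiply actually launches a CUDA kernel; on an unsupported
# GPU this is where "no kernel image is available for execution on the
# device" is raised.
a = torch.rand(4, 4, device=device)
b = torch.rand(4, 4, device=device)
print(a @ b)
```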

More info here: https://blog.nelsonliu.me/2020/10/13/newer-pytorch-binaries-for-older-gpus/ and here: https://github.com/pytorch/pytorch/issues/30532. The package can be found here: https://github.com/nelson-liu/pytorch-manylinux-binaries/releases; I just downloaded the latest version from there.
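A quick way to confirm whether an installed binary actually ships kernels for your GPU is to compare the card's compute capability against the architectures compiled into the build. This is a diagnostic sketch, assuming a PyTorch release recent enough to provide `torch.cuda.get_arch_list()`:

```python
# Check whether this PyTorch build includes kernels for the local GPU.
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"GPU compute capability: sm_{major}{minor}")
print("Kernel architectures in this build:", torch.cuda.get_arch_list())
# If sm_{major}{minor} is missing from the arch list, kernel launches
# will fail with "no kernel image is available for execution on the
# device", even though is_available() returns True.
```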

Below is the proof:

```python
>>> import torch
>>> from torch import nn, tensor
>>> from torch.cuda import device_count
>>> x = torch.rand(5, 3)
>>> print(x)
tensor([[0.2815, 0.4555, 0.8501],
        [0.4315, 0.8502, 0.7026],
        [0.0792, 0.3319, 0.0031],
        [0.3803, 0.8361, 0.8805],
        [0.8667, 0.7345, 0.2963]])
>>> print("available:", torch.cuda.is_available(), " device_count:",
...       torch.cuda.device_count(), " current_device:",
...       torch.cuda.current_device())
available: True device_count: 1 current_device: 0
>>> device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
>>> x = torch.tensor([1, 2, 3], device=device)
>>> y = torch.tensor([1, 4, 9]).to(device)
>>> print(x, y)
tensor([1, 2, 3], device='cuda:0') tensor([1, 4, 9], device='cuda:0')
>>> print(x + y)
tensor([ 2,  6, 12], device='cuda:0')
```

ktagen-sudo commented 2 years ago

You are very welcome! :)