Bismuth-Consultancy-BV / MLOPs

Machine Learning Toolset for Houdini
BSD 3-Clause "New" or "Revised" License

Unable to use MLOPs with CPU and "Torch not compiled with CUDA enabled" #35

Closed. usama-ghufran closed this issue 1 year ago

usama-ghufran commented 1 year ago

TL;DR: I cannot use MLOPs with either CUDA or CPU.

I never had PyTorch installed, but I keep getting CUDA errors:

AssertionError: Torch not compiled with CUDA enabled

I've removed all my Anaconda installations and installed the latest Houdini 19.5.569 and CUDA 12.1.

I reinstalled the dependencies, yet in the Houdini Python Shell:

>>> import torch
>>> torch.cuda.is_available()
False
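A `False` here does not by itself say whether torch is a CPU-only build or a CUDA build that cannot reach a GPU. The sketch below tells the two cases apart with `torch.version.cuda`, which is `None` when torch was not built with CUDA. The helper name `describe_torch_build` is made up for illustration and is not part of MLOPs or torch.

```python
from typing import Optional


def describe_torch_build(version: str, built_cuda: Optional[str], cuda_available: bool) -> str:
    """Summarize what kind of torch build is installed (hypothetical helper)."""
    if built_cuda is None:
        # torch.version.cuda is None when the build has no CUDA support.
        return f"torch {version}: CPU-only build, so no GPU can be used until torch is replaced"
    if not cuda_available:
        return f"torch {version}: built for CUDA {built_cuda}, but no usable GPU or driver was found"
    return f"torch {version}: built for CUDA {built_cuda}, GPU is usable"


try:
    import torch
except ImportError:
    print("torch is not importable from this Python")
else:
    print(describe_torch_build(torch.__version__, torch.version.cuda, torch.cuda.is_available()))
```

Run from the Houdini Python Shell, this reports on the torch that Houdini actually imports, which may differ from any system-wide install.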

So I decided to just work in CPU mode, but I get:

houdini19.5/scripts/python\torch\nn\modules\linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

Which Stack Overflow tells me is because:

The error was throwing because the data type of operands was float16. Changing it back to float32 solved the problem. I guess float16 is for GPU implementation only.

So MLOPs is expecting a GPU solver, not a CPU one, and I cannot use a GPU because CUDA cannot be found.
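As a general illustration of the float16 point, and not MLOPs' actual code, a script can choose the data type from the device so that half precision is only requested on a GPU. The helper name `pick_device_and_dtype` is an assumption made for this sketch.

```python
from typing import Tuple


def pick_device_and_dtype(cuda_available: bool) -> Tuple[str, str]:
    """Return (device, dtype name): float16 on GPU, float32 on CPU (hypothetical helper)."""
    if cuda_available:
        return "cuda", "float16"
    # The traceback above shows a float16 linear layer failing on CPU,
    # so fall back to float32 there.
    return "cpu", "float32"


try:
    import torch
except ImportError:
    torch = None

if torch is not None:
    device, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
    dtype = getattr(torch, dtype_name)
    # A tiny linear layer, the same kind of op that failed in the traceback.
    layer = torch.nn.Linear(4, 2).to(device=device, dtype=dtype)
    x = torch.ones(1, 4, device=device, dtype=dtype)
    print(device, layer(x).dtype)
```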

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:36:15_Pacific_Daylight_Time_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

usama-ghufran commented 1 year ago

Okay, so I installed CUDA 11.7 and removed 12.1:

>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_19:00:59_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0

Reinstalled the dependencies and restarted Houdini.

Still no luck :(

usama-ghufran commented 1 year ago

After installing the correct version of CUDA (11.7), I wasn't actually uninstalling torch. I was just deleting its folders in the Houdini/scripts/python folder.

Installing torch again didn't give me a CUDA-enabled build, as a CPU version of torch was still present.
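The post does not say which files were left behind. One way to check whether Python still considers torch installed after its folders have been deleted is to query the package metadata, as in this minimal sketch. The helper name `installed_version` is hypothetical, and the idea that leftover metadata caused the problem is an assumption, not something the thread confirms.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the version recorded in package metadata, or None if not registered."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Run inside the Houdini Python Shell so the same search path is inspected.
print("torch metadata:", installed_version("torch"))
```

If this prints a version even though the torch folder was removed by hand, an installer may treat the requirement as already satisfied.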

I uninstalled torch and installed again and it works!
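The exact commands are not given in the thread. A rough sketch of an uninstall followed by a CUDA 11.7 reinstall could look like the following. It only prints the commands instead of running them. It assumes that pip is available to the interpreter in use and that the PyTorch wheel index for cu117 is the desired source; both are assumptions, and `reinstall_commands` is a made-up helper name.

```python
import sys
from typing import List

# PyTorch's wheel index for CUDA 11.7 builds (assumed to match the setup above).
CUDA_INDEX_URL = "https://download.pytorch.org/whl/cu117"


def reinstall_commands(python_exe: str, index_url: str = CUDA_INDEX_URL) -> List[List[str]]:
    """Build the pip commands for a clean torch reinstall (hypothetical helper)."""
    # Uninstall first so the CPU build is removed rather than left in place.
    uninstall = [python_exe, "-m", "pip", "uninstall", "-y", "torch"]
    install = [python_exe, "-m", "pip", "install", "torch", "--index-url", index_url]
    return [uninstall, install]


# Dry run: show the commands for the current interpreter without executing them.
for cmd in reinstall_commands(sys.executable):
    print(" ".join(cmd))
```

Whether `sys.executable` points at the interpreter Houdini uses depends on where the script is run, so the printed commands should be checked before they are used.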