Is pytorch installed correctly, and did you do `pip install -r requirements.txt` and `pip install -r requirements_demo.txt`?

I would recommend creating a fresh venv, installing the requirements there using the above pip commands, and then retrying `gradio_app`.
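If it helps, here is a small check to run inside the fresh venv after the installs, to confirm the key packages actually landed in that environment (a minimal sketch; the package names are just examples taken from later in this thread, not the full requirements list):

```python
# Minimal sketch: confirm key packages are installed in the *active* venv.
# Package names (torch, onnxruntime-gpu, ninja) are examples drawn from this
# thread, not the full requirements list.
import sys
from importlib.metadata import version, PackageNotFoundError

print("python:", sys.executable)   # should point inside the venv
for pkg in ("torch", "onnxruntime-gpu", "ninja"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "NOT installed in this environment")
```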
I did both pip installs. PyTorch 2.2.0+cu121 is installed.
I'll try a virtual environment, see if that fixes it.
I set up a venv and followed the instructions. That got my PyTorch up to 2.4.0+cu124 (I don't think I'd tried to update PyTorch for this project, so it's possible there was a mismatch between PyTorch and CUDA).

After that I got an `OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.`

Indeed, my `CUDA_HOME` seems to be empty, both inside the venv and on my system. This should mean I'm past the ninja error and into new territory. I'm seeing posts that it should be something like `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4`, but I don't have an `NVIDIA GPU Computing Toolkit` folder. I've been checking what version of CUDA I have by using the `nvidia-smi` command. It might be worth noting that `nvcc --version` doesn't work on my system.

Any ideas on how to find my actual CUDA path, so I can manually set it?
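For reference, a minimal sketch (assuming a working `torch` install) that prints what the environment variables and PyTorch's extension loader resolve for the CUDA toolkit; `None` for the last value is essentially what the `OSError` above is complaining about:

```python
# Minimal diagnostic sketch: print what the environment and PyTorch see for CUDA.
import os

print("CUDA_HOME env var:", os.environ.get("CUDA_HOME"))
print("CUDA_PATH env var:", os.environ.get("CUDA_PATH"))

# torch.utils.cpp_extension.CUDA_HOME is the path the JIT extension builder
# actually uses; None means it could not locate a CUDA toolkit.
from torch.utils import cpp_extension
print("cpp_extension CUDA_HOME:", cpp_extension.CUDA_HOME)
```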
Hmm. `CUDA_HOME` should have been automatically set for you when you installed the CUDA toolkit. Maybe try reinstalling CUDA?
Reinstalled CUDA 12.4 (to make sure it'd be consistent with everything that was already installed and working), and I've got mixed results.
On my system (outside of the venv) I now have a `CUDA_PATH` that's populated (it seems to be used interchangeably with `CUDA_HOME`). `nvcc --version` now works and says it's CUDA 12.4. Trying to run `python gradio_app.py`, I get a new runtime error:

`RuntimeError: D:\a\_work\1\s\onnxruntime\python\onnxruntime_pybind_state.cc:891 onnxruntime::python::CreateExecutionProviderInstance CUDA_PATH is set but CUDA wasnt able to be loaded. Please install the correct version of CUDA and cuDNN as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.`
I have onnxruntime 1.18.1, and the table at the URL says I should be using CUDA 12.x and cuDNN 9.x - which I am. Trying `where cudnn*` returns that it couldn't find it. I tried re-installing cuDNN and opening a new terminal, but still nothing.
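As a side check, the installed onnxruntime build can be inspected from Python (a minimal sketch, assuming the `onnxruntime-gpu` wheel is what got installed; note that the CUDA/cuDNN DLLs are only actually loaded when an `InferenceSession` is created with `CUDAExecutionProvider`, which is where the error above fires):

```python
# Minimal sketch: inspect what the installed onnxruntime build reports.
# get_available_providers() only lists what the build was compiled with;
# the DLL load itself happens when a session requests CUDAExecutionProvider.
import onnxruntime as ort

print("onnxruntime version:", ort.__version__)
print("device:", ort.get_device())                      # "GPU" or "CPU"
print("available providers:", ort.get_available_providers())
```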
Inside the venv, both `CUDA_PATH` and `CUDA_HOME` are empty. I tried deleting the venv and creating a new one, but they're still empty. The error there is back to `Ninja is required to load C++ extensions`.

I feel like I'm going insane.
I tried uninstalling onnx, onnxruntime, and onnxruntime-gpu (since they seemed to be having issues finding the path), then re-installing everything with `pip install -r requirements.txt` from my system, and now I'm back to `RuntimeError: Ninja is required to load C++ extensions`.
It's a bit of a weird issue, unfortunately. Typically, installing PyTorch itself should have automatically installed ninja for you.

When you install ninja, `ninja --version` should work. If it doesn't, find `ninja` in `C:\Users\Username\AppData\Roaming\Python\Python310\Scripts` and add that folder to `PATH`, then try again. If you can't find it in there, it means `pip install Ninja` is installing Ninja in a different directory altogether. The only solution I can think of in that case is to manually download the Ninja exe and add that to your PATH: https://github.com/ninja-build/ninja/releases
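For what it's worth, this is roughly the check PyTorch performs: it looks for the `ninja` executable on PATH rather than the Python package (a minimal sketch; `is_ninja_available` assumes a reasonably recent torch):

```python
# Minimal sketch: check whether the ninja *executable* (not the Python module)
# is reachable, which is what torch's C++ extension loader actually requires.
import shutil
import subprocess

ninja_path = shutil.which("ninja")
print("ninja executable on PATH:", ninja_path)

if ninja_path:
    out = subprocess.run(["ninja", "--version"], capture_output=True, text=True)
    print("ninja --version:", out.stdout.strip())

# PyTorch's own check, for comparison (returns False if the executable is
# missing even when `import ninja` works).
from torch.utils.cpp_extension import is_ninja_available
print("torch sees ninja:", is_ninja_available())
```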
OK, that fixed it! When I was checking if ninja was installed, I was doing it from inside the Python interactive console, importing it and checking the version. It never occurred to me that it might be trying to run it as its own separate app via the command line.
Just to be clear: if you get an `OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.` and your `CUDA_HOME` variable is set (you can check with `echo %CUDA_HOME%` on Windows), then you can try re-installing PyTorch to fix it. Pip install the version that matches the CUDA version you have installed. You can check your CUDA version with `nvcc --version`.
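And a short way to confirm the PyTorch build matches your local CUDA setup (a minimal sketch; the `+cu124`-style suffix is how the pip wheels encode their CUDA build):

```python
# Minimal sketch: confirm the installed PyTorch wheel matches the local CUDA setup.
# The version suffix (e.g. 2.4.0+cu124) should line up with what `nvcc --version`
# reports; torch.cuda.is_available() also depends on a compatible NVIDIA driver.
import torch

print("torch:", torch.__version__)                  # e.g. 2.4.0+cu124
print("built against CUDA:", torch.version.cuda)    # e.g. 12.4
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```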
I've got Ninja installed (via `pip install Ninja`) and I can import it in Python, but I'm still getting an error that Ninja is required. There are two onnxruntime errors before that which I'm stumped by.

onnxruntime 1.18.1, Ninja 1.11.1.1, CUDA 12.4, cuDNN 9.3

Running on an RTX 3070, which should be new enough that there are no issues.