Closed: iamlll closed this issue 2 months ago
I had similar problems before; the following procedure fixed them.
First, run nvidia-smi to find the highest CUDA version supported by your GPU and current GPU driver. It will display something like:
| NVIDIA-SMI 532.09 Driver Version: 532.09 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce GTX 1660 Ti WDDM | 00000000:01:00.0 On | N/A |
| N/A 57C P8 9W / N/A| 350MiB / 6144MiB | 3% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
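If you want to pull that number into a script rather than read it off the table, one option is to parse the nvidia-smi header line. A minimal sketch, assuming the layout shown above (the parse_smi_cuda_version helper is my own naming, and nvidia-smi's exact layout can vary between driver versions):

```python
import re

def parse_smi_cuda_version(header_line: str) -> str:
    """Extract the 'CUDA Version' field from an nvidia-smi header line.

    Illustrative helper only; raises if the field is missing.
    """
    match = re.search(r"CUDA Version:\s*([\d.]+)", header_line)
    if match is None:
        raise ValueError("no CUDA version found in line")
    return match.group(1)

# Sample header line from the nvidia-smi output above:
line = "| NVIDIA-SMI 532.09 Driver Version: 532.09 CUDA Version: 12.1 |"
print(parse_smi_cuda_version(line))  # -> 12.1
```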
For newer GPUs with up-to-date drivers this is usually not a problem, but it can be an issue for older GPUs that do not support new CUDA versions.
Next, check the CUDA packages in your conda environment with conda list. You should see something like:
cuda-cccl 12.4.127 0 nvidia
cuda-cudart 11.8.89 0 nvidia
cuda-cudart-dev 11.8.89 0 nvidia
cuda-cupti 11.8.87 0 nvidia
cuda-libraries 11.8.0 0 nvidia
cuda-libraries-dev 11.8.0 0 nvidia
cuda-nvrtc 11.8.89 0 nvidia
cuda-nvrtc-dev 11.8.89 0 nvidia
cuda-nvtx 11.8.86 0 nvidia
cuda-profiler-api 12.4.127 0 nvidia
cuda-runtime 11.8.0 0 nvidia
pytorch 2.2.2 py3.8_cuda11.8_cudnn8_0 pytorch
pytorch-cuda 11.8 h24eeafa_5 pytorch
The CUDA package versions and the pytorch-cuda version should not be higher than the CUDA version reported by nvidia-smi.
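One pitfall when checking this rule in a script is comparing version strings lexically ("9.2" sorts after "11.8" as a string). A small sketch of a numeric comparison, using the versions from the listings above as example inputs:

```python
def cuda_version_tuple(version: str) -> tuple:
    """Turn a CUDA version string like '11.8' into a numerically comparable tuple."""
    return tuple(int(part) for part in version.split("."))

driver_cuda = "12.1"   # from nvidia-smi above
pytorch_cuda = "11.8"  # from conda list above

# The pytorch-cuda version must not exceed the driver's CUDA version:
assert cuda_version_tuple(pytorch_cuda) <= cuda_version_tuple(driver_cuda)

# Note the string comparison would get older versions wrong:
print("9.2" > "11.8")                                        # lexically True
print(cuda_version_tuple("9.2") > cuda_version_tuple("11.8"))  # numerically False
```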
Then check the installed CUDA toolkit version with nvcc --version. You should get output like the one below, showing the CUDA compiler driver version, which should match the pytorch-cuda version from the previous step. I've seen people report problems when the nvcc and pytorch-cuda versions don't match:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
If you instead get
'nvcc' is not recognized as an internal or external command, operable program or batch file.
then manually install/reinstall the right version of the CUDA toolkit from the CUDA archive (https://developer.nvidia.com/cuda-toolkit-archive), which includes the NVIDIA CUDA Compiler (nvcc).
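To compare the nvcc release against pytorch-cuda automatically, you can extract the release number from the nvcc --version text. A sketch under the assumption that nvcc prints a "release X.Y" line as in the output above (nvcc_release is my own helper name):

```python
import re
import shutil
import subprocess

def nvcc_release(text: str) -> str:
    """Extract the release number from `nvcc --version` output."""
    match = re.search(r"release\s+([\d.]+)", text)
    if match is None:
        raise ValueError("no release number found in nvcc output")
    return match.group(1)

# Sample line from the nvcc --version output quoted above:
sample = "Cuda compilation tools, release 11.8, V11.8.89"
print(nvcc_release(sample))  # -> 11.8

# On a machine where nvcc is on PATH, run it directly:
if shutil.which("nvcc"):
    out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    print(nvcc_release(out.stdout))
```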
Once all the steps above check out, try the check method from the PyTorch website:
import torch
torch.cuda.is_available()
or
python -c "import torch; print(torch.cuda.is_available())"
If it returns True, cellpose can recognize the GPU, and you can check the "use GPU" box in the software. In my experience, the "use GPU" checkbox stays gray and unusable whenever torch.cuda.is_available() returns False.
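In your own scripts it can be convenient to wrap this check so the code falls back to the CPU instead of crashing when torch is missing or CUDA is unavailable. A minimal sketch (pick_device is my own helper name, not a cellpose or torch API):

```python
def pick_device() -> str:
    """Return 'cuda' when torch sees a working GPU, else 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"  # torch not installed in this environment
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(device)
```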
When you run an analysis with cellpose, you should then see GPU output like this:
2024-04-16 17:06:43,472 [INFO] ** TORCH CUDA version installed and working. **
2024-04-16 17:06:43,472 [INFO] >>>> using GPU
Good luck!
Thank you so much, this is so helpful and finally fixed my problem! I'll outline what I did in case it may be helpful to other users:
1. nvidia-smi output an error message saying it could not establish communication with the driver I had been using, so I uninstalled everything related to NVIDIA and CUDA with
sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get remove --purge '^libnvidia-.*'
sudo apt-get remove --purge '^cuda-.*'
After that, nvidia-smi worked again.
2. Used environment.yml to install pytorch. In previous attempts conda had trouble solving my environment when I installed other packages after cellpose, so I added all the other packages I knew I needed to the same .yml file before creating the conda environment.
3. Added /usr/local/cuda-11.8/lib64 to the LD_LIBRARY_PATH variable and /usr/local/cuda-11.8/bin to the PATH variable.
4. conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia for GPU torch did not work for me (torch would not import properly, probably because some of the CUDA packages installed via that command were CUDA version 12 instead of 11), so I had to use the pip installation: pip3 install torch --index-url https://download.pytorch.org/whl/cu118
After that, torch.cuda.is_available() == True!
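The checks from this thread (driver, compiler, torch) can be bundled into one small diagnostic script. A sketch only: the gpu_diagnostics name and the report keys are my own choices, and each command is skipped gracefully when it is not on PATH:

```python
import shutil
import subprocess

def gpu_diagnostics() -> dict:
    """Run nvidia-smi, nvcc --version, and the torch CUDA check, collecting
    the first line of each tool's output (or a note when unavailable)."""
    report = {}
    for tool, args in (("nvidia-smi", []), ("nvcc", ["--version"])):
        if shutil.which(tool) is None:
            report[tool] = "not found on PATH"
            continue
        out = subprocess.run([tool] + args, capture_output=True, text=True)
        lines = (out.stdout or out.stderr).strip().splitlines()
        report[tool] = lines[0] if lines else "no output"
    try:
        import torch
        report["torch.cuda.is_available"] = torch.cuda.is_available()
    except ImportError:
        report["torch.cuda.is_available"] = "torch not installed"
    return report

for key, value in gpu_diagnostics().items():
    print(f"{key}: {value}")
```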
Install problem
I have tried installing cellpose and torch (GPU version) in several conda environments (deleting the original environment before trying again) without issue, following the README installation instructions. However, when I try running cellpose in a script, it cannot detect / interface with my computer's GPU. Here is the setup of my latest attempt:
First, create an environment with
conda env create -f environment.yml
Next, uninstall torch: pip uninstall torch
Install torch with pip3 install torch --index-url https://download.pytorch.org/whl/cu118
(I've also tried using conda commands to install pytorch and installing older versions of CUDA. The end result is always the same, i.e. that my torch version is not installed properly.)
Environment info
pkglist.txt
I am using an Nvidia GeForce GTX Titan X. Package list when using the command conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia to install pytorch instead of torch: pkglist_pytorch.txt
Run log
Here's the test code I was using to check whether cellpose could access my GPU:
This code snippet returns the following output: