I am installing nnUNet in a Docker container with CUDA driver 11.4. I first install the most recent torch compatible with the driver (torch 2.0.1+cu118; a CUDA 11.8 build should be compatible with an 11.4 driver), and it runs fine.
When I install nnUNet, it removes torch 2.0.1, installs torch 2.4.1 in its place, and creates incompatibilities with other libraries:
Attempting uninstall: triton
Found existing installation: triton 2.0.0
Uninstalling triton-2.0.0:
Successfully uninstalled triton-2.0.0
Attempting uninstall: torch
Found existing installation: torch 2.0.1+cu118
Uninstalling torch-2.0.1+cu118:
Successfully uninstalled torch-2.0.1+cu118
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchaudio 2.0.2+cu118 requires torch==2.0.1, but you have torch 2.4.1 which is incompatible.
torchdata 0.6.1 requires torch==2.0.1, but you have torch 2.4.1 which is incompatible.
torchtext 0.15.2+cpu requires torch==2.0.1, but you have torch 2.4.1 which is incompatible.
torchvision 0.15.2+cu118 requires torch==2.0.1, but you have torch 2.4.1 which is incompatible.
Successfully installed argparse-1.4.0 nnunetv2-2.5.1 torch-2.4.1 triton-3.0.0
And when I train my model I get the following error:
File "/opt/conda/lib/python3.9/site-packages/torch/cuda/__init__.py", line 314, in _lazy_init
torch._C._cuda_init()
RuntimeError: The NVIDIA driver on your system is too old (found version 11040). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.
The same happens if I install an older version of torch (compatible with CUDA 11.3).
Please help!
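
For reference, here is how I read the version number in the error. This is only a minimal sketch: the helper names `decode_cuda_version` and `build_exceeds_driver` are hypothetical, and the "same major version" rule is my simplified assumption about CUDA minor-version compatibility, not an official check. CUDA encodes versions as 1000*major + 10*minor, so 11040 is 11.4. The build string would come from `torch.version.cuda`; I am assuming the torch 2.4.1 wheel that pip pulled in is a CUDA 12.x build.

```python
def decode_cuda_version(encoded: int) -> tuple[int, int]:
    """Decode CUDA's integer version (1000*major + 10*minor), e.g. 11040."""
    major, rest = divmod(encoded, 1000)
    return major, rest // 10


def build_exceeds_driver(build_cuda: str, driver_encoded: int) -> bool:
    """Return True if the torch build targets a newer CUDA major than the driver.

    Simplified assumption: builds within the same CUDA major version as the
    driver are treated as usable; a newer major version is not.
    """
    build_major = int(build_cuda.split(".")[0])
    driver_major, _ = decode_cuda_version(driver_encoded)
    return build_major > driver_major


# build_cuda would normally be read from torch.version.cuda
print(decode_cuda_version(11040))
print(build_exceeds_driver("11.8", 11040))  # torch 2.0.1+cu118
print(build_exceeds_driver("12.1", 11040))  # assumed build of torch 2.4.1
```

If that reading is right, the cu118 build stays within the driver's major version while the replacement build does not, which would match the error appearing only after nnUNet swaps torch.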