MIC-DKFZ / nnUNet

Apache License 2.0

Installing nnUNet overwrites the current torch version, which causes compatibility issues with torchaudio/torchvision/cuda #2386

Open richard-ch opened 3 months ago

richard-ch commented 3 months ago

Hi,

Apologies if this is a repeated/common question.

I am trying to install nnUNet in my environment over SSH on my university's HPC. This means I cannot upgrade CUDA or the NVIDIA drivers to versions compatible with the latest PyTorch, so I have to install older versions.

Currently, the HPC has CUDA 11.2 loaded, which is the latest CUDA version available there (the others are older, e.g. 10.2 and 10.0). I followed the installation instructions on the PyTorch website and ran the following command:

pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html

After installing, I tested the installation with python -c 'import torch;print(torch.cuda.is_available())', which returned True.

Then I cloned the repo following the installation instructions, changed into the nnUNet directory, and used pip install -e . to install it. During installation, torch-1.9.1+cu111 was uninstalled and torch-2.3.1 was installed in its place. Attached is a screenshot of the terminal after installation.

[Screenshot: terminal output after installation]

Running python -c 'import torch;print(torch.cuda.is_available())' now returns the following error.

[Screenshot: error message]

I can see that the dependencies in pyproject.toml specify "torch>=2.1.2", which is probably why my torch version was removed and updated.
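To make that reasoning concrete, here is a rough Python sketch of the kind of check that leads to the replacement. It is deliberately simplified and is not pip's actual resolver, which follows the full PEP 440 rules. The helper names release_tuple and meets_minimum are hypothetical, and only plain numeric versions are handled.

```python
def release_tuple(version: str) -> tuple:
    """Turn '1.9.1+cu111' into (1, 9, 1), ignoring the local '+cu111' label."""
    public = version.split("+", 1)[0]
    return tuple(int(part) for part in public.split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Simplified '>=' check on plain numeric release versions only."""
    return release_tuple(installed) >= release_tuple(minimum)


# The version installed by hand does not satisfy "torch>=2.1.2" ...
print(meets_minimum("1.9.1+cu111", "2.1.2"))  # False
# ... while the version that was installed instead does.
print(meets_minimum("2.3.1", "2.1.2"))        # True
```

Because the installed 1.9.1 build fails the declared minimum, the installer picks a newer release that passes it.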

Would nnUNet still work normally if I removed that line, ran pip install -e . again, and used torch==1.9.1 (with the corresponding torchvision/torchaudio)? Or is there another way to build the new nnUNet without updating the torch/CUDA/NVIDIA driver versions?

Here is the output from nvcc -V (with CUDA 11.2 loaded):

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

Here is also the output of nvidia-smi:

[Screenshot: nvidia-smi output]

Thank you for your time and I am happy to provide further information.

Cheers,

ancestor-mithril commented 3 months ago

nnUNet uses several features from torch>=2.0.0, such as torch.compile. If you remove all lines containing torch.compile and torch._dynamo, nnUNet should work with older versions of pytorch.
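As an alternative to deleting lines, the same idea can be expressed as a guard that only compiles when the feature exists. The following is a minimal sketch under stated assumptions, not how nnUNet is actually structured: maybe_compile is a hypothetical helper, and the stand-in objects exist only so the example runs without PyTorch installed.

```python
from types import SimpleNamespace


def maybe_compile(model, torch_module):
    """Return torch_module.compile(model) if available, else the model unchanged."""
    compile_fn = getattr(torch_module, "compile", None)
    if compile_fn is None:
        # torch < 2.0 has no torch.compile; fall back to eager execution.
        return model
    return compile_fn(model)


# Stand-ins so the sketch runs without PyTorch installed (illustration only).
old_torch = SimpleNamespace()  # mimics torch 1.x: no `compile` attribute
new_torch = SimpleNamespace(compile=lambda m: ("compiled", m))  # mimics torch >= 2.0

print(maybe_compile("net", old_torch))  # net
print(maybe_compile("net", new_torch))  # ('compiled', 'net')
```

In real code the call would look like maybe_compile(network, torch) with the imported torch module. Code that references torch._dynamo directly would need its own guard, since this helper only covers the torch.compile call.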

richard-ch commented 3 months ago

Hi,

Thanks for your reply.

After commenting out the lines that use torch.compile and torch._dynamo (including the uses of OptimizedModule) and editing the dependencies in pyproject.toml, the terminal now also says that batchgeneratorsv2 0.1.1 depends on torch>=2.0.0. Should I follow the same process as for torch.compile and torch._dynamo? I checked torch 2.0.0 on PyTorch's website, and the lowest compatible CUDA version is 11.7, which is just above the highest CUDA version available on one of the HPCs (11.6). I assume torch 2.0.0 won't work even if I switch to CUDA 11.6?

Cheers for the help!

ancestor-mithril commented 3 months ago

Batchgenerators should work with older versions of pytorch without any change.
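The torch>=2.0.0 message quoted earlier in the thread comes from the metadata a package declares, which is separate from whether its code actually runs on an older PyTorch. To see what an installed distribution declares, something like the following sketch may help. It uses the standard-library importlib.metadata; torch_requirements is a hypothetical helper name, and the output depends entirely on what is installed in your environment.

```python
import re
from importlib.metadata import PackageNotFoundError, requires


def torch_requirements(dist_name: str) -> list:
    """Return the requirement strings of `dist_name` that name torch itself."""
    try:
        declared = requires(dist_name) or []  # None when nothing is declared
    except PackageNotFoundError:
        return []  # distribution is not installed in this environment
    # Match 'torch' but not 'torchvision' / 'torchaudio'.
    pattern = re.compile(r"torch(?![A-Za-z0-9._-])")
    return [req for req in declared if pattern.match(req)]


# Prints the declared torch constraint(s), or [] if the package is not installed.
print(torch_requirements("batchgeneratorsv2"))
```

This only reads declared metadata; it does not test compatibility.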