PyTorch-CUDA version compatibility problem in spyglass-position environment

sytseng commented 8 months ago

The current Frank lab GPU servers use CUDA 11.6, but the current environment_position.yml specifies the following dependencies:

pytorch<1.12.0
torchvision
torchaudio
cudatoolkit=11.3

This resolves to PyTorch 1.7.1.post2, which does not recognize any GPUs on the lab servers due to a CUDA incompatibility (probably because cudatoolkit is pinned to version 11.3).

Bug behavior: torch.cuda.is_available() returns False.

torch.cuda.current_device() raises the following error:

AssertionError: Torch not compiled with CUDA enabled
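
For anyone trying to reproduce this, the two checks above can be gathered into one report. This is only a minimal sketch: the helper name describe_torch_cuda is made up for illustration and is not part of spyglass. It relies on torch.version.cuda, which is None when the installed PyTorch is a CPU-only build.

```python
import importlib.util


def describe_torch_cuda():
    """Collect the facts relevant to a 'GPU not detected' report.

    Hypothetical helper for illustration; not part of spyglass.
    """
    # Avoid an ImportError if the environment has no torch at all.
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the binary was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    # Only query the device count when CUDA is usable.
    info["device_count"] = (
        torch.cuda.device_count() if info["cuda_available"] else 0
    )
    return info


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

If built_with_cuda comes back as None, the environment most likely picked up a CPU-only PyTorch build, which would be consistent with the AssertionError above.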

I have not started using the DLC pipeline, so I don't know the impact of this issue. Other people currently seem to be using the GPUs on the lab servers without any issues, but in the future we may need to update environment_position.yml or add notes about installing the correct PyTorch version.

samuelbray32 commented 6 months ago

I tried building the environment without pinning cudatoolkit, and the problem persists.

edeno commented 6 months ago

What version of cudatoolkit did it end up using?

And just so I understand, this is on zephyr and breeze, but you've tested on your local machine and everything has been fine?
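
One way to answer the question about which cudatoolkit version the solver ended up with is to read the package records that conda writes into the environment's conda-meta directory. The sketch below assumes a conda-managed environment; the function name conda_package_builds is hypothetical. The build string is included because it usually indicates whether a package is a CUDA or CPU-only build, though that naming convention is an assumption and not guaranteed.

```python
import json
import sys
from pathlib import Path


def conda_package_builds(names, prefix=None):
    """Return {name: (version, build)} for the requested conda packages.

    Hypothetical helper for illustration. `prefix` defaults to the
    environment of the running interpreter (sys.prefix).
    """
    meta_dir = (Path(prefix) if prefix else Path(sys.prefix)) / "conda-meta"
    found = {}
    if not meta_dir.is_dir():
        # Not a conda environment (or an unexpected layout).
        return found

    for record in sorted(meta_dir.glob("*.json")):
        try:
            data = json.loads(record.read_text())
        except (OSError, ValueError):
            continue  # skip unreadable or malformed records
        if isinstance(data, dict) and data.get("name") in names:
            found[data["name"]] = (data.get("version"), data.get("build"))
    return found


if __name__ == "__main__":
    wanted = {"pytorch", "cudatoolkit", "torchvision", "torchaudio"}
    for name, (version, build) in conda_package_builds(wanted).items():
        print(f"{name}: version={version} build={build}")
```

Running this inside the activated spyglass-position environment on both the lab server and a local machine would give directly comparable output.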

LorenFrankLab/spyglass, issue #710