nickrrose closed this issue 9 months ago
Hi @nickrrose ,
Are you trying to install the dependencies for TCRmodel using AlphaFold's Docker image? If you are trying to set up the environment, it's possible to avoid Docker: you can install each package individually using conda, pip, and wget (see Option 2: Step-by-Step Installation in the README). That way you would have more flexibility and be able to work with a different CUDA version.
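Roughly, the Option 2 route looks like this (a sketch only — the exact package names, versions, and channels are the ones listed in the README; the environment name here is arbitrary, and the jaxlib wheel must match your CUDA version):

```shell
# Sketch of a step-by-step (non-Docker) install; exact packages/versions
# come from the TCRmodel2 README. The environment name is arbitrary.
conda create -n tcrmodel2 -y python=3.8
conda activate tcrmodel2

# Alignment tools are typically installed via conda (bioconda channel):
conda install -y -c conda-forge -c bioconda hmmer hhsuite kalign2

# Python dependencies via pip; pick the jaxlib wheel that matches the
# CUDA version on your nodes (example shown for CUDA 12):
pip install "jax[cuda12_pip]" \
  -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
```

Because nothing here requires root or Docker, this route usually works on shared clusters where you only have user-level access.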
Best, Rui
Hello! Unfortunately, I am running TCRmodel2 on nodes that can only be accessed through Slurm (the only nodes on my remote cluster with GPUs), so I don't think installing the dependencies in a local conda environment will work (unless I am mistaken). For now I am just trying to modify the AlphaFold Docker image to run on CUDA 12.1 (which is what is known to work on our nodes), but I am still having a bit of trouble. Thanks for the help though!
If relevant: my specific issue is that the CUDA and cuDNN libraries are not being found: `Could not load library libcudnn_ops_infer.so.8. Error: libcublas.so.11: cannot open shared object file: No such file or directory`
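For what it's worth, this is how I've been checking whether the dynamic linker can actually see those libraries (the paths below are examples; actual locations vary by system):

```shell
# Probe the linker cache for the libraries named in the error message.
ldconfig -p | grep -E 'libcublas|libcudnn_ops_infer' \
  || echo "libcublas/libcudnn not on the linker path"

# If the files exist on disk but aren't found, LD_LIBRARY_PATH can be
# pointed at them, e.g. (example path, adjust to your install):
# export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64:$LD_LIBRARY_PATH
```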
Understood. Based on what you described, the simplest approach would be to reach out to your cluster administrator for assistance with the CUDA version. They have direct access to the cluster's configuration and are best equipped to offer efficient help in addressing the specific library issues you're encountering. I'm going to close the ticket for now, but please feel free to reopen if you have any additional questions!
I have been trying to do exactly that for a while now. I have limited Docker experience, but the error I keep getting is the following: `Could not load library libcudnn_ops_infer.so.8. Error: libcublas.so.11: cannot open shared object file: No such file or directory`
I edited the AlphaFold Dockerfile to use CUDA 11.2, but I'm not sure if I need to have our administrator install CUDA 11.2 on the nodes to make this work.
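My understanding (happy to be corrected) is that with the NVIDIA container toolkit the CUDA toolkit and cuDNN live inside the image, so the administrator shouldn't need to install CUDA 11.2 on the host — the host only needs an NVIDIA driver new enough for CUDA 11.2 plus the container runtime. Something like this should confirm it (the image tag is just an example of an official CUDA 11.2 + cuDNN 8 image on Docker Hub):

```shell
# If the host driver supports CUDA 11.2, this prints the usual
# nvidia-smi table from inside a CUDA 11.2 / cuDNN 8 container,
# without CUDA being installed on the host itself.
docker run --rm --gpus all \
  nvidia/cuda:11.2.2-cudnn8-runtime-ubuntu20.04 nvidia-smi
```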