Closed: avilella closed this issue 3 years ago
Hi @avilella
Can you try rebooting the system and then running the command again? If that does not work, try uninstalling CUDA and cuDNN, rebooting, reinstalling them, rebooting again, and then running AF2. Sometimes packages do not install properly.
This is the error TensorFlow is raising when you are trying to run it:
E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error
This could be due to several factors, so reinstalling the required packages might fix the issue.
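Before reinstalling anything, it can help to narrow the failure down. A minimal sketch, assuming a Linux system: if nvidia-smi itself fails, or the nvidia kernel module is not loaded, the cuInit error is a driver-level problem rather than an AlphaFold one.

```shell
# Quick driver-level checks before reinstalling packages.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L   # lists detected GPUs; an error here points at the driver
else
    echo "nvidia-smi not found: NVIDIA driver not installed or not on PATH"
fi

# Is the nvidia kernel module loaded?
if lsmod 2>/dev/null | grep -q '^nvidia'; then
    echo "nvidia kernel module loaded"
else
    echo "nvidia kernel module not loaded"
fi
```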
Here is what I tried:
I did a sudo apt remove of the nvidia and cuda related drivers. Then rebooted. Then did a sudo apt install of the nvidia driver nvidia-driver-460, which is actually not the one recommended by the command ubuntu-drivers devices, which in my case was nvidia-driver-470.
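The sequence above can be sketched as a script. This is a dry run by default, so it only prints the commands; set APPLY=1 to actually execute them. The driver version used here (470) is just what ubuntu-drivers devices recommended on this machine and may differ on yours.

```shell
# Dry-run sketch of the driver reinstall sequence from this thread.
# Prints what it would do by default; APPLY=1 executes for real.
apt_do() {
    if [ "${APPLY:-0}" = "1" ]; then
        sudo apt "$@"
    else
        echo "would run: sudo apt $*"
    fi
}

apt_do remove --purge 'nvidia-*'            # 1. remove existing driver packages
echo "-- reboot here --"
ubuntu-drivers devices 2>/dev/null || true  # 2. see which driver Ubuntu recommends
apt_do install nvidia-driver-470            # 3. install the recommended driver
echo "-- reboot again, then re-run run_alphafold --"
```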
I re-ran the run_alphafold script, and this time it recognised the GPU (it only warns that it can't find a TPU, of which there isn't one).
It then goes on to the HHsearch step, which takes a few minutes, but then failed later at the prediction step. Googling the error suggested it now needed the CUDA toolkit, so I installed it with sudo apt install nvidia-cuda-toolkit, rebooted, and tried again.
After that, it seems to work, and I can see in nvidia-smi that GPU memory usage grows as the predict step goes on.
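For anyone wanting to watch that memory usage, a one-shot query like the following works (a sketch; assumes nvidia-smi is on PATH — append -l 2 to refresh every two seconds instead of printing once):

```shell
# Print current GPU memory use once; add `-l 2` to poll every 2 seconds.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=memory.used,memory.total --format=csv
else
    echo "nvidia-smi not found"
fi
```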
Thanks for the advice!
I am trying to run the non-docker version of alphafold2 in this repo. I succeeded in doing so on an AWS GPU instance whose GPU has 16 GB of memory, and for the proteins I am inputting it peaks at around 3 GB of GPU memory utilisation, judging by nvidia-smi while alphafold2 is running. I am now trying the same on a laptop that has an Nvidia GPU with 4 GB of memory (see info below), but so far I am unable to make the same run_alphafold command see the GPU. Any ideas?: