Closed adi1bioinfo closed 2 years ago
Most of the time is taken by running the MSA tools -- see "features": 26155.667644023895, i.e. about 7.3 hours spent in the data pipeline before the model runs. This is normal and expected; you could perhaps make it a bit faster by using a faster SSD and/or compiling HHblits/Jackhmmer with -march=native.
We then run 5 models sequentially, each taking about 1700 seconds, but that figure also includes the time to build the model, which is substantial.
The numbers reported in the paper are purely for the prediction part for a single model, which seems consistent to me with the numbers you are seeing.
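To make the breakdown concrete, here is a minimal sketch of the arithmetic. It assumes the timings are a flat mapping of step name to seconds. Only the "features" value comes from this thread; the model_1 to model_5 keys are hypothetical placeholders (not the real key names in timings.json), each set to the approximate 1700 seconds mentioned above.

```python
def summarize_timings(timings):
    """Split a {step_name: seconds} dict into data-pipeline vs. model time."""
    features_s = timings.get("features", 0.0)
    # Everything that is not the feature/MSA step is counted as model time.
    model_s = sum(v for k, v in timings.items() if k != "features")
    total_s = features_s + model_s
    return {
        "features_hours": features_s / 3600,
        "models_hours": model_s / 3600,
        "total_hours": total_s / 3600,
        "features_fraction": features_s / total_s if total_s else 0.0,
    }


# "features" is the value quoted in this thread.
# The model_* keys and their 1700 s values are illustrative assumptions.
example = {"features": 26155.667644023895}
example.update({f"model_{i}": 1700.0 for i in range(1, 6)})

summary = summarize_timings(example)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the MSA/feature step accounts for roughly three quarters of the wall-clock time, which is why the end-to-end run lands in the 8-10 hour range even though a single model prediction is much shorter.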
Thank you for clarifying this for me.
I have installed AlphaFold2 using the non-docker method (https://github.com/kalininalab/alphafold_non_docker) on an HPC cluster (I don't have root privileges), and I am running the script on a GPU (V100 with 16 GB of memory). For a sequence of around 200 amino acids, structure determination takes around 8-10 hours. The AlphaFold2 paper (https://www.nature.com/articles/s41586-021-03819-2.pdf) states that "Representative timings for the neural network using a single model on V100 GPU are 4.8 min with 256 residues, 9.2 min with 384 residues and 18 h at 2,500 residues". I have pasted the output logs below. Any idea why it is running so slowly in my case, even though I am using the same GPU?
Components of the output file generated by the HPC:
timings.json file output
Thank you, Aditi