deepmodeling / Uni-Mol

Official Repository for the Uni-Mol Series Methods
MIT License

Longer than expected running time for Binding Pose Prediction #186

Open tsa87 opened 1 year ago

tsa87 commented 1 year ago

In the Uni-Mol paper, the average number of seconds per ligand is 0.2.

Efficiency benchmark We compare Uni-Mol binding pose prediction with popular docking tools in efficiency. The baseline results are taken from EquiBind [99] paper. And Uni-Mol binding pose prediction is run on a single V100 GPU. For each molecule, Uni-Mol is run with 10 different initial conformations, and the total time cost is reported. As shown in Table 20, Uni-Mol is significantly faster than traditional docking tools, about 250x faster.


I ran the docking pose prediction on the provided test.lmdb, and on a single RTX 3090 it took much longer than expected: about 3 seconds per ligand on average. Was anything done to improve the run time?

This step was relatively fast ~36 seconds:

data_path="./protein_ligand_binding_pose_prediction"  # replace with your data path
results_path="./infer_pose"  # replace with your results path
weight_path="./save_pose/checkpoint.pt"
batch_size=8
dist_threshold=8.0
recycling=3

python ./unimol/infer.py --user-dir ./unimol $data_path --valid-subset test \
       --results-path $results_path \
       --num-workers 8 --ddp-backend=c10d --batch-size $batch_size \
       --task docking_pose --loss docking_pose --arch docking_pose \
       --path $weight_path \
       --fp16 --fp16-init-scale 4 --fp16-scale-window 256 \
       --dist-threshold $dist_threshold --recycling $recycling \
       --log-interval 50 --log-format simple

But this step took around 15 mins for 285 ligand/pocket pairs.

nthreads=20  # Num of threads
predict_file="./infer_pose/save_pose_test.out.pkl"  # Your inference file dir
reference_file="./protein_ligand_binding_pose_prediction/test.lmdb"  # Your reference file dir
output_path="./protein_ligand_binding_pose_prediction"  # Docking results path

python ./unimol/utils/docking.py --nthreads $nthreads --predict-file $predict_file --reference-file $reference_file --output-path $output_path

ZhouGengmo commented 12 months ago

The time in the table refers to the inference time in the binding pose prediction task.

tsa87 commented 12 months ago

Does the binding pose prediction task include the time for docking.py? I believe calling docking.py is required to compute the coordinates for the atoms based on the predicted intermolecular distance.
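To illustrate the idea behind the question (recovering atom coordinates from predicted ligand-pocket distances), here is a minimal sketch. It is not the code in docking.py, and it makes no claim about how Uni-Mol implements this step. All function names, the toy pocket, and the simple gradient descent with step halving are assumptions chosen for illustration only.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def loss(ligand, pocket, target):
    """Mean squared error between current and target ligand-pocket distances."""
    total = 0.0
    for i, atom in enumerate(ligand):
        for j, site in enumerate(pocket):
            total += (distance(atom, site) - target[i][j]) ** 2
    return total / (len(ligand) * len(pocket))


def gradient(ligand, pocket, target):
    """Gradient of the loss with respect to every ligand coordinate."""
    scale = 2.0 / (len(ligand) * len(pocket))
    grads = []
    for i, atom in enumerate(ligand):
        g = [0.0, 0.0, 0.0]
        for j, site in enumerate(pocket):
            d = max(distance(atom, site), 1e-8)  # avoid division by zero
            coeff = scale * (d - target[i][j]) / d
            for k in range(3):
                g[k] += coeff * (atom[k] - site[k])
        grads.append(g)
    return grads


def fit_coordinates(init, pocket, target, steps=500, lr=1.0):
    """Gradient descent; a step is kept only if it lowers the loss."""
    ligand = [list(a) for a in init]
    current = loss(ligand, pocket, target)
    for _ in range(steps):
        grads = gradient(ligand, pocket, target)
        trial = [[a[k] - lr * g[k] for k in range(3)]
                 for a, g in zip(ligand, grads)]
        trial_loss = loss(trial, pocket, target)
        if trial_loss < current:
            ligand, current = trial, trial_loss
        else:
            lr *= 0.5  # step was too large, shrink it
    return ligand, current


# Toy data (hypothetical): five pocket atoms and a two-atom "ligand".
random.seed(0)
pocket = [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [0.0, 4.0, 0.0],
          [0.0, 0.0, 4.0], [4.0, 4.0, 4.0]]
true_ligand = [[1.0, 1.5, 2.0], [2.5, 1.0, 1.5]]
# Stand-in for the model's predicted distance matrix.
target = [[distance(a, s) for s in pocket] for a in true_ligand]

init = [[random.uniform(0.0, 4.0) for _ in range(3)] for _ in true_ligand]
initial_loss = loss(init, pocket, target)
fitted, final_loss = fit_coordinates(init, pocket, target)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.6f}")
```

An optimisation of this kind runs once per ligand and per initial conformation, so it is plausible that a post-processing step like this costs more wall-clock time than the batched model forward pass.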

ZhouGengmo commented 12 months ago

This time is just the time for model inference in this task.
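For reference, the approximate timings reported earlier in this thread can be converted to per-ligand figures. This is a back-of-the-envelope sketch using only the numbers quoted above (about 36 seconds for infer.py, about 15 minutes for docking.py, 285 ligand/pocket pairs); the variable names are made up for illustration.

```python
# Figures quoted in this thread (approximate, single RTX 3090).
n_pairs = 285                  # ligand/pocket pairs in the run
inference_seconds = 36         # infer.py wall time
docking_seconds = 15 * 60      # docking.py wall time

inference_per_ligand = inference_seconds / n_pairs
docking_per_ligand = docking_seconds / n_pairs
total_per_ligand = inference_per_ligand + docking_per_ligand

print(f"infer.py:   {inference_per_ligand:.2f} s per ligand")  # 0.13
print(f"docking.py: {docking_per_ligand:.2f} s per ligand")    # 3.16
print(f"combined:   {total_per_ligand:.2f} s per ligand")      # 3.28
```

On these numbers, the inference step alone is in the same range as the 0.2 seconds per ligand reported in the paper, while the roughly 3 seconds per ligand observed comes almost entirely from docking.py. The hardware differs (RTX 3090 here, V100 in the paper), so this is only a rough comparison.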