ketatam / DiffDock-PP

Implementation of DiffDock-PP: Rigid Protein-Protein Docking with Diffusion Models in PyTorch (ICLR 2023 - MLDD Workshop)
https://arxiv.org/abs/2304.03889

Offline run of DiffDock-pp #5

Open slieped opened 1 year ago

slieped commented 1 year ago

Consider adding utilities or modifying the code to work with offline computing resources. Most HPC systems do not have a direct internet connection, so using torch.hub to download the ESM model can be problematic!

I came up with a simple solution that could be integrated (or at least mentioned in further examples):

0) Install the esm package via pip: pip install fair-esm (https://github.com/facebookresearch/esm)

1) Download model and regression .pt files: https://dl.fbaipublicfiles.com/fair-esm/models/esm2_t33_650M_UR50D.pt https://dl.fbaipublicfiles.com/fair-esm/regression/esm2_t33_650M_UR50D-contact-regression.pt

2) Import esm function to load precomputed models: from esm.pretrained import load_model_and_alphabet_local

3) Modify data.train.utils.compute_emedding function [325-327]: modelpath = 'path/to/model/esm2_t33_650M_UR50D.pt' esm_model, alphabet = load_model_and_alphabet_local(modelpath)
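The steps above can be sketched as a small helper. This is only a minimal sketch, not the repository's code: `regression_path_for` and `load_esm_offline` are hypothetical names, and it assumes that fair-esm expects the contact-regression file to sit next to the model checkpoint under the name `<model>-contact-regression.pt`, which is the case for the esm2_t33_650M_UR50D files from step 1.

```python
from pathlib import Path


def regression_path_for(model_path):
    """Return the co-located path where the regression weights are expected.

    Assumption: the file is named '<model stem>-contact-regression.pt' and
    lives in the same directory as the model checkpoint.
    """
    model_path = Path(model_path)
    return model_path.with_name(model_path.stem + "-contact-regression.pt")


def load_esm_offline(model_path):
    """Load an ESM checkpoint from local disk (hypothetical helper)."""
    model_path = Path(model_path)
    needed = (model_path, regression_path_for(model_path))
    missing = [str(p) for p in needed if not p.is_file()]
    if missing:
        # Fail early with a clear message instead of a download attempt.
        raise FileNotFoundError("Missing ESM files: " + ", ".join(missing))

    # Imported lazily so the path check also works without fair-esm installed.
    from esm.pretrained import load_model_and_alphabet_local

    return load_model_and_alphabet_local(str(model_path))
```

With both files copied to the cluster, the call in step 3 would then read `esm_model, alphabet = load_esm_offline('path/to/model/esm2_t33_650M_UR50D.pt')`.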

I do know that with torch.hub it's possible to pre-cache the files and then just load the pre-downloaded ones, but it's not ideal. This is just a suggestion to make the tool more scalable and useful for other teams!
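For anyone who prefers the pre-cache route, a rough way to check whether the two files are already where torch.hub would look for them is sketched below. It rests on assumptions: `hub_checkpoint_dir` and `missing_cached_files` are hypothetical helpers, the directory logic mirrors torch's defaults ($TORCH_HOME, else $XDG_CACHE_HOME/torch, else ~/.cache/torch, followed by hub/checkpoints), and it will not match if torch.hub.set_dir() was used. Note also that torch.hub.load may still need the cloned repository source in the hub directory, which this check does not cover.

```python
import os
from pathlib import Path

# File names taken from the download URLs in step 1.
ESM_FILES = (
    "esm2_t33_650M_UR50D.pt",
    "esm2_t33_650M_UR50D-contact-regression.pt",
)


def hub_checkpoint_dir(env=None):
    """Return the assumed default torch.hub checkpoint directory."""
    env = os.environ if env is None else env
    cache_root = env.get("XDG_CACHE_HOME", "~/.cache")
    torch_home = env.get("TORCH_HOME", os.path.join(cache_root, "torch"))
    return Path(torch_home).expanduser() / "hub" / "checkpoints"


def missing_cached_files(env=None):
    """List the ESM files that are not yet present in the cache directory."""
    ckpt_dir = hub_checkpoint_dir(env)
    return [name for name in ESM_FILES if not (ckpt_dir / name).is_file()]
```

An empty list from `missing_cached_files()` suggests the weights are cached; anything else names the files that still need to be copied over.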

Victor M

ketatam commented 1 year ago

Hi!

Thanks a lot for the detailed suggestion. I will test your approach and then update the README.md accordingly.