microsoft / Graphormer

Graphormer is a general-purpose deep learning backbone for molecular modeling.
MIT License
2.15k stars 337 forks

Distributional Graphormer protein-ligand docking data not on Hugging Face #201

Open amelie-iska opened 2 months ago

amelie-iska commented 2 months ago

Hi, it appears the data for the protein-ligand docking is not on Hugging Face. Please advise on how to obtain this.

https://huggingface.co/microsoft/Graphormer/tree/main/Distributional-Graphormer/protein-ligand

VladyslavDoc commented 2 months ago

Hi! I have a similar question. @microsoftopensource, could you help with this issue?

yfukasawa commented 2 months ago

Quite timely! We've run into the same situation. Only the checkpoint file is available for the protein-ligand task.

wbren commented 1 month ago

I have the same issue! Could anyone help with this, please? @zhengsx @shiyu1994 @microsoftopensource @guolinke @volltin

yfukasawa commented 1 week ago

Has anyone made any progress on the protein-ligand part of this project? It seems that even the sample data is not available on Hugging Face.

We took the liberty of reading the code to see whether it is possible to generate results for the samples or for other targets. The core of the protein-ligand process seems to be train_cli.sh, which uses Fairseq. This process appears to load binarized data (possibly after some coordinate encoding) for both the protein and the ligand as a dataset. We haven't yet been able to determine how the protein-ligand module encodes its inputs from the PDB/SDF formats. As a result, the protein-ligand module does not appear to be ready for third-party use yet.
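For anyone who wants to experiment from raw structures in the meantime, here is a minimal sketch of the very first step, reading atom coordinates from PDB text using the standard fixed-column layout. To be clear, this is not the repository's preprocessing, and it does not produce the binarized Fairseq data that train_cli.sh expects; the function name `read_pdb_coordinates` is made up for illustration, and how Graphormer encodes these coordinates afterwards is still unknown to us.

```python
from typing import List, Tuple

# (atom name, residue name, x, y, z), coordinates in angstroms
Atom = Tuple[str, str, float, float, float]


def read_pdb_coordinates(pdb_text: str) -> List[Atom]:
    """Collect coordinates from ATOM/HETATM records of a PDB file.

    Hypothetical helper for illustration only; it relies on the
    fixed-column PDB layout and ignores everything else in the file.
    """
    atoms: List[Atom] = []
    for line in pdb_text.splitlines():
        # Columns 1-6 hold the record name.
        if line[0:6].strip() not in ("ATOM", "HETATM"):
            continue
        # Coordinates end at column 54; skip truncated records.
        if len(line) < 54:
            continue
        atom_name = line[12:16].strip()  # columns 13-16
        res_name = line[17:20].strip()   # columns 18-20
        x = float(line[30:38])           # columns 31-38
        y = float(line[38:46])           # columns 39-46
        z = float(line[46:54])           # columns 47-54
        atoms.append((atom_name, res_name, x, y, z))
    return atoms
```

Calling `read_pdb_coordinates(open("protein.pdb").read())` (the file name is just a placeholder) returns one tuple per atom. Ligands in SDF format would need a separate reader, and the step from these coordinates to the binarized dataset is exactly the part we could not find.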

I hope we can get some feedback from the development team.