bowen-gao / DrugCLIP

[NeurIPS 2023] DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Screening
MIT License

Evaluation on DUD-E dataset #2

Closed by nwod-edispu 6 months ago

nwod-edispu commented 10 months ago

Thanks for your great work!

I have a question about the evaluation procedure on the DUD-E dataset. The training was done using pocket and ligand data, but the DUD-E dataset only provides the whole protein structure (no pocket), which is inconsistent with the training data. As far as I know, a model trained on pockets may not perform well on the whole protein. How do you conduct the evaluation on the DUD-E dataset?

bowen-gao commented 10 months ago

Following previous studies such as Planet, and the usual practice in docking software, we used the co-crystal ligand conformations from the DUD-E test set to extract the binding pockets. We acknowledge that these pockets may not be the actual binding sites of the active and inactive molecules being screened. However, virtual screening research generally operates under the hypothesis that binding to the same pocket as the co-crystal ligand is a prerequisite for eliciting biological function.
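
For readers who want to see what ligand-based pocket extraction can look like, here is a minimal sketch. It is not the code used by the DrugCLIP authors. The function name `extract_pocket`, the 6.0 Å cutoff, and the choice to keep whole residues are all assumptions made for illustration, and the coordinates are toy values rather than DUD-E data.

```python
import numpy as np


def extract_pocket(protein_coords, residue_ids, ligand_coords, cutoff=6.0):
    """Return a boolean mask over protein atoms that belong to the pocket.

    A residue is kept if any of its atoms lies within `cutoff` angstroms of
    any co-crystal ligand atom. The cutoff is an assumed, illustrative value.
    """
    protein_coords = np.asarray(protein_coords, dtype=float)  # (n_protein, 3)
    ligand_coords = np.asarray(ligand_coords, dtype=float)    # (n_ligand, 3)
    residue_ids = np.asarray(residue_ids)                     # (n_protein,)

    # Pairwise protein-ligand distances, shape (n_protein, n_ligand).
    diff = protein_coords[:, None, :] - ligand_coords[None, :, :]
    dists = np.linalg.norm(diff, axis=-1)

    # Atoms that are close to at least one ligand atom.
    close_atoms = dists.min(axis=1) <= cutoff

    # Keep every atom of any residue that has at least one close atom.
    pocket_residues = np.unique(residue_ids[close_atoms])
    return np.isin(residue_ids, pocket_residues)


# Toy example: three residues along the x axis, one ligand atom at the origin.
protein_coords = [[0, 0, 0], [1, 0, 0], [5, 0, 0], [8, 0, 0], [20, 0, 0]]
residue_ids = [1, 1, 2, 2, 3]
ligand_coords = [[0, 0, 0]]

mask = extract_pocket(protein_coords, residue_ids, ligand_coords)
print(mask)
```

In this toy case the atom at x = 8 is farther than the cutoff from the ligand, but it is still kept because another atom of the same residue (x = 5) is within range. Selecting whole residues rather than individual atoms is one possible design choice; it avoids feeding the model fragmented residues.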

nwod-edispu commented 10 months ago

Thanks for your detailed explanation! I really appreciate it, and I understand now.