HannesStark / EquiBind

EquiBind: geometric deep learning for fast predictions of the 3D structure in which a small molecule binds to a protein
MIT License

Questions about RDKit conformer generation, fast point cloud conformer fitting, and EquiBind-R meaning. #44

Closed yufengwhy closed 1 year ago

yufengwhy commented 2 years ago
|       | Code | Paper |
|-------|------|-------|
| Input | PDBBind ground truth coord after rotation and translation | RDKit conformer |
|       | kabsch_rmsd_loss SVD | 3.2.2 closed form solution |
  1. Why is there a mismatch between the code and the paper, as listed above?
  2. Where is the "3.2.2 closed form solution" in the code? Is it applied once or in each epoch?
  3. What is the difference between EquiBind-U and EquiBind-R?
HannesStark commented 2 years ago

1.1 To clarify why there is no mismatch, here are the lines of code through which the RDKit coordinates are used instead of the PDBBind structure:

Here during preprocessing: https://github.com/HannesStark/EquiBind/blob/bdc9c4c32d49681c670abf63bc11a9d4d2a8e090/commons/process_mols.py#L882-L883

https://github.com/HannesStark/EquiBind/blob/bdc9c4c32d49681c670abf63bc11a9d4d2a8e090/commons/process_mols.py#L820-L826

Here when retrieving a sample from the dataset: https://github.com/HannesStark/EquiBind/blob/bdc9c4c32d49681c670abf63bc11a9d4d2a8e090/datasets/pdbbind.py#L171-L174

1.2 and 2: SVD and Section 3.2.2 in the paper: I think this confusion arises because of the assumption that the 3.2.2 mechanism happens during training. However, we only perform the point cloud ligand fitting described in 3.2.2 during inference. The RDKit conformer is matched to the EquiBind-U conformer in that step.

  3. EquiBind-R treats the input ligand as a rigid structure and assumes that the conformation the ligand takes when bound to the receptor is known.
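
For readers unfamiliar with SVD-based rigid alignment (the kind of rotation-plus-translation fit discussed in this thread), here is a minimal NumPy sketch of the standard Kabsch algorithm. It is not the repository's `kabsch_rmsd_loss`; the function name `kabsch_align` and its argument names are hypothetical and chosen only for illustration.

```python
import numpy as np


def kabsch_align(mobile: np.ndarray, target: np.ndarray):
    """Return (R, t) such that mobile @ R.T + t best matches target (least squares).

    Both inputs are (N, 3) arrays with corresponding rows.
    Hypothetical helper for illustration; not taken from the EquiBind code base.
    """
    mobile_mean = mobile.mean(axis=0)
    target_mean = target.mean(axis=0)

    # 3x3 cross-covariance matrix of the centered point sets.
    H = (mobile - mobile_mean).T @ (target - target_mean)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that R is a proper rotation (det = +1),
    # not a reflection.
    d = -1.0 if np.linalg.det(Vt.T @ U.T) < 0 else 1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Translation that maps the rotated mobile centroid onto the target centroid.
    t = target_mean - R @ mobile_mean
    return R, t


def rmsd(a: np.ndarray, b: np.ndarray) -> float:
    """Root-mean-square deviation between two (N, 3) coordinate arrays."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))
```

A typical use would be `R, t = kabsch_align(ligand_coords, predicted_coords)` followed by `rmsd(ligand_coords @ R.T + t, predicted_coords)`. The whole ligand moves as one rigid body, so no bond lengths or angles change.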
yufengwhy commented 2 years ago
  1. How should the configuration be set to produce EquiBind-R and EquiBind-U? Is EquiBind-U simply EquiBind without the 3.2.2 step at inference? Does EquiBind-R take the PDBBind conformer as input and only learn the rotation and translation, without any regularization?
  2. During inference, does 3.2.2 need to be performed once or multiple times? If multiple, how many times?
  3. Is 3.2.2 implemented here?
  4. 3.2.2 is decomposed per bond ij, so these bonds ij are independent of each other. Is that right? Why?
HannesStark commented 1 year ago
  1. EquiBind-U indeed refers to running EquiBind without the point cloud fitting. You are correct that we do not apply any distance geometry regularization in EquiBind-R since it only performs rigid body transformations.
  2. The ligand fitting is performed only once.
  3. Yes, the ligand fitting is implemented here.
  4. In Equation 1 it can be observed that the term we maximize for each bond is independent of the other torsion angles.
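
To make the independence argument concrete, here is a small NumPy sketch. It assumes that the per-bond objective is a sum of cosines of differences between the dihedral angles around that bond in the conformer being fitted and in the target point cloud; that form is an assumption made for illustration, not a transcription of Equation 1. Rotating around one bond shifts every dihedral angle through that bond by the same amount, so under this assumption the best shift has a closed form (a circular mean) and involves no other bond. The names `dihedral` and `optimal_shift` are hypothetical.

```python
import numpy as np


def dihedral(p0, p1, p2, p3) -> float:
    """Signed dihedral angle in radians for four points, measured around the p1-p2 axis."""
    b0 = p0 - p1
    axis = (p2 - p1) / np.linalg.norm(p2 - p1)
    b2 = p3 - p2
    # Components of the outer bonds perpendicular to the rotation axis.
    v = b0 - np.dot(b0, axis) * axis
    w = b2 - np.dot(b2, axis) * axis
    return float(np.arctan2(np.dot(np.cross(axis, v), w), np.dot(v, w)))


def optimal_shift(current: np.ndarray, target: np.ndarray) -> float:
    """Shift delta that maximizes sum_k cos(current_k + delta - target_k).

    `current` and `target` hold the dihedral angles through ONE rotatable bond.
    Because the sum equals A*cos(delta) + B*sin(delta) with
    A = sum cos(target - current) and B = sum sin(target - current),
    the maximizer is atan2(B, A).
    """
    diff = target - current
    return float(np.arctan2(np.sin(diff).sum(), np.cos(diff).sum()))


# Two rotatable bonds, each with its own set of dihedral angles (made-up values).
current = {"bond_a": np.array([0.1, 2.0, -1.5]), "bond_b": np.array([1.0, -0.4])}
target = {"bond_a": np.array([0.5, 1.7, -1.0]), "bond_b": np.array([0.2, -1.3])}

# Each bond is solved on its own; no term couples bond_a to bond_b.
shifts = {bond: optimal_shift(current[bond], target[bond]) for bond in current}
```

Under this assumption each bond needs a single closed-form evaluation, with no iteration and no dependence on the order in which the bonds are processed.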