openkinome / kinoml

Structure-informed machine learning for kinase modeling
https://openkinome.org/kinoml/
MIT License

Ligand Graph Featurizer #38

Closed t-kimber closed 3 years ago

t-kimber commented 3 years ago

Description

Tie up loose ends for the ligand graph featurizer, specifically the atomic features.

By default, we will use the same atomic features as the PotentialNet model, https://doi.org/10.1021/acscentsci.8b00507.

Taken from the PotentialNet paper:

"Deep Neural Networks were constructed and trained with PyTorch.(52) Custom Python code was used based on RDKit(53) and OEChem(54) with frequent use of NumPy(55) and SciPy.(56) Networks were trained on chemical element, formal charge, hybridization, aromaticity, and the total numbers of bonds, hydrogens (total and implicit), and radical electrons. "

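To make the quoted feature list concrete, here is a minimal sketch of how those properties could be read from RDKit atoms. This is not the kinoml implementation: the function names and dictionary keys are made up for illustration, and reading "total numbers of bonds" as the atom degree including hydrogens is an assumption on my part.

```python
from rdkit import Chem


def potentialnet_like_atom_features(atom):
    """Collect the raw atomic properties named in the PotentialNet quote.

    Hypothetical helper for illustration, not the kinoml featurizer.
    """
    return {
        "element": atom.GetSymbol(),
        "formal_charge": atom.GetFormalCharge(),
        # String form of the RDKit hybridization enum.
        "hybridization": str(atom.GetHybridization()),
        "aromatic": atom.GetIsAromatic(),
        # Assumption: "total number of bonds" = degree including hydrogens.
        "n_bonds": atom.GetTotalDegree(),
        "n_hydrogens_total": atom.GetTotalNumHs(),
        "n_hydrogens_implicit": atom.GetNumImplicitHs(),
        "n_radical_electrons": atom.GetNumRadicalElectrons(),
    }


def featurize_smiles(smiles):
    """Return one feature dictionary per atom of the given SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return [potentialnet_like_atom_features(atom) for atom in mol.GetAtoms()]


# Usage: raw (not yet encoded) features for the first carbon of ethanol.
ethanol = featurize_smiles("CCO")
print(ethanol[0])
```

The values are left unencoded here; categorical ones such as element and hybridization would still need one-hot encoding before being fed to a network.
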
Todos

Questions

Status

Notes

For the sake of completeness, let's look at the features implemented in

  1. DeepChem (see code), as used in MoleculeNet:
    • one-hot encoded atomic symbol
    • one-hot encoded degree
    • one-hot encoded implicit valence
    • formal charge
    • number of radical electrons
    • one-hot encoded hybridization type
    • aromaticity
  2. Takayuki Serizawa et al., poster presentation at the RDKit UGM 2019:
    • one-hot atom type
    • one-hot degree
    • one-hot valence
    • formal charge
    • one-hot hybridization type
    • number of radical electrons
    • aromaticity
    • one-hot encoded number of hydrogen atoms
    • partial charge (not RDKit!)
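
Both lists mix one-hot encoded categorical features with plain numeric ones. A minimal pure-Python sketch of how such a per-atom vector could be assembled is shown below. The vocabularies (`SYMBOLS`, `DEGREES`, `HYBRIDIZATIONS`), the extra "unknown" slot, and the input dictionary layout are all assumptions for illustration, and are not taken from DeepChem or from the poster.

```python
def one_hot(value, choices):
    """One-hot encode `value` over `choices`, with a final slot for unknowns."""
    vector = [0] * (len(choices) + 1)
    index = choices.index(value) if value in choices else len(choices)
    vector[index] = 1
    return vector


# Hypothetical vocabularies; a real featurizer would choose its own.
SYMBOLS = ["C", "N", "O", "S", "F", "Cl", "Br", "I"]
DEGREES = [0, 1, 2, 3, 4, 5]
HYBRIDIZATIONS = ["SP", "SP2", "SP3", "SP3D", "SP3D2"]


def encode_atom(props):
    """Concatenate one-hot and numeric features into a flat list."""
    return (
        one_hot(props["symbol"], SYMBOLS)
        + one_hot(props["degree"], DEGREES)
        + [props["formal_charge"], props["radical_electrons"]]
        + one_hot(props["hybridization"], HYBRIDIZATIONS)
        + [int(props["aromatic"])]
    )


# Usage: an aromatic carbon described by a plain dictionary.
aromatic_carbon = {
    "symbol": "C",
    "degree": 2,
    "formal_charge": 0,
    "radical_electrons": 0,
    "hybridization": "SP2",
    "aromatic": True,
}
vector = encode_atom(aromatic_carbon)
print(len(vector), vector)
```

Reserving the last slot for unseen values keeps the vector length fixed when a ligand contains an element or hybridization outside the vocabulary.
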

t-kimber commented 3 years ago

The atomic features can be found here: https://github.com/openkinome/kinoml/blob/8d5a40152936a561a7bffc8ed7bdbeadde42d902/kinoml/features/ligand.py#L322