This PR aims to refactor @t-kimber's scripts for ligand-based models into a library-consumable infrastructure.
Todos
Library:
- [ ] Bring in featurizers:
  - [ ] Normalize Hash Featurizer for protein
  - [ ] Dominique's kinase fingerprint
- [ ] Bring in models
- [ ] Design `Ligand` and `Protein` helper objects for easy handling of raw data and their potential featurizations. The `Ligand` object can subclass `openforcefield.topology.Molecule`, but I am not sure how we should deal with `Protein`...
- [ ] Provide a `DatasetConsumers` (or `*Builder`, `*Handler`, etc.) base class and corresponding subclasses for KINOMEscan, ChEMBL, etc.
- [ ] Add unit tests
- [ ] Make sure docstrings are present and accurate (@t-kimber)
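
To make the helper-object and dataset items above more concrete, here is a rough sketch of one possible shape. Everything in it is an assumption for discussion, not the final API: `MolecularComponent`, `featurize`, `DatasetConsumer`, `records`, `measurements` and `InMemoryDatasetConsumer` are hypothetical names, and `Ligand` is a plain class here (rather than a subclass of `openforcefield.topology.Molecule`) so the sketch runs with only the standard library.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List, Tuple

# (ligand name, ligand raw data, protein name, protein raw data, measured value)
Record = Tuple[str, str, str, str, float]


@dataclass
class MolecularComponent:
    """Raw data for one component plus any featurizations computed from it."""

    name: str
    raw: str  # e.g. a SMILES string for a ligand, a sequence for a protein
    featurizations: Dict[str, Any] = field(default_factory=dict)

    def featurize(self, key: str, featurizer: Callable[[str], Any]) -> Any:
        """Apply `featurizer` to the raw data once and cache the result under `key`."""
        if key not in self.featurizations:
            self.featurizations[key] = featurizer(self.raw)
        return self.featurizations[key]


class Ligand(MolecularComponent):
    """Hypothetical ligand helper; the real one may subclass openforcefield's Molecule."""


class Protein(MolecularComponent):
    """Hypothetical protein helper; its actual design is still an open question."""


class DatasetConsumer(ABC):
    """Hypothetical base class; subclasses would wrap KINOMEscan, ChEMBL, etc."""

    @abstractmethod
    def records(self) -> Iterable[Record]:
        """Yield raw records from the underlying data source."""

    def measurements(self) -> List[Tuple[Ligand, Protein, float]]:
        """Convert raw records into (Ligand, Protein, value) triples."""
        return [
            (Ligand(lig_name, lig_raw), Protein(prot_name, prot_raw), value)
            for lig_name, lig_raw, prot_name, prot_raw, value in self.records()
        ]


class InMemoryDatasetConsumer(DatasetConsumer):
    """Toy subclass backed by a list, used only to exercise the base class."""

    def __init__(self, rows: List[Record]) -> None:
        self._rows = rows

    def records(self) -> Iterable[Record]:
        return iter(self._rows)


# Made-up placeholder row, not real assay data.
consumer = InMemoryDatasetConsumer([("ethanol", "CCO", "toy-protein", "MKV", 1.0)])
ligand, protein, value = consumer.measurements()[0]
ligand.featurize("n_chars", len)  # trivial stand-in for a real featurizer
```

Keeping featurizations cached on the object is only one option; it keeps raw data and derived features together, at the cost of mutable state on `Ligand` and `Protein`.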
Scripts/workflows:
Status
Ligand
Protein
Models