openkinome / kinoml

Structure-informed machine learning for kinase modeling
https://openkinome.org/kinoml/
MIT License

Rework object model #107

schallerdavid closed 2 years ago

schallerdavid commented 2 years ago

Description

The repository has lost some of the uniformity of its object model. This PR aims to remove unused code from the repository, to define the object model and where methods should be placed (ideally documented in a notebook), and finally to apply these recommendations to all existing code.

Todos

Notes

The refactoring also requires touching the dataset providers.

PKIS2 and ChEMBL data providers now allow specifying the protein object to use, i.e. Protein or KLIFSKinase.
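As a rough illustration of what selecting the protein object by name could look like, here is a minimal sketch. The `Protein` and `KLIFSKinase` classes below are simplified stand-ins, and `make_proteins` with its `protein_type` parameter is a hypothetical helper, not the actual provider API.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Protein:
    """Simplified stand-in for a generic protein object."""
    name: str


@dataclass
class KLIFSKinase(Protein):
    """Simplified stand-in for a kinase object with KLIFS annotations."""


# Hypothetical registry mapping the user-facing option to a class.
_PROTEIN_TYPES = {"Protein": Protein, "KLIFSKinase": KLIFSKinase}


def make_proteins(kinase_names: Iterable[str], protein_type: str = "Protein") -> List[Protein]:
    """Build one protein object per kinase name, using the requested class."""
    try:
        protein_class = _PROTEIN_TYPES[protein_type]
    except KeyError:
        allowed = ", ".join(sorted(_PROTEIN_TYPES))
        raise ValueError(
            f"Unknown protein_type {protein_type!r}; expected one of: {allowed}"
        ) from None
    return [protein_class(name=name) for name in kinase_names]
```

A caller would then request, for example, `make_proteins(["ABL1", "EGFR"], protein_type="KLIFSKinase")`, and an unsupported name fails early with a `ValueError`.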

I noticed there are 5 compounds with two measurements in the PKIS2 dataset. I would keep both for now:

Some kinase constructs in the PKIS2 dataset have unclear sequences, i.e. "Null" AA Start/Stop but a "Partial Length" construct length. I would treat them as wild-type, full-length constructs for now.
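The fallback for constructs with missing boundaries could be sketched as below. The function name `construct_range`, its parameters, and the 1-based inclusive numbering are assumptions made for illustration; the sketch covers only the residue range, not the wild-type assumption.

```python
from typing import Optional, Tuple


def _is_missing(value: Optional[str]) -> bool:
    """Treat None, empty strings, and the literal 'Null' as missing."""
    return value is None or str(value).strip().lower() in {"", "null"}


def construct_range(
    aa_start: Optional[str], aa_stop: Optional[str], full_length: int
) -> Tuple[int, int]:
    """Return the (start, stop) residue range of a construct.

    If either boundary is missing, fall back to the full sequence,
    i.e. residues 1 through full_length (assumed 1-based, inclusive).
    """
    if _is_missing(aa_start) or _is_missing(aa_stop):
        return 1, full_length
    return int(aa_start), int(aa_stop)
```

With this helper, a construct annotated as "Null"/"Null" maps to the whole sequence, while a construct with explicit boundaries keeps them.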

Status

codecov-commenter commented 2 years ago

Codecov Report

Merging #107 (313e6f1) into master (2d0e6ca) will increase coverage by 1.80%. The diff coverage is 82.75%.

schallerdavid commented 2 years ago

Hey @jchodera,

in case you find some spare time, it would be great if you could have a look at the new Ligand object.

The main idea was to make it simpler and let the OpenFF Toolkit do most of the work. In the new implementation, the Ligand object has a molecule attribute, which is an OpenFF molecule. This way we have access to all OpenFF functionality. The molecule getter always checks whether a SMILES string was defined and, if so, returns an OpenFF molecule built from it, which allows for lazy instantiation.
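To make the lazy instantiation concrete, here is a minimal sketch of the idea, not the code in this PR. The attribute names, the caching of the built molecule, and the injectable `molecule_factory` parameter are assumptions for illustration. The default factory relies on `Molecule.from_smiles` from the OpenFF Toolkit and imports it only when first needed.

```python
from typing import Any, Callable, Optional


def _openff_from_smiles(smiles: str) -> Any:
    """Build an OpenFF molecule; imported lazily so the toolkit is optional here."""
    from openff.toolkit.topology import Molecule

    return Molecule.from_smiles(smiles)


class Ligand:
    """Sketch of a ligand whose molecule is only built when first accessed."""

    def __init__(
        self,
        smiles: Optional[str] = None,
        molecule: Any = None,
        name: str = "",
        molecule_factory: Callable[[str], Any] = _openff_from_smiles,
    ) -> None:
        self.smiles = smiles
        self.name = name
        self._molecule = molecule
        # Hypothetical hook so the factory can be swapped, e.g. in tests.
        self._molecule_factory = molecule_factory

    @property
    def molecule(self) -> Any:
        # Build the molecule from SMILES on first access and cache it
        # (caching is an assumption of this sketch).
        if self._molecule is None and self.smiles is not None:
            self._molecule = self._molecule_factory(self.smiles)
        return self._molecule

    @molecule.setter
    def molecule(self, new_value: Any) -> None:
        self._molecule = new_value
```

Creating `Ligand(smiles="CCO")` does no cheminformatics work by itself; the molecule is only built when `ligand.molecule` is first read.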

Cheers, David

schallerdavid commented 2 years ago

Also addresses #66.