OpenSourceMalaria / Series4_PredictiveModel

Can we Predict Active Compounds in OSM Series 4?

holeung_ng_predictions #12

Closed holeung closed 5 years ago

holeung commented 5 years ago

Used the homology model I created in Round 1. Docked the 5 most potent compounds into the homology model to find a common binding site and pose. The inhibitors bind in the same site, but a more consistent pose is roughly flipped relative to the one I found in Round 1. Energy-minimized the model bound to the top inhibitor in this pose.

Used OpenEye Posit to dock all molecules into this orientation. For molecules whose activity is reported only as a bound (values like <5, <20, etc.), replaced these values with 30, 90, 300, 900 uM. Calculated pIC50 from all IC50s.
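
The pIC50 conversion can be sketched as below. This is a minimal illustration, not the script used for the submission: the function names are hypothetical, and the substitution table is passed in by the caller because the exact mapping from each bound to its replacement value is not spelled out above. The only fixed fact is the definition pIC50 = -log10(IC50 in molar), which is 6 - log10(IC50 in uM).

```python
import math


def pic50_from_ic50_um(ic50_um):
    """Convert an IC50 in micromolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    # 1 uM = 1e-6 M, so -log10(x * 1e-6) = 6 - log10(x)
    return 6.0 - math.log10(ic50_um)


def resolve_activity(raw, substitutes):
    """Return a numeric IC50 (uM) for a raw activity string.

    Entries reported as a bound (starting with '<' or '>') are looked up
    in `substitutes`, a caller-supplied dict (hypothetical helper).
    """
    raw = raw.strip()
    if raw.startswith(("<", ">")):
        return substitutes[raw]
    return float(raw)


if __name__ == "__main__":
    # Assumed example mapping, for illustration only.
    substitutes = {"<5": 30.0}
    for raw in ["0.25", "<5"]:
        ic50 = resolve_activity(raw, substitutes)
        print(raw, ic50, round(pic50_from_ic50_um(ic50), 3))
```

Keeping the substitution table separate from the conversion makes it easy to check how sensitive the model is to the replacement values chosen.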

Calculated 1D, 2D, and 3D features using Mordred. Rescored poses with AutoDock Vina and RF-Score v3. Added these scores and the Posit probability scores as features.

Used an XGBoost regressor to build and fit the model. Used 10-fold cross-validation to optimize hyperparameters. Applied the model to the prediction set.
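
For readers unfamiliar with 10-fold cross-validation, the splitting step can be sketched in plain Python as below. This does not use XGBoost and is not the code behind this submission; `k_fold_indices` is a hypothetical helper, and the shuffling seed is an assumption. In practice a library splitter such as scikit-learn's `KFold` would typically do this job, with the regressor refit on each training split for every candidate hyperparameter setting.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, held_out_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


if __name__ == "__main__":
    for train, held_out in k_fold_indices(25, k=10):
        # A model would be fit on `train` and scored on `held_out` here.
        print(len(train), len(held_out))
```

Each molecule is held out exactly once, so the averaged held-out error gives one score per hyperparameter setting.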

Carboranes crash most software. For the one carborane molecule in the prediction set, OSM-LO-1, I just used the average of the IC50s from the two carboranes in the training set.
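
The carborane fallback amounts to the arithmetic below. This is a sketch with a hypothetical function name, and the input values in the usage lines are placeholders, not the training-set IC50s. Note that averaging IC50s and then converting to pIC50 generally gives a different number than averaging the pIC50s directly; the sketch follows the description above and averages the IC50s.

```python
import math
from statistics import mean


def carborane_fallback_pic50(training_ic50s_um):
    """Average the training-set IC50s (uM), then convert to pIC50."""
    mean_ic50_um = mean(training_ic50s_um)
    return 6.0 - math.log10(mean_ic50_um)


if __name__ == "__main__":
    # Placeholder IC50s in uM, for illustration only.
    placeholder_ic50s = [1.0, 100.0]
    print(carborane_fallback_pic50(placeholder_ic50s))
```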