deepchem / deepchem

Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry, Materials Science and Biology
https://deepchem.io/
MIT License

Using optimized hyperparameters to build the best model #1598

Open sumuduleelananda opened 5 years ago

sumuduleelananda commented 5 years ago

I'm using GraphConvModel to train for 3 tasks. Since the results weren't very good I used GPGO hyperparameter optimization. I use deepchem version '2.1.1'.

These are the optimized hyperparameters obtained from gpgo.getResult():

(OrderedDict([('dense_layer_size', 173.0), ('graph_conv_depth', 1.0), ('learning_rate', 0.002189719495867233), ('graph_conv_layer_size', 50.0), ('dropout', 0.06547156784983114)]), 0.6260067005868888)

What is returned is valid_scores["mean-roc_auc_score"] (i.e. 0.6260067005868888 is the average ROC_AUC of the 3 tasks for the valid set).
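For readers unfamiliar with the metric, the returned value is described above as the average ROC_AUC over the 3 tasks. The following is a minimal pure-Python sketch of that averaging. It is not DeepChem's implementation; the function names and toy data are made up for illustration, and it ignores the per-sample weights (valid_dataset.w) that a weighted evaluation would use.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_roc_auc(y_true_by_task, y_score_by_task):
    """Unweighted average of the per-task AUCs."""
    aucs = [roc_auc(t, s) for t, s in zip(y_true_by_task, y_score_by_task)]
    return sum(aucs) / len(aucs)


# Toy labels and scores for 3 hypothetical tasks (not data from this issue).
labels = [[0, 0, 1, 1], [0, 1, 0, 1], [1, 0, 1, 0]]
scores = [[0.1, 0.4, 0.35, 0.8], [0.2, 0.9, 0.6, 0.7], [0.3, 0.3, 0.9, 0.1]]
per_task = [roc_auc(t, s) for t, s in zip(labels, scores)]
average = mean_roc_auc(labels, scores)
```

Because the score is a mean, one weak task can pull it down even when the other two look fine, so it is worth comparing per-task values as well as the average.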

I created a new model with these best parameters and refitted the training dataset like this:


mod = dc.models.GraphConvTensorGraph(
    len(tasks),
    batch_size=50,
    mode='classification',
    learning_rate=0.002189719495867233,
    dense_layer_size=173.0,
    graph_conv_depth=1.0,
    graph_conv_layer_size=50.0,
    dropout=0.06547156784983114,
)

mod.fit(train_dataset)

I then plotted the ROC curves for the 3 tasks (valid set):

y_pred = mod.predict(valid_dataset)
for i in range(len(tasks)):
    plot_roc(valid_dataset.y[:, i], y_pred[:, i, 1], valid_dataset.w[:, i], title='Valid ROC: ' + tasks[i])

The problem is that I expected to get the same average ROC_AUC when predicting on the valid dataset with this model, but I don't. The average is now even worse than what I had before optimization, so I don't think the way I created the model with the optimized parameters worked.

How do I use the optimized hyperparameters to get the exact optimized model?

Thank you!

vsomnath commented 5 years ago

Could you add the code you used for loading and hyper-parameter optimization?

sumuduleelananda commented 5 years ago

This is how the dataset is first loaded (it has 3 tasks and smiles):

filename = 'data.csv'
featurizer = dc.feat.ConvMolFeaturizer()
loader = dc.data.CSVLoader(tasks, smiles_field='smiles', featurizer=featurizer)
dataset = loader.featurize(filename)

def hyper_model(dense_layer_size, graph_conv_depth, graph_conv_layer_size, **params):
    # The optimizer proposes floats, so round the integer-valued parameters.
    dense_layer_size = int(round(dense_layer_size))
    graph_conv_depth = int(round(graph_conv_depth))
    graph_conv_layer_size = int(round(graph_conv_layer_size))
    model = dc.models.GraphConvModel(
        len(tasks),
        batch_size=50,
        mode='classification',
        graph_conv_layers=[graph_conv_layer_size] * graph_conv_depth,
        dense_layer_size=dense_layer_size,
        **params
    )
    model.fit(train_dataset)
    valid_scores, _ = model.evaluate(valid_dataset, [roc_auc_metric], per_task_metrics=True)
    return valid_scores["mean-roc_auc_score"]

cov = matern32()
gp = GaussianProcess(cov)
acq = Acquisition(mode='ExpectedImprovement')
params_dict = {
    'dense_layer_size': ('int', [32, 256]),
    'graph_conv_layer_size': ('int', [16, 128]),
    'graph_conv_depth': ('int', [1, 4]),
    'dropout': ('cont', [.0, .2]),
    'learning_rate': ('cont', [.0, .005])
}

gpgo = GPGO(gp, acq, hyper_model, params_dict)
gpgo.run(max_iter=10)
gpgo.getResult()

Thank you for taking a look at it. Please let me know if you need more information from me.
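One difference between the two snippets in this thread may be worth checking, offered as a sketch rather than a confirmed diagnosis. hyper_model rounds the three integer-valued parameters and passes graph_conv_layers=[graph_conv_layer_size] * graph_conv_depth, while the refit call passes the raw floats under the names graph_conv_depth and graph_conv_layer_size. If the model constructor does not recognize those two names (an assumption, not verified here against deepchem 2.1.1), the refit model would be built with a different architecture from the one the optimizer scored. The helper below, whose name to_model_kwargs is hypothetical, applies the same mapping as hyper_model to the result dict.

```python
from collections import OrderedDict


def to_model_kwargs(best_params):
    """Turn the optimizer's result dict into the kwargs hyper_model builds."""
    params = dict(best_params)  # copy so the caller's dict is untouched
    dense = int(round(params.pop('dense_layer_size')))
    depth = int(round(params.pop('graph_conv_depth')))
    size = int(round(params.pop('graph_conv_layer_size')))
    # Same construction as inside hyper_model: one layer size repeated depth times.
    return dict(graph_conv_layers=[size] * depth, dense_layer_size=dense, **params)


# The result dict reported at the top of this issue.
best_params = OrderedDict([
    ('dense_layer_size', 173.0),
    ('graph_conv_depth', 1.0),
    ('learning_rate', 0.002189719495867233),
    ('graph_conv_layer_size', 50.0),
    ('dropout', 0.06547156784983114),
])
model_kwargs = to_model_kwargs(best_params)
```

The refit would then mirror the call in hyper_model: dc.models.GraphConvModel(len(tasks), batch_size=50, mode='classification', **model_kwargs). Even with identical hyperparameters, a retrained network will not necessarily reproduce the earlier score exactly, since weight initialization and batch order can vary between runs.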

hjzhang1018 commented 3 years ago

Hi @sumuduleelananda, I ran into the same situation recently. Were you able to solve this problem (how to use the optimized hyperparameters to get the exact optimized model)? Thank you!
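On getting the exact optimized model: one option that avoids retraining is to keep a reference to the best model while the optimizer runs. The sketch below is generic Python, not a DeepChem or GPGO feature. BestTracker and fake_train_and_score are hypothetical names, and the stand-in training function takes the place of the fit-and-evaluate step in hyper_model. It assumes the optimizer calls the objective with keyword arguments, as the hyper_model signature suggests.

```python
class BestTracker:
    """Wrap an objective so the best-scoring trained object is kept."""

    def __init__(self, train_and_score):
        # train_and_score(**params) must return a (model, score) pair.
        self.train_and_score = train_and_score
        self.best_score = float('-inf')
        self.best_model = None
        self.best_params = None

    def __call__(self, **params):
        model, score = self.train_and_score(**params)
        if score > self.best_score:
            # Remember the winner; earlier models can be garbage-collected.
            self.best_score = score
            self.best_model = model
            self.best_params = dict(params)
        return score  # the optimizer only ever sees the score


def fake_train_and_score(x):
    """Stand-in for training: the 'model' is a dict, the score peaks at x = 2."""
    return {'x': x}, -(x - 2.0) ** 2


objective = BestTracker(fake_train_and_score)
for candidate in (0.0, 1.5, 3.0):  # stand-in for the optimizer's proposals
    objective(x=candidate)
```

In the setting of this issue, the wrapped function would build, fit and evaluate the model as hyper_model does but return the model along with the score, and objective would be passed to GPGO in place of hyper_model. Afterwards objective.best_model is the very object that produced the reported validation score.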