aspuru-guzik-group / olympus

Olympus: a benchmarking framework for noisy optimization and experiment planning
https://aspuru-guzik-group.github.io/olympus/
MIT License

Plans for multi-objective optimization benchmarks? (`target_ids`) #28

Open sgbaird opened 1 year ago

sgbaird commented 1 year ago

I'm noticing the following convention for Dataset:

https://github.com/aspuru-guzik-group/olympus/blob/bbeb991c4c1929474a0b1a4435a5eed566d30542/src/olympus/datasets/dataset.py#L42

(i.e., plural target_ids)

When I pass a list with multiple entries to `target_ids` and then train the emulator, I get:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[44], line 1
----> 1 emulator.train()

File c:\Users\sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\emulators\emulator.py:354, in Emulator.train(self, plot, retrain)
    347 # Train
    348 Logger.log(
    349     ">>> Training model on {0:.0%} of the dataset, testing on {1:.0%}...".format(
    350         (1 - self.dataset.test_frac), self.dataset.test_frac
    351     ),
    352     "INFO",
    353 )
--> 354 mdl_train_r2, mdl_test_r2, mdl_train_rmsd, mdl_test_rmsd = self.model.train(
    355     train_features=train_features_scaled,
    356     train_targets=train_targets_scaled,
    357     valid_features=test_features_scaled,
    358     valid_targets=test_targets_scaled,
    359     model_path=model_path,
    360     plot=plot,
    361 )
    363 # write file to indicate training is complete and add R2 in there
    364 with open(f"{model_path}/training_completed.info", "w") as content:

File c:\Users\sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\models\wrapper_tensorflow_model\wrapper_tensorflow_model.py:159, in WrapperTensorflowModel.train(self, train_features, train_targets, valid_features, valid_targets, model_path, plot)
    154 losses.append(loss)
    156 if epoch % self.pred_int == 0:
    157 
    158     # make a prediction on the validation set
--> 159     valid_pred = self.predict(
    160         features=valid_features[valid_indices], num_samples=10
    161     )
    162     valid_r2 = r2_score(valid_targets[valid_indices], valid_pred)
    163     valid_rmsd = np.sqrt(
    164         mean_squared_error(valid_targets[valid_indices], valid_pred)
    165     )

File c:\Users\sterg\Miniconda3\envs\sdl-demo\lib\site-packages\olympus\models\wrapper_tensorflow_model\wrapper_tensorflow_model.py:282, in WrapperTensorflowModel.predict(self, features, num_samples)
    278     for _ in range(num_samples):
    279         predic = self.sess.run(
    280             self.y_pred, feed_dict={self.tf_x: X_test_batch}
    281         )
--> 282         pred[_, start:stop] = predic[:size]
    284 pred = np.mean(pred, axis=0)
    285 return pred

ValueError: could not broadcast input array from shape (50,8) into shape (50,1)
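
The broadcast failure above can be reproduced outside Olympus with plain NumPy. This is a minimal sketch, not the library's code: it assumes the prediction buffer in `predict` was allocated with a single target column while the network returned 8 columns per row. The traceback does not show how `pred` is allocated, so that part is a guess, and the helper `fill_and_average` and the variable names are hypothetical.

```python
import numpy as np

# Sizes taken from the error message: 50 rows per batch, 8 predicted columns.
num_samples, batch_size, n_targets = 10, 50, 8


def fill_and_average(pred_buffer, batch_pred):
    """Copy one batch of predictions into every sample slot, then average.

    Mirrors the pattern `pred[_, start:stop] = predic[:size]` from the
    traceback; hypothetical helper, not part of Olympus.
    """
    for i in range(pred_buffer.shape[0]):
        pred_buffer[i, 0:batch_pred.shape[0]] = batch_pred
    return pred_buffer.mean(axis=0)


batch_pred = np.ones((batch_size, n_targets))

# Assumed failing case: buffer sized for a single target column.
try:
    fill_and_average(np.empty((num_samples, batch_size, 1)), batch_pred)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Buffer whose last axis matches the number of predicted columns.
mean_pred = fill_and_average(
    np.empty((num_samples, batch_size, n_targets)), batch_pred
)

print(error_message)
print(mean_pred.shape)
```

If the assumption holds, the mismatch is between the width of the model output and the width of the buffer it is copied into, so sizing the buffer from the number of targets is one place to look.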

sgbaird commented 1 year ago

Just noticed the new manuscript. I'm guessing the changes it describes will be incorporated here soon: https://chemrxiv.org/engage/chemrxiv/article-details/6464ae0afb40f6b3eebaab70