aspuru-guzik-group / olympus

Olympus: a benchmarking framework for noisy optimization and experiment planning
https://aspuru-guzik-group.github.io/olympus/
MIT License

How to use Olympus with a custom objective function? #16

Open sgbaird opened 2 years ago

sgbaird commented 2 years ago

I see from the docs that there are facilities for making custom Dataset-s, custom Planner-s, and custom Emulator-s, but what I'm going for would be more like a custom Surface. Does Olympus directly support optimizing an externally supplied (black box) objective function? If so, could you provide a MWE? Alan suggested I use Olympus for self-driving-lab-demo. See this Twitter thread and Alan's replies [1] [2] for context.

sgbaird commented 2 years ago

Tried out the example notebook and started playing around with the Planner class (function):

from olympus.planners import Planner
planner = Planner(kind='Gpyopt')
type(planner)
> olympus.planners.planner_gpyopt.wrapper_gpyopt.Gpyopt
dir(planner)
> ['RECEIVED_VALUES',
>  'SUBMITTED_PARAMS',
>  '__abstractmethods__',
>  '__class__',
>  '__contains__',
>  '__delattr__',
>  '__dict__',
>  '__dir__',
>  '__doc__',
>  '__eq__',
>  '__format__',
>  '__ge__',
>  '__getattr__',
>  '__getattribute__',
>  '__getitem__',
>  '__gt__',
>  '__hash__',
>  '__init__',
>  '__init_subclass__',
>  '__iter__',
>  '__le__',
>  '__lt__',
>  '__module__',
>  '__ne__',
>  '__new__',
>  '__reduce__',
>  '__reduce_ex__',
>  '__repr__',
>  '__setattr__',
>  '__setitem__',
>  '__sizeof__',
>  '__str__',
>  '__subclasshook__',
>  '__weakref__',
>  '_abc_impl',
>  '_ask',
>  '_get_bo_instance',
>  '_params',
>  '_project_into_domain',
>  '_set_param_space',
>  '_tell',
>  '_validate',
>  '_validate_paramvector',
>  '_values',
>  'acquisition_type',
>  'add',
>  'ask',
>  'attrs',
>  'batch_size',
>  'config',
>  'defaults',
>  'exact_eval',
>  'flip_measurements',
>  'from_dict',
>  'from_json',
>  'get',
>  'goal',
>  'indent',
>  'kind',
>  'max_prop_len',
>  'me',
>  'model_type',
>  'num_generated',
>  'optimize',
>  'param_space',
>  'props',
>  'recommend',
>  'reset',
>  'set_param_space',
>  'tell',
>  'to_dict',
>  'to_json',
>  'update']

I'm guessing the workflow will be something like planner.set_param_space(...) followed by planner.optimize(...), with the parameter space defined as described in custom emulators.

Related: custom planners

rileyhickman commented 1 year ago

Hello,

Yes, you can use an arbitrary objective function with the Olympus machinery. I suggest looking at this example, which runs a simulated cross-coupling reaction optimization with the help of the Summit package.

#!/usr/bin/env python

import os, sys
import pickle
import time
import numpy as np
import pandas as pd
import subprocess

import olympus
from olympus.campaigns import ParameterSpace
from olympus.objects import ParameterContinuous, ParameterCategorical
from olympus.campaigns import Campaign
from olympus.planners import Planner

from summit.benchmarks import MIT_case1
from summit.strategies import LHS
from summit.utils.dataset import DataSet

#---------------
# Configuration
#---------------

model_kind = 'Gpyopt'  # planner name; referenced as model_kind below

BUDGET = 20
NUM_RUNS = 20
GOAL = 'maximize'

#-----------------------
# build olympus objects
#-----------------------

param_space = ParameterSpace()

# add ligand
param_space.add(
    ParameterCategorical(
        name='cat_index',
        options=[str(i) for i in range(8)],
        descriptors=[None for i in range(8)],        # add descriptors later
    )
)
# add temperature
param_space.add(
    ParameterContinuous(
        name='temperature',
        low=30.,
        high=110.
    )
)
# add residence time
param_space.add(
    ParameterContinuous(
        name='t',
        low=60.,
        high=600.
    )
)
# add catalyst loading
# summit expects this to be in M (mol/L)
param_space.add(
    ParameterContinuous(
        name='conc_cat',
        low=0.835/1000,
        high=4.175/1000,
    )
)

#-------------------
# begin experiments
#-------------------

run_ix = 0

# loop through the independent runs
while run_ix < NUM_RUNS:

    print(f'\n\n STARTING SEED RUN {run_ix} on SUZUKI RXN CASE 1 using {model_kind}\n\n')

    # build campaign and set param space
    campaign = Campaign()
    campaign.set_param_space(param_space)

    # Olympus planner
    planner = Planner(kind=model_kind, goal=GOAL)

    # set planner param space
    planner.set_param_space(campaign.param_space)

    #-----------------
    # Begin experiment
    #-----------------

    iteration = 0

    while len(campaign.values) < BUDGET:
        print(f'\nITERATION : {iteration}\n')
        # instantiate summit object for evaluation
        exp_pt = MIT_case1(noise_level=1)
        samples = planner.recommend(campaign.observations)
        print(f'SAMPLES : {samples}')
        for sample in samples:
            # turn into dataframe which summit evaluator expects
            columns = ['conc_cat', 't', 'cat_index', 'temperature']
            values = {
                ('conc_cat', 'DATA') : sample['conc_cat'],
                ('t', 'DATA'): sample['t'],
                ('cat_index', 'DATA'): sample['cat_index'],
                ('temperature', 'DATA'): sample['temperature'],
            }
            conditions = DataSet([values], columns=columns)

            exp_pt.run_experiments(conditions)

            measurement = exp_pt.data['y'].values[0]

            campaign.add_observation(sample, measurement)

            to_disk = {
                'params': campaign.params,
                'values': campaign.values,
            }
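            # make sure the output directory exists, then close the file after writing
            os.makedirs('runs', exist_ok=True)
            with open(f'runs/run_{model_kind}_{run_ix}.pkl', 'wb') as f:
                pickle.dump(to_disk, f)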

        iteration += 1

    run_ix += 1

sgbaird commented 1 year ago

Thanks for sharing the great example. Do you mind tweaking the example to get it to run to completion? https://colab.research.google.com/drive/1K9jgu58AtKCG3zro2hMZH9TQ7Y8PNGZ5?usp=sharing (editor link)

After that, adapting it should be straightforward (if you don't have time for that, no worries, and I can work with the example as-is). As a note-to-self, the param space would look like this:

param_space = ParameterSpace()
for name in ["R", "G", "B"]:
    param_space.add(ParameterContinuous(name=name, low=0.0, high=89.0))

(89 since max brightness=255 on the NeoPixel is painful to look at).

And given the planner.recommend syntax (nice btw), evaluating the objective function should be really straightforward.
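
For the RGB case, a black-box objective could be as small as the sketch below. It assumes each recommended sample can be indexed by parameter name, as with `sample['temperature']` in the Summit example. `TARGET_RGB`, `read_sensor`, and `color_mismatch` are hypothetical names, and `read_sensor` only simulates the hardware by echoing the commanded color back.

```python
import math

# Hypothetical target color, inside the 0-89 brightness range noted above.
TARGET_RGB = {"R": 30.0, "G": 60.0, "B": 10.0}


def read_sensor(r, g, b):
    """Stand-in for the real hardware: echoes the commanded color back."""
    return {"R": r, "G": g, "B": b}


def color_mismatch(sample):
    """Euclidean distance between measured and target colors (lower is better)."""
    measured = read_sensor(sample["R"], sample["G"], sample["B"])
    return math.sqrt(sum((measured[c] - TARGET_RGB[c]) ** 2 for c in "RGB"))


print(color_mismatch({"R": 30.0, "G": 60.0, "B": 10.0}))  # exact match, prints 0.0
```

Since a smaller distance is better here, the planner goal would presumably be 'minimize' rather than the 'maximize' used in the Summit script.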

rileyhickman commented 1 year ago

You need to use the dev branch of the repo to access the categorical parameters. If you can't get that to work, I would just spin up a continuous-valued parameter example using this same ask-tell framework. It looks like you are dealing with a fully continuous parameter space anyway, so there should be no reason to need them.
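
A fully continuous ask-tell loop for the R, G, B space could follow the shape sketched below. To keep it runnable without Olympus installed, `RandomPlanner` and `SimpleCampaign` are hypothetical stand-ins written here, not Olympus classes; they only mirror the `recommend` / `add_observation` calls from the Summit script above. The objective and its target color are likewise illustrative assumptions.

```python
import random

BOUNDS = {"R": (0.0, 89.0), "G": (0.0, 89.0), "B": (0.0, 89.0)}
BUDGET = 20


class SimpleCampaign:
    """Hypothetical stand-in that stores (params, value) pairs."""

    def __init__(self):
        self.params = []
        self.values = []

    def add_observation(self, sample, measurement):
        self.params.append(dict(sample))
        self.values.append(measurement)


class RandomPlanner:
    """Hypothetical stand-in planner: uniform random search within the bounds."""

    def __init__(self, bounds, seed=0):
        self.bounds = bounds
        self.rng = random.Random(seed)

    def recommend(self, campaign):
        # A model-based planner would use past observations; this one ignores them.
        sample = {
            name: self.rng.uniform(low, high)
            for name, (low, high) in self.bounds.items()
        }
        return [sample]


def objective(sample, target=(30.0, 60.0, 10.0)):
    """Illustrative black box: squared distance to a made-up target color."""
    return sum((sample[name] - t) ** 2 for name, t in zip("RGB", target))


campaign = SimpleCampaign()
planner = RandomPlanner(BOUNDS)

while len(campaign.values) < BUDGET:
    for sample in planner.recommend(campaign):          # ask
        measurement = objective(sample)                 # evaluate the black box
        campaign.add_observation(sample, measurement)   # tell

best_value = min(campaign.values)
best_params = campaign.params[campaign.values.index(best_value)]
print(best_params, best_value)
```

With Olympus installed, the `Planner(kind=..., goal=...)`, `Campaign()`, and `planner.recommend(campaign.observations)` calls shown earlier in the thread would take the place of the stand-ins.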