xopt-org / Xopt

Flexible high-level optimization in Python
Apache License 2.0

Inconsistency in the `Generator.generate()` returns #166

Open cr-xu opened 11 months ago

cr-xu commented 11 months ago

Currently, the values returned from generate(n_candidates) follow different conventions for different Generators.

Specifically, some return List[Dict[str, float]] and some return List[Dict[str, np.ndarray]], which is not documented explicitly in the type annotation.

Although this doesn't affect the performance when people use only X.run(), it becomes quite confusing when someone is trying to use Generator separately.

Since most (if not all) of the Generators only work for the case where each input parameter has one scalar value, would it be safe to use the consistent style of List[Dict[str, float]]?

For example:

from xopt import Xopt, Evaluator
from xopt.generators import UpperConfidenceBoundGenerator, ExtremumSeekingGenerator, CNSGAGenerator, RCDSGenerator
from xopt.resources.test_functions.rosenbrock import evaluate_rosenbrock, make_rosenbrock_vocs

vocs = make_rosenbrock_vocs(2)
evaluator = Evaluator(function=evaluate_rosenbrock)
ucb_generator = UpperConfidenceBoundGenerator(vocs=vocs)
x_ucb = Xopt(vocs=vocs, generator=ucb_generator, evaluator=evaluator)
x_ucb.random_evaluate(1)
print("BO Generator", x_ucb.generator.generate(1))

es_generator = ExtremumSeekingGenerator(vocs=vocs)
x_es = Xopt(vocs=vocs, generator=es_generator, evaluator=evaluator)
x_es.random_evaluate(1)
print("ES Generator", x_es.generator.generate(1))

cnsga_generator = CNSGAGenerator(vocs=vocs)
x_cnsga = Xopt(vocs=vocs, generator=cnsga_generator, evaluator=evaluator)
x_cnsga.random_evaluate(1)
print("CNSGA Generator", x_cnsga.generator.generate(1))

rcds_generator = RCDSGenerator(vocs=vocs)
x_rcds = Xopt(vocs=vocs, generator=rcds_generator, evaluator=evaluator)
x_rcds.random_evaluate(1)
print("RCDS Generator", x_rcds.generator.generate(1))

Output:

BO Generator [{'x0': -2.0, 'x1': -2.0}]
ES Generator [{'x0': array([-0.74411679]), 'x1': array([-1.01830358])}]
CNSGA Generator [{'x0': -1.1693337549293048, 'x1': 1.7901024047699003}]
RCDS Generator [{'x0': array([0.]), 'x1': array([0.])}]

roussel-ryan commented 11 months ago

Thanks for bringing this up, @cr-xu. Actually, most generators can generate more than one sample when generator.generate(n) is called with n > 1 (I think ES and RCDS are among the only generators that cannot generate more than one). I think we can standardize this such that the type hint is a numpy array instead. This, however, might conflict with our casual notion of what the evaluate function should take as an input, i.e. dict[str, float], since multiple points to be evaluated should have the form List[dict[str, float]]. @ChristopherMayes, what are your thoughts?

Note that this is why I favor pandas DataFrame conventions instead of dicts, and lists of dicts. We currently do this internally inside of xopt, but it might be useful to have generators return DataFrames instead.
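To make the DataFrame convention concrete, here is a minimal sketch of how a list of candidate dicts maps to a pandas DataFrame and back. It uses only pandas itself; the candidate values are made up for illustration, and nothing here reflects how Xopt handles this internally.

```python
import pandas as pd

# Illustrative candidates in the List[Dict[str, float]] convention
# (values are made up, not produced by any Xopt generator).
candidates = [{"x0": -2.0, "x1": -2.0}, {"x0": 0.5, "x1": 1.25}]

# One row per candidate, one column per input variable.
frame = pd.DataFrame(candidates)

# Converting back yields one dict per row, which is the shape an
# evaluate function taking dict[str, float] would expect per point.
records = frame.to_dict(orient="records")
for inputs in records:
    print(inputs)
```

With this layout a single candidate is simply a one-row frame, so the same return type would cover both generate(1) and generate(n).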

cr-xu commented 11 months ago

> Thanks for bringing this up, @cr-xu. Actually, most generators can generate more than one sample when generator.generate(n) is called with n > 1 (I think ES and RCDS are among the only generators that cannot generate more than one).

This I understand. I meant that a single input parameter might not be scalar-valued (a list, a categorical value, etc.), but this case is not considered by Xopt anyway.

An easy fix would be to convert the values returned by RCDS and ES with float(), because they are currently np arrays.
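As a rough sketch of that conversion, the helper below normalizes a list of candidates so every value is a plain Python float. The name to_scalar_candidates is hypothetical and not part of Xopt. It goes through np.asarray(...).item() rather than calling float() on the array directly, since recent NumPy versions deprecate converting arrays with ndim > 0 to scalars.

```python
from typing import Any, Dict, List

import numpy as np


def to_scalar_candidates(candidates: List[Dict[str, Any]]) -> List[Dict[str, float]]:
    """Hypothetical helper (not an Xopt API): unwrap size-1 arrays to floats."""
    normalized = []
    for candidate in candidates:
        point = {}
        for name, value in candidate.items():
            arr = np.asarray(value, dtype=float)
            if arr.size != 1:
                # Refuse to silently drop data from multi-element values.
                raise ValueError(f"{name!r} holds {arr.size} values, expected 1")
            point[name] = float(arr.item())
        normalized.append(point)
    return normalized


# Values copied from the ES Generator output above.
raw = [{"x0": np.array([-0.74411679]), "x1": np.array([-1.01830358])}]
clean = to_scalar_candidates(raw)
print(clean)
```

Values that are already floats pass through unchanged, so the same helper could be applied to the output of any generator.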

In general, I like the idea of generators returning pd.DataFrame.