bayesian-optimization / BayesianOptimization

A Python implementation of global optimization with Gaussian processes.
https://bayesian-optimization.github.io/BayesianOptimization/index.html
MIT License
7.95k stars · 1.55k forks

optimizer.max['target'] returns the smallest negative value? #514

Closed Gabriel-p closed 3 months ago

Gabriel-p commented 3 months ago

Describe the bug
Using optimizer.max['target'] with the code below results in:

cd: 350993.450
fd: 10451.569
rc: 0.511
rt: 0.605
Lkl: -1592252620.6647666

where Lkl is the smallest negative value found, as far as I can tell.

To Reproduce
A concise, self-contained code snippet that reproduces the bug:

import numpy as np
from bayes_opt import BayesianOptimization

x = np.array([
    0.00869858, 0.01304788, 0.02174646, 0.03044504, 0.03914363,
    0.04784221, 0.05654079, 0.06523938, 0.07393796, 0.08263655,
    0.09133513, 0.10003371, 0.1087323 , 0.11743088, 0.12612946,
    0.13482805, 0.14352663, 0.15222522, 0.1609238 , 0.16962238,
    0.17832097, 0.18701955, 0.19571813, 0.20441672, 0.2131153 ,
    0.22181388, 0.23051247, 0.23921105, 0.24790964, 0.25660822,
    0.2653068 , 0.27400539, 0.28270397, 0.29140255, 0.30010114,
    0.30879972, 0.31749831, 0.32619689, 0.33489547, 0.34359406,
    0.35229264, 0.36099122, 0.36968981, 0.37838839, 0.38708698,
    0.39578556, 0.40448414, 0.41318273, 0.42188131, 0.43057989
])
y = np.array([
    54688.54691276, 56090.81734642, 49640.37335158, 54688.54691276,
    50481.73561177, 59660.23299573, 54364.94604345, 54408.09282602,
    50729.19510007, 52695.84682282, 55089.19560809, 52859.49852102,
    49472.10089954, 43470.38344347, 40907.61334058, 41525.29864839,
    37478.86431783, 34255.46345085, 37406.51129791, 34948.89388507,
    33449.28010048, 29545.51192782, 29728.1331936 , 31327.3181988 ,
    26356.96060002, 28045.40867321, 28733.31492368, 23405.1683291 ,
    26052.70858327, 23672.22630383, 22620.23125773, 25574.74171866,
    25370.30815361, 23985.10323843, 22192.45381967, 22989.33499691,
    23396.7861397 , 20921.87487021, 20269.18172291, 22365.32590395,
    20722.44085298, 19614.89124674, 22271.35394637, 18471.28640201,
    21553.99947469, 20063.25389699, 19631.78607125, 21609.72541978,
    20773.84137289, 19674.27911469
])

rt_max = 2*x[-1]
fd0 = 19896.45379

def distance(cd, rc, rt, fd):
    if rc > rt:
        return -1e10
    kdens = cd * (
        (1. / np.sqrt(1. + (x / rc) ** 2)) - (1. / np.sqrt(1. + (rt / rc) ** 2))
    ) ** 2 + fd
    model = np.where(x < rt, kdens, fd)
    # Return negative sum of squared diffs
    return -np.sum((y - model) ** 2)

pbounds = {
    "cd": [fd0, 10 * max(y)],
    "rc": [x[0], rt_max],
    "rt": [x[0], rt_max],
    "fd": [fd0 * .1, max(y)],
}

optimizer = BayesianOptimization(
    f=distance,
    pbounds=pbounds,
    random_state=1,
)

optimizer.maximize(n_iter=50, init_points=100,)

model = {}
for k, v in optimizer.max['params'].items():
    print(f"{k}: {v:.3f}")
    model[k] = v
lkl = optimizer.max['target']
print(f"Lkl: {lkl}")

Expected behavior
The largest value found should be returned?


till-m commented 3 months ago

Hi @Gabriel-p,

that's the maximum value of the targets explored by the optimizer. What makes you think it wouldn't be?

Gabriel-p commented 3 months ago

Hi @till-m, that's the minimum value. The maximum value is close to zero unless I'm understanding something wrong...

till-m commented 3 months ago

Hi @Gabriel-p,

I think I figured out the problem -- you're saying it's close to zero because the target column in the output reads -1.592e+0? It turns out the exponent is getting cut off, presumably due to the - sign in front, which is apparently not accounted for in the number-formatting logic. This is definitely a bug and should be fixed. I have modified your code to increase the logger's cell size, so you can see the correct values.

import numpy as np
from bayes_opt import BayesianOptimization
from bayes_opt.logger import ScreenLogger
from bayes_opt.event import Events

x = np.array([
    0.00869858, 0.01304788, 0.02174646, 0.03044504, 0.03914363,
    0.04784221, 0.05654079, 0.06523938, 0.07393796, 0.08263655,
    0.09133513, 0.10003371, 0.1087323 , 0.11743088, 0.12612946,
    0.13482805, 0.14352663, 0.15222522, 0.1609238 , 0.16962238,
    0.17832097, 0.18701955, 0.19571813, 0.20441672, 0.2131153 ,
    0.22181388, 0.23051247, 0.23921105, 0.24790964, 0.25660822,
    0.2653068 , 0.27400539, 0.28270397, 0.29140255, 0.30010114,
    0.30879972, 0.31749831, 0.32619689, 0.33489547, 0.34359406,
    0.35229264, 0.36099122, 0.36968981, 0.37838839, 0.38708698,
    0.39578556, 0.40448414, 0.41318273, 0.42188131, 0.43057989
])
y = np.array([
    54688.54691276, 56090.81734642, 49640.37335158, 54688.54691276,
    50481.73561177, 59660.23299573, 54364.94604345, 54408.09282602,
    50729.19510007, 52695.84682282, 55089.19560809, 52859.49852102,
    49472.10089954, 43470.38344347, 40907.61334058, 41525.29864839,
    37478.86431783, 34255.46345085, 37406.51129791, 34948.89388507,
    33449.28010048, 29545.51192782, 29728.1331936 , 31327.3181988 ,
    26356.96060002, 28045.40867321, 28733.31492368, 23405.1683291 ,
    26052.70858327, 23672.22630383, 22620.23125773, 25574.74171866,
    25370.30815361, 23985.10323843, 22192.45381967, 22989.33499691,
    23396.7861397 , 20921.87487021, 20269.18172291, 22365.32590395,
    20722.44085298, 19614.89124674, 22271.35394637, 18471.28640201,
    21553.99947469, 20063.25389699, 19631.78607125, 21609.72541978,
    20773.84137289, 19674.27911469
])

rt_max = 2*x[-1]
fd0 = 19896.45379

def distance(cd, rc, rt, fd):
    if rc > rt:
        return -1e10
    kdens = cd * (
        (1. / np.sqrt(1. + (x / rc) ** 2)) - (1. / np.sqrt(1. + (rt / rc) ** 2))
    ) ** 2 + fd
    model = np.where(x < rt, kdens, fd)
    # Return negative sum of squared diffs
    return -np.sum((y - model) ** 2)

pbounds = {
    "cd": [fd0, 10 * max(y)],
    "rc": [x[0], rt_max],
    "rt": [x[0], rt_max],
    "fd": [fd0 * .1, max(y)],
}

optimizer = BayesianOptimization(
    f=distance,
    pbounds=pbounds,
    random_state=1,
)

logger = ScreenLogger()
logger._default_cell_size = 15

# the following needs to happen before any call to `.maximize`
# otherwise there will be two loggers associated with the optimizer.
for e in [Events.OPTIMIZATION_START, Events.OPTIMIZATION_STEP, Events.OPTIMIZATION_END]:
    optimizer.subscribe(e, logger)

optimizer.maximize(n_iter=1, init_points=20,) # modified these since the optimizer doesn't find anything after

model = {}
for k, v in optimizer.max['params'].items():
    print(f"{k}: {v:.3f}")
    model[k] = v
lkl = optimizer.max['target']
print(f"Lkl: {lkl}")
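The exponent truncation described above can be sketched with a plain format call. The 9-character cell width here is an assumption for illustration; the logger's actual width may differ:

```python
# Hypothetical illustration of the formatting bug: a fixed-width log cell
# clips the formatted string, and the leading minus sign pushes the
# exponent's last digit past the cell boundary.
val = -1592252620.6647666
cell_width = 9                 # assumed cell size, for illustration only

formatted = f"{val:.4}"        # '-1.592e+09' -- 10 characters
clipped = formatted[:cell_width]
print(clipped)                 # '-1.592e+0' -- reads deceptively close to zero

# Without the sign, the same value fits the cell intact:
print(formatted[1:][:cell_width])  # '1.592e+09'
```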

The reason I said it was the maximum value is that I checked the target function values registered to the target space. I didn't realize you were commenting based on the log output, hence my question about why you thought it wasn't the maximum.
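The apparent paradox -- the "maximum" being a huge negative number rather than something close to zero -- comes down to the fact that every registered target here is a large negative sum of squares, so the maximum is simply the least negative one. A sketch with made-up target values:

```python
# Hypothetical target values like those the optimizer registers:
# the -1e10 penalty for rc > rt, plus large negative sums of squares.
targets = [-1e10, -4.7e9, -1592252620.6647666, -2.3e9]

best = max(targets)  # the least negative value is the maximum
print(best)          # -1592252620.6647666

# With the real optimizer, the analogous cross-check would be
# (not run here; assumes `optimizer` from the script above):
#   assert optimizer.max["target"] == max(r["target"] for r in optimizer.res)
```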

Let me know if this helps.

Cheers, Till

Gabriel-p commented 3 months ago

Thank you Till, that solves it