secondmind-labs / trieste

A Bayesian optimization toolbox built on TensorFlow
Apache License 2.0

OOM error with more than 4 objectives #735

Closed · ducvinh-nguyen closed this issue 1 year ago

ducvinh-nguyen commented 1 year ago

When trying to do Bayesian optimization with 7 inputs and 5 objectives using ExpectedHypervolumeImprovement, I encounter an out-of-memory (OOM) error:

ResourceExhaustedError: Graph execution error:
Detected at node 'GatherV2' defined at (most recent call last):
...
OOM when allocating tensor with shape[7000,1616,32,5,5] and type double on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu
     [[{{node GatherV2}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
 [Op:__inference___call___2155695]

I have 32 GB of RAM, which I think is more than enough for this problem. Is there a way I can overcome this, or is it a limitation of the implementation? Thank you.
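
For reference, the size of the allocation the traceback is asking for can be estimated directly from the reported shape (this is plain arithmetic on the error message, assuming 8 bytes per double element, not an inspection of trieste internals):

import math

# Tensor reported in the OOM message: shape [7000, 1616, 32, 5, 5], dtype double.
shape = (7000, 1616, 32, 5, 5)
n_elements = math.prod(shape)
print(f"{n_elements * 8 / 1e9:.1f} GB")  # ~72.4 GB for this one tensor alone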

Below is the code to reproduce the problem:

import numpy as np
import tensorflow as tf
import trieste
from trieste.space import Box, SearchSpace
from trieste.data import Dataset
from trieste.models.gpflow import build_gpr, GaussianProcessRegression
from trieste.models import TrainableModelStack
from trieste.acquisition.function import ExpectedHypervolumeImprovement
from trieste.acquisition.rule import EfficientGlobalOptimization
from trieste.ask_tell_optimization import AskTellOptimizer
from scipy.stats import qmc

n_input = 7
n_objective = 5
nb_initial_points = 10

# Synthetic dataset
X = qmc.Halton(d=n_input, seed=0).random(n=nb_initial_points)
Y = np.random.default_rng(seed=0).random(size=(X.shape[0] , n_objective))
initial_data = Dataset(X, Y)

search_space = Box([0] * n_input, [1] * n_input)

def build_stacked_independent_objectives_model(
    data: Dataset, num_output: int, search_space: SearchSpace
) -> TrainableModelStack:
    gprs = []
    for idx in range(num_output):
        single_obj_data = Dataset(
            data.query_points, tf.gather(data.observations, [idx], axis=1)
        )
        gpr = build_gpr(single_obj_data, search_space, likelihood_variance=1e-7)
        gprs.append((GaussianProcessRegression(gpr), 1))

    return TrainableModelStack(*gprs)

model = build_stacked_independent_objectives_model(
    initial_data, n_objective, search_space
)

# Acquisition function
ehvi = ExpectedHypervolumeImprovement()
rule: EfficientGlobalOptimization = EfficientGlobalOptimization(builder=ehvi)

ask_tell = AskTellOptimizer(search_space, initial_data, model, acquisition_rule=rule)

next_query = ask_tell.ask().numpy()
print(next_query)
ducvinh-nguyen commented 1 year ago

"Hypervolume improvement is notorious for scaling very badly with number of objectives, its really meant to be used only with 2 objectives, or at most 3. Hence, its not really solvable, its a fundamental limitation of the algorithm."