Motivation
TPESampler significantly slows down for high-dimensional objectives, so I checked where the slowdown comes from.
According to my check, _get_internal_repr in TPESampler and the ndtr-related functions in _truncnorm.py were the main causes.
Since the Optuna design prefers each sampler to be stateless, I think it is hard to enhance _get_internal_repr, while it is possible to enhance ndtr.
In fact, I could solve the following problem about three times faster with an approximation algorithm for the standard normal distribution:
import time

import optuna


def objective(trial: optuna.Trial) -> float:
    if (trial.number + 1) % 50 == 0:
        print(trial.number + 1)
    return sum(trial.suggest_float(f"x{i}", -5, 5) ** 2 for i in range(10))


if __name__ == "__main__":
    optuna.logging.set_verbosity(optuna.logging.CRITICAL)
    sampler = optuna.samplers.TPESampler(seed=42)
    study = optuna.create_study(sampler=sampler)
    start = time.time()
    study.optimize(objective, n_trials=1000)
    print(time.time() - start, study.best_trial)
Description
By adding an option to use an approximation algorithm, we can speed up the TPESampler routine.
There is a paper that discusses the approximation quality of the standard normal distribution, c.f. "Approximating the Cumulative Distribution Function of the Normal Distribution".
I used the standard logistic function, i.e. Eq. (13) in the paper, due to its invertibility, which is necessary for ndtri_exp (however, we can also use different approximation methods for different functions).
Note that invertibility is supported by Eqs. (1), (4), (5), (7), (8), and (13).
For example, _ndtr is already vectorized, so we do not have to add an approximation algorithm for it, and we can use a better approximation such as Bryc (2001B) in Eq. (12) or Zelen and Severo (1964) in Eq. (2).
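To illustrate the idea (this is a sketch, not the actual patch): a logistic approximation of the standard normal CDF has a closed-form inverse, which is what makes it usable for ndtri_exp. The constant 1.702 below is the classic logistic fit to the normal CDF; the exact coefficient in Eq. (13) of the cited paper may differ. The log-space inverse assumes ndtri_exp takes a log-probability, as scipy.special.ndtri_exp does.

```python
import math

# Sketch only: 1.702 is the classic logistic fit to the normal CDF; the
# exact coefficient used in Eq. (13) of the cited paper may differ.
K = 1.702


def ndtr_logistic(x: float) -> float:
    """Approximate the standard normal CDF Phi(x) with a logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-K * x))


def ndtri_logistic(p: float) -> float:
    """Closed-form inverse of the logistic approximation: logit(p) / K."""
    return math.log(p / (1.0 - p)) / K


def ndtri_exp_logistic(log_p: float) -> float:
    """Inverse taking a log-probability (log_p < 0), as ndtri_exp does;
    computed in log space so very negative log_p does not underflow."""
    return (log_p - math.log1p(-math.exp(log_p))) / K


def ndtr_exact(x: float) -> float:
    """Reference Phi(x) via the error function, for measuring the error."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# The absolute error of the logistic fit stays below roughly 0.01 over the
# real line, and the round trip recovers x up to float precision.
max_err = max(
    abs(ndtr_logistic(i / 100) - ndtr_exact(i / 100)) for i in range(-500, 501)
)
```

Because the forward map is a plain sigmoid, both the CDF and its inverse are cheap elementwise operations that vectorize trivially, which is where the speedup over the exact ndtr comes from.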
Alternatives (optional)
Another option would be to use only the most recent K trials for the surrogate training.