
Model-based Asynchronous Hyperparameter and Neural Architecture Search #63


nabenabe0928 commented 1 year ago

Model-based Asynchronous Hyperparameter and Neural Architecture Search

A-BOHB paper

Main points

Fantasizing

Fantasizing marginalizes out the unobserved objective values of pending evaluations. Let the set of completed observations be:

$$\{(x_n, y_n)\}_{n=1}^{N}$$

and the set of pending evaluations, whose objective values $y_m$ are not yet observed, be:

$$\{(x_m, y_m)\}_{m=1}^{M}.$$

Furthermore, let $u(x \mid \cdot)$ be the utility function underlying the acquisition function. Fantasizing then computes the following marginalized acquisition function:

$$\int u\left(x \,\middle|\, \{(x_n, y_n)\}_{n=1}^{N}, \{(x_m, y_m)\}_{m=1}^{M}\right) p\left(\{y_m\}_{m=1}^{M} \,\middle|\, \{x_m\}_{m=1}^{M}, \{(x_n, y_n)\}_{n=1}^{N}\right) dy_1 \, dy_2 \cdots dy_M$$
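A minimal Monte Carlo sketch of this marginalization, assuming a scikit-learn GP surrogate and Expected Improvement as the utility $u$ (the function name `fantasized_ei` and the choice of 32 fantasy samples are illustrative assumptions, not the paper's implementation):

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def fantasized_ei(x, X_obs, y_obs, X_pend, n_fantasies=32, seed=0):
    """Monte Carlo estimate of the fantasized Expected Improvement (minimization).

    The integral over the pending outcomes y_m is approximated by sampling
    fantasies from the GP posterior conditioned on the observed data, then
    averaging EI computed on each augmented dataset.
    """
    gp = GaussianProcessRegressor(normalize_y=True).fit(X_obs, y_obs)
    # Draw joint posterior samples of the pending outcomes: shape (M, n_fantasies).
    y_fant = gp.sample_y(X_pend, n_samples=n_fantasies, random_state=seed)
    acq = []
    for k in range(n_fantasies):
        # Condition a fresh GP on observed + fantasized data (one fit per fantasy).
        X_aug = np.vstack([X_obs, X_pend])
        y_aug = np.concatenate([y_obs, y_fant[:, k]])
        gp_k = GaussianProcessRegressor(normalize_y=True).fit(X_aug, y_aug)
        mu, sigma = gp_k.predict(np.atleast_2d(x), return_std=True)
        best = y_aug.min()
        z = (best - mu) / np.maximum(sigma, 1e-12)
        acq.append(((best - mu) * norm.cdf(z) + sigma * norm.pdf(z)).item())
    return float(np.mean(acq))
```

Refitting one GP per fantasy keeps the correspondence with the integral explicit; in practice the posterior would be updated incrementally (e.g., rank-one Cholesky updates) instead of refit from scratch.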
nabenabe0928 commented 1 year ago

Experiments

Performance over time with 4 and 8 workers

  1. A-BOHB stopping
  2. A-BOHB promotion
  3. A-HB stopping
  4. A-HB promotion

Targets are several benchmarks built by the authors, plus NAS-Bench-201 (NB201).

Stopping is based on median pruning (the median stopping rule) and promotion is based on ASHA. Apparently, stopping performs better; after that comparison, most experiments use 8 workers.
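For reference, a minimal sketch of the two scheduling rules (my paraphrase, not the paper's code; `eta=3`, minimization, and all names are assumptions for illustration):

```python
import numpy as np

def median_stopping(curve, peer_curves):
    """Stop a trial if its latest value is worse than the median of its
    peers' values at the same step (minimization)."""
    step = len(curve) - 1
    peers = [c[step] for c in peer_curves if len(c) > step]
    return bool(peers) and curve[-1] > np.median(peers)

def asha_promotable(rung_results, trial_id, eta=3):
    """Promote a trial to the next rung if it ranks within the top 1/eta
    of all results recorded at its current rung (minimization)."""
    scores = sorted(rung_results.values())
    k = len(scores) // eta  # number of trials eligible for promotion
    return k > 0 and rung_results[trial_id] <= scores[k - 1]
```

The key difference: stopping terminates bad trials against a running median, while promotion lets trials advance asynchronously as soon as they rank in the top $1/\eta$ of their rung.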

Scalability test

The number of workers is varied over $n \in \{1, 2, 4, 8, 16\}$.