imbue-ai / carbs

Cost-aware hyperparameter tuning algorithm
MIT License

misc comments #3

Closed ad8e closed 3 months ago

ad8e commented 3 months ago

I like that I can specify a single center point rather than specifying bounds. The user often has no good idea of the bounds, but a center guess is far more reasonable. I can encode a scale guess by multiplying the input space by a scale factor; maybe that's helpful? (For example, I know that LR has a tighter scale than init, and per-layer LRs are less impactful than global LRs.) Something like the sketch below is what I have in mind.
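Rough sketch of what I mean, going off my reading of the README; the exact names (`Param`, `LogSpace`, `search_center`, `scale`, `better_direction_sign`) and the idea that `scale` tightens or widens the search around the center are my assumptions, not a confirmed description of the API:

```python
# Hedged sketch: per-parameter center guess plus a per-parameter scale guess.
# Names and keyword arguments are assumptions based on the README example.
from carbs import CARBS, CARBSParams, LogSpace, Param

param_spaces = [
    # LR: I trust the center, so a tight scale
    Param(name="learning_rate", space=LogSpace(scale=0.3), search_center=3e-4),
    # init std: less sure, so a wider scale
    Param(name="init_std", space=LogSpace(scale=1.0), search_center=0.02),
]

carbs = CARBS(CARBSParams(better_direction_sign=-1), param_spaces)
```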

The readme could link your paper: https://arxiv.org/abs/2306.08055

In the readme, notebooks/carbs_demo.sync.ipynb probably should be notebooks/carbs_demo.ipynb.

I'm not sure how to use is_suggestion_remembered. In my current setup, I pickle my hyperoptimizer to a file and load+resave it whenever I make a suggestion or observation. This is necessary because it uses a quasirandom sequence, so parallel suggestions need to increment the optimizer state. Should this flag be used in a similar way? The consequence of "whether to store the suggestion, which be fit with a surrogate value until it is observed" is unclear. It seems odd to fit an arbitrary value as the source of truth that controls future sampling locations. And what if my run crashes and a wrong surrogate value is left inside? My current pattern looks roughly like the snippet below.
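For concreteness, this is the load/resave pattern I described; the file path and helper functions are my own, not part of carbs:

```python
# Hedged sketch of my checkpointing pattern: every suggest/observe round-trips
# through disk so parallel workers always advance from the latest optimizer state.
import pickle

STATE_PATH = "hyperopt_state.pkl"  # hypothetical path, my own convention

def load_optimizer():
    with open(STATE_PATH, "rb") as f:
        return pickle.load(f)

def save_optimizer(opt):
    with open(STATE_PATH, "wb") as f:
        pickle.dump(opt, f)

opt = load_optimizer()
suggestion = opt.suggest()  # mutates the quasirandom state
save_optimizer(opt)         # persist immediately so other workers see the increment
```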

You appear to be sampling from torch's Normal distribution. A low-discrepancy sequence could make the search faster.

- Vizier uses Halton: https://github.com/google/vizier/blob/main/vizier/_src/algorithms/designers/quasi_random.py
- HEBO uses Sobol: https://github.com/huawei-noah/HEBO/blob/bd1461c752883ca0fe79daa4cf107d9db3160ba2/HEBO/hebo/optimizers/hebo.py#L49
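For illustration, one way to get quasirandom "normal" draws directly in torch is to push scrambled Sobol points through the inverse normal CDF; this is just a sketch of the idea, not a claim about how carbs samples today:

```python
import torch

def sobol_normal(n_samples: int, dim: int, seed: int = 0) -> torch.Tensor:
    """Approximately-normal low-discrepancy samples: scrambled Sobol points
    in [0, 1)^dim mapped through the standard normal inverse CDF."""
    engine = torch.quasirandom.SobolEngine(dimension=dim, scramble=True, seed=seed)
    u = engine.draw(n_samples)            # uniform points in [0, 1)
    u = u.clamp(1e-6, 1 - 1e-6)           # keep icdf away from +/- infinity
    return torch.distributions.Normal(0.0, 1.0).icdf(u)

samples = sobol_normal(n_samples=256, dim=4)
```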

abefetterman commented 3 months ago

Thanks! I made some updates to the README.