google-research / tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

Utilize `Hyperband`/`ASHA` scheduler? (Willing to PR) #27

Open · fzyzcjy opened this issue 1 year ago

fzyzcjy commented 1 year ago

Hi, thanks for the playbook! I have seen some articles showing how Hyperband or ASHA can be used to speed up hyperparameter search. In short:

> On a high level, ASHA terminates trials that are less promising and allocates more time and resources to more promising trials. As our optimization process becomes more efficient, we can afford to increase the search space by 5x, by adjusting the parameter `num_samples`. (src)

Thus, I wonder whether it would be a good idea to utilize such a scheduler (in addition to quasi-random search or Bayesian optimization)? IMHO the two are largely orthogonal, so we could use both ASHA and quasi-random search: quasi-random search proposes (quasi-)random hyperparameter points, while ASHA terminates the unpromising ones early. A minimal sketch of this combination is shown below.
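For illustration, here is a minimal sketch of this combination using Optuna, whose `QMCSampler` performs quasi-random (Sobol) sampling and whose `SuccessiveHalvingPruner` implements asynchronous successive halving (ASHA). The `train_one_epoch` function is a toy stand-in for a real training loop; everything else uses Optuna's actual API:

```python
import optuna


def train_one_epoch(lr: float, momentum: float, epoch: int) -> float:
    """Toy stand-in for a real training step: a synthetic loss that
    decays with epochs and depends on the hyperparameters."""
    return (lr - 1e-3) ** 2 + (momentum - 0.9) ** 2 + 1.0 / (epoch + 1)


def objective(trial: optuna.Trial) -> float:
    # Hyperparameter proposals come from the study's (quasi-random) sampler.
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    momentum = trial.suggest_float("momentum", 0.5, 0.99)

    loss = float("inf")
    for epoch in range(100):
        loss = train_one_epoch(lr, momentum, epoch)
        # Report intermediate results so the pruner can compare trials.
        trial.report(loss, epoch)
        if trial.should_prune():
            # ASHA-style early termination of an unpromising trial.
            raise optuna.TrialPruned()
    return loss


study = optuna.create_study(
    direction="minimize",
    sampler=optuna.samplers.QMCSampler(qmc_type="sobol"),  # quasi-random search
    pruner=optuna.pruners.SuccessiveHalvingPruner(),  # asynchronous successive halving (ASHA)
)
study.optimize(objective, n_trials=100)
```

The same pairing is possible in other frameworks (e.g. Ray Tune's `ASHAScheduler` combined with a quasi-random search algorithm); the key design point is that the sampler and the early-stopping scheduler are independent components.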

I am willing to contribute (e.g. making a PR)!

fzyzcjy commented 5 months ago

It has been more than a year, and most of the other issues have been closed after some discussion, so I suspect this one may simply have been forgotten. A small ping: @varungodbole @georgedahl

georgedahl commented 3 months ago

Yes, Hyperband can be a useful technique, and potentially ASHA as well.

If you want to send a PR, that would be welcome. Please keep the diff relatively small if you can. Perhaps a sentence or phrase could be added where we discuss quasi-random search, saying that techniques of this type can be used to quickly discard unpromising points from the search.

Note that we currently only link to the Hyperband paper briefly, in a footnote.

fzyzcjy commented 3 months ago

Thank you! I will do that later.