google-research / tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

Clarification is needed in the chapter "How should Adam’s hyperparameters be tuned?" #71

Closed 21kc-caracol closed 2 months ago

21kc-caracol commented 2 months ago

[Screenshot of the playbook section "How should Adam's hyperparameters be tuned?"]

Please clarify: for a budget of 10-25 trials, should we:

  1. First tune the learning rate, then beta1.
  2. (Or) Create a search space for both parameters, then run experiments to find the best combination. An example would be appreciated.

My understanding is that it's option 1: first tune for the best learning rate, then fix that value and start tuning beta1.

laurahsisson commented 2 months ago

It's the second option. If you only have a limited number of trials, you can focus on just tuning the learning rate, but if you have more compute/time, you can optimize beta1 and the learning rate jointly.

If you change beta1 (or beta2, etc.), you'll need to retune the learning rate.
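To make the second option concrete, here is a minimal pure-Python sketch of a joint random search over learning rate and beta1. The ranges, the trial budget, and the dummy `run_experiment` objective are illustrative assumptions, not prescriptions from the playbook; in practice `run_experiment` would train the model with Adam and return a validation metric.

```python
import math
import random

random.seed(0)

def sample_trial():
    # Joint search space: sample both hyperparameters together in each
    # trial, rather than tuning the learning rate first and beta1 second.
    # These ranges are illustrative assumptions, not playbook values.
    log_lr = random.uniform(math.log(1e-5), math.log(1e-1))
    lr = math.exp(log_lr)                 # learning rate on a log scale
    beta1 = random.uniform(0.85, 0.99)    # linear scale is fine for beta1
    return {"learning_rate": lr, "beta1": beta1}

def run_experiment(hparams):
    # Placeholder objective to maximize; a real version would train with
    # Adam(lr=hparams["learning_rate"], betas=(hparams["beta1"], ...))
    # and return validation performance.
    return (-abs(math.log10(hparams["learning_rate"]) + 3.0)
            - abs(hparams["beta1"] - 0.9))

budget = 20  # e.g. somewhere in the 10-25 trial budget discussed above
trials = [sample_trial() for _ in range(budget)]
results = [(run_experiment(h), h) for h in trials]
best_score, best_hparams = max(results, key=lambda t: t[0])
print(best_hparams)
```

Because both hyperparameters are sampled in every trial, the search can capture their interaction (e.g. the best learning rate shifting when beta1 changes), which a sequential tune-then-fix procedure cannot.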