Open MohitLamba94 opened 3 years ago
Thank you for this wonderful benchmark.
Several experiments use wd=1.2e-6. Could you please give some guidelines or a rule of thumb for choosing the weight decay hyperparameter?
@MohitLamba94 Any update?
Sorry, I did not look into it any further.