Closed lscheinkman closed 3 years ago
This LR range test is based on `bert_sparse_100k_kd_lr_range_test` and adapted for DeepSpeed training. With this test, the best `max_lr` value found is 0.0017.
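For readers unfamiliar with the technique, here is a minimal sketch of how an LR range test typically selects `max_lr`: the learning rate is ramped up over a short run, the loss is recorded at each step, and a rate near the lowest loss is chosen. This is not the code from this PR. The helper names `lr_range_schedule` and `pick_max_lr`, the linear ramp, and the "lowest loss" selection rule are all assumptions made for illustration, and the numbers in the usage note are toy values.

```python
def lr_range_schedule(min_lr, max_lr, num_steps):
    """Return a linearly increasing list of learning rates (hypothetical helper)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    step = (max_lr - min_lr) / (num_steps - 1)
    return [min_lr + i * step for i in range(num_steps)]


def pick_max_lr(lrs, losses):
    """Return the learning rate at which the recorded loss was lowest.

    Assumption: real LR range tests often pick a value slightly below this
    point, or judge from a plotted curve; this rule is the simplest variant.
    """
    if len(lrs) != len(losses) or not lrs:
        raise ValueError("lrs and losses must be non-empty and the same length")
    best_index = min(range(len(losses)), key=losses.__getitem__)
    return lrs[best_index]
```

A possible usage, with made-up losses: `pick_max_lr(lr_range_schedule(0.0, 0.004, 5), [3.0, 2.5, 2.1, 2.8, 9.0])` selects the third learning rate in the ramp, since the loss starts rising after it.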