CN-Wenbo closed this issue 9 months ago
As stated in Section 3, delta is the hyperparameter that controls the annealing to avoid remaining gradient bias. The comparison is based on the total saving ratio (with annealing taken into account).
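The role of delta described above can be sketched as a simple epoch-fraction schedule: prune during the first `delta` fraction of training, then anneal back to the full dataset. This is a minimal illustrative sketch based on the discussion here, not the repository's implementation; the function name `use_pruned_dataset` and its arguments are hypothetical.

```python
def use_pruned_dataset(epoch: int, num_epochs: int, delta: float = 0.875) -> bool:
    """Return True if soft pruning is active for this epoch.

    Hypothetical sketch: during the first `delta` fraction of training,
    low-loss samples are pruned; in the remaining (1 - delta) fraction
    of epochs, training anneals back to the full dataset so that the
    residual bias from pruning is removed before convergence.
    """
    return epoch < int(num_epochs * delta)

# With delta = 0.875 and 100 total epochs, pruning is active for
# epochs 0..86 and the full dataset is used for the last 13 epochs.
pruned_epochs = sum(use_pruned_dataset(e, 100) for e in range(100))
print(pruned_epochs)  # → 87
```

Under this reading, a larger delta saves more compute but leaves fewer full-dataset epochs for the annealing to correct the bias.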
In most cases, InfoBatch does not require tuning this parameter. This is the only case that uses a different lr, as stated, and we suspect it is because ResNet-18's loss surface is not as smooth as ResNet-50's.
We noticed that delta is set to 0.875 in the example code. Is this hyperparameter used in the paper's experiments? If so, the full dataset would be used in the final epochs. Besides, Appendix A says "due to reduced steps, InfoBatch uses a learning rate of 0.05 in this setting"; it seems unfair to increase the lr.