lessw2020 / Ranger-Deep-Learning-Optimizer

Ranger - a synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one codebase
Apache License 2.0

Add manual synchronization function #24

Open qbx2 opened 4 years ago

qbx2 commented 4 years ago

Hello. First of all, thank you for sharing the code and the experiment results. Reading the code, I noticed that the model uses the fast weights at inference time. According to the LookAhead paper, the fast weights (before synchronization) may perform worse than the slow weights. Since synchronization only happens every k steps, with probability (1 - 1/k) (80% when k = 5) we will validate/test on unsynchronized fast weights. Therefore, it would be better to synchronize manually before evaluation.
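To illustrate the request: below is a minimal, self-contained sketch of the LookAhead fast/slow-weight mechanics with the proposed manual sync. It is a toy example using plain Python floats and SGD-style updates, not the actual `ranger.py` API; the class and method names (`ToyLookahead`, `synchronize`) are hypothetical.

```python
class ToyLookahead:
    """Toy LookAhead-style optimizer (illustration only, not ranger.py)."""

    def __init__(self, params, k=5, alpha=0.5):
        self.fast = list(params)   # fast weights, updated every step
        self.slow = list(params)   # slow weights, updated every k steps
        self.k = k
        self.alpha = alpha
        self.step_count = 0

    def step(self, grads, lr=0.1):
        # Inner (fast) update; plain SGD stands in for RAdam here.
        self.fast = [w - lr * g for w, g in zip(self.fast, grads)]
        self.step_count += 1
        if self.step_count % self.k == 0:
            # Every k steps: slow weights move toward fast weights,
            # then fast weights are reset to the slow weights.
            self.slow = [s + self.alpha * (f - s)
                         for s, f in zip(self.slow, self.fast)]
            self.fast = list(self.slow)

    def synchronize(self):
        # The manual sync proposed in this issue: before evaluation,
        # copy the slow weights into the fast (model) weights so the
        # model is not evaluated mid-interval on unsynced fast weights.
        self.fast = list(self.slow)


if __name__ == "__main__":
    opt = ToyLookahead([1.0], k=5)
    for _ in range(2):          # stop mid-interval (2 of 5 steps)
        opt.step([0.1])
    print(opt.fast == opt.slow)  # fast weights have drifted
    opt.synchronize()
    print(opt.fast == opt.slow)  # back in sync for evaluation
```

In a real integration the sync would also need to save the fast weights so training can resume from them afterwards, otherwise the partial inner-loop progress is discarded.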

lessw2020 commented 4 years ago

Hi @qbx2 , Thanks for the feedback! I think adding a manual sync is a good idea, along with a manual clear weights (for restarting new training). I will try and add that later this week! Thanks Less

qbx2 commented 4 years ago

Another minor issue: there is a typo `N_sma_threshhold` in `ranger.py`. It should be `N_sma_threshold`.