This pull request separates out the learning rate scheduler logic and implements a modified "newbob" scheduler and a constant rate scheduler. The original scheduler in tf_train.py is now called "halvsies" and remains the default.
New command line options are -lrscheduler [halvsies,newbob,constantlr] and -lr_spec [string]. The lr_spec is a human-readable set of options, separated by commas, whose contents depend on the scheduler.
Note that some lr_spec options duplicate existing command line options; values given in lr_spec override the command line.
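For illustration, parsing an lr_spec string could look roughly like this (a minimal sketch, not the PR's actual parser; the key=value grammar is an assumption, though half_after and half_rate are options described below):

```python
def parse_lr_spec(spec):
    """Parse a comma-separated key=value lr_spec string into a dict.

    Hypothetical sketch: numeric values are converted to float,
    everything else is kept as a string. The real option names
    depend on the chosen scheduler.
    """
    opts = {}
    for item in spec.split(","):
        if not item:
            continue
        key, _, value = item.partition("=")
        try:
            opts[key.strip()] = float(value)
        except ValueError:
            opts[key.strip()] = value.strip()
    return opts

# parse_lr_spec("half_after=4,half_rate=0.5")
#   -> {"half_after": 4.0, "half_rate": 0.5}
```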
newbob is modified from the original algorithm to allow for a fixed burn-in period (controlled by half_after), variable thresholds (the original algorithm sets these to 0.5), and a configurable halving rate (the default is new_lr = lr * 0.5, but the 0.5 can be adjusted via half_rate).
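The per-epoch decision can be sketched as follows (hypothetical code, not the PR's implementation; the improvement/threshold signature is an assumption):

```python
def newbob_step(lr, epoch, improvement, half_after, half_rate=0.5,
                threshold=0.5):
    """Return the learning rate for the next epoch (modified newbob).

    - Burn-in: for the first half_after epochs the lr is untouched.
    - Afterwards, if validation improvement falls below the
      (configurable) threshold, lr is multiplied by half_rate;
      half_rate=0.5 reproduces the original halving behaviour.
    """
    if epoch < half_after:          # fixed burn-in period
        return lr
    if improvement < threshold:     # variable threshold (original: 0.5)
        return lr * half_rate       # default new_lr = lr * 0.5
    return lr
```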
Restarts: halvsies works as previously, and constantlr works as you would expect. For newbob, the algorithm tries to guess which phase (burn-in, constant, or ramping) the run is in and resumes from that point. NB: it will respect lr_rate on the command line, so if restarting in ramping mode make sure lr_rate is set correctly.
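One plausible heuristic for that phase guess (a sketch of the idea only, not the actual restart code; the signals used here are assumptions):

```python
def guess_phase(epoch, current_lr, initial_lr, half_after):
    """Guess which newbob phase a restarted run is in.

    Hypothetical heuristic: before half_after epochs the run must
    still be in burn-in; after that, an lr equal to the initial lr
    suggests the constant phase, while a reduced lr suggests ramping.
    Since the restart respects lr_rate from the command line, a run
    resumed in the ramping phase needs lr_rate set to the last
    ramped value.
    """
    if epoch < half_after:
        return "burn-in"
    if current_lr >= initial_lr:
        return "constant"
    return "ramping"
```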