Implement LR schedules in the PyTorch training code. Add MambaLayer and make it configurable through the parameter config files and available for HPO.
Also contains the following changes:
--load now takes the path to a saved checkpoint from which to start a new training. I.e., the training starts from epoch 1, with LR schedules starting from the beginning; the only thing loaded from the checkpoint is the pre-trained model weights.
--resume-training is a new command-line argument that takes the path to a training directory containing an unfinished training and attempts to restore it, continuing the training from the last saved checkpoint.
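The distinction between the two flags can be sketched with standard PyTorch checkpointing. This is a minimal illustration, not the project's actual implementation; the checkpoint key names ("model", "optimizer", "scheduler") are assumptions:

```python
# Sketch: --load restores only model weights (fresh LR schedule),
# while --resume-training also restores optimizer and scheduler state
# so the LR curve continues where it left off. Key names are assumed.
import io
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)

# Simulate a few training steps, then checkpoint the full training state.
for _ in range(5):
    optimizer.step()
    scheduler.step()
buf = io.BytesIO()
torch.save({
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
    "scheduler": scheduler.state_dict(),
}, buf)
buf.seek(0)
ckpt = torch.load(buf)

# --load: only the pre-trained weights; the schedule starts over at step 0.
fresh_model = nn.Linear(4, 2)
fresh_model.load_state_dict(ckpt["model"])
fresh_opt = torch.optim.SGD(fresh_model.parameters(), lr=0.1)
fresh_sched = torch.optim.lr_scheduler.CosineAnnealingLR(fresh_opt, T_max=10)
print(fresh_sched.last_epoch)  # 0: LR schedule restarts from the beginning

# --resume-training: restore optimizer and scheduler state as well.
optimizer.load_state_dict(ckpt["optimizer"])
scheduler.load_state_dict(ckpt["scheduler"])
print(scheduler.last_epoch)  # 5: LR schedule continues mid-curve
```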
Learning rate versus training step for a training that used the cosinedecay LR schedule, was interrupted halfway through, and was then continued.
Learning rate versus training step for a training that used the onecycle LR schedule, was interrupted halfway through, and was then continued.
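The two plotted schedules plausibly map to PyTorch's built-in schedulers. A hedged sketch of that mapping; the `make_scheduler` helper and the hyperparameters are assumptions, not the project's actual config interface:

```python
# Sketch: mapping the "cosinedecay" and "onecycle" config names to
# PyTorch's built-in LR schedulers. Helper name and hyperparameters
# are illustrative assumptions.
import torch
import torch.nn as nn

def make_scheduler(name, optimizer, total_steps):
    if name == "cosinedecay":
        # LR decays from the base LR toward 0 along a half cosine wave.
        return torch.optim.lr_scheduler.CosineAnnealingLR(
            optimizer, T_max=total_steps)
    if name == "onecycle":
        # LR warms up to max_lr, then anneals to a very small value.
        return torch.optim.lr_scheduler.OneCycleLR(
            optimizer, max_lr=0.1, total_steps=total_steps)
    raise ValueError(f"unknown LR schedule: {name}")

opt = torch.optim.SGD(nn.Linear(4, 2).parameters(), lr=0.1)
sched = make_scheduler("cosinedecay", opt, total_steps=100)
lrs = []
for _ in range(100):
    opt.step()
    sched.step()
    lrs.append(sched.get_last_lr()[0])
# Cosine decay: LR falls monotonically from the base LR toward ~0.

opt2 = torch.optim.SGD(nn.Linear(4, 2).parameters(), lr=0.1)
sched2 = make_scheduler("onecycle", opt2, total_steps=100)
lrs2 = []
for _ in range(100):
    opt2.step()
    sched2.step()
    lrs2.append(sched2.get_last_lr()[0])
# One-cycle: LR rises to max_lr, then anneals to near zero.
```

Both schedulers expose `state_dict()`/`load_state_dict()`, which is what makes the interrupted-and-continued LR curves in the plots line up after a resume.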