This pull request adds two new classes: `LinearWarmupCosineAnnealingLR`, a learning-rate scheduler, and `AdamW`, an optimizer. Both are used for training and optimization in the codebase. The `LinearWarmupCosineAnnealingLR` class implements a linear warmup phase followed by cosine annealing of the learning rate.
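The schedule described above can be sketched as a plain function of the epoch; the parameter names (`warmup_epochs`, `max_epochs`, `warmup_start_lr`, `eta_min`) are assumptions for illustration and may differ from the actual class's signature:

```python
import math

def warmup_cosine_lr(epoch, base_lr, warmup_epochs, max_epochs,
                     warmup_start_lr=0.0, eta_min=0.0):
    """Learning rate at a given epoch: linear warmup, then cosine annealing."""
    if epoch < warmup_epochs:
        # Linear ramp from warmup_start_lr up to base_lr over warmup_epochs
        return warmup_start_lr + (base_lr - warmup_start_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to eta_min over the remaining epochs
    progress = (epoch - warmup_epochs) / (max_epochs - warmup_epochs)
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * progress))
```

For example, with `base_lr=0.1`, `warmup_epochs=10`, and `max_epochs=100`, the rate climbs linearly to 0.1 by epoch 10 and then decays along a cosine curve toward `eta_min` at epoch 100.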
These additions improve the training process and allow for better optimization of the model.
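The other addition, `AdamW`, applies weight decay directly to the parameters rather than folding it into the gradient. A minimal single-parameter sketch of one decoupled-weight-decay Adam step (the function name and signature here are hypothetical, not the class's actual API):

```python
import math

def adamw_step(p, grad, m, v, t, lr=1e-3, betas=(0.9, 0.999),
               eps=1e-8, weight_decay=0.01):
    """One AdamW update on a scalar parameter p; returns (new_p, m, v)."""
    # Decoupled weight decay: shrink the parameter directly, not via the gradient
    p = p - lr * weight_decay * p
    # Standard Adam moment estimates with bias correction
    m = betas[0] * m + (1 - betas[0]) * grad
    v = betas[1] * v + (1 - betas[1]) * grad * grad
    m_hat = m / (1 - betas[0] ** t)
    v_hat = v / (1 - betas[1] ** t)
    p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
    return p, m, v
```

Decoupling the decay from the adaptive gradient scaling is what distinguishes AdamW from plain Adam with L2 regularization.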
Closes #63