I followed the tutorial to implement the warmup_scheduler, but the learning rate returned by get_last_lr() of torch.optim.lr_scheduler.MultiStepLR is the same as the initial learning rate. How can I get the learning rate that is actually in effect after the warmup process?
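For reference, here is a minimal sketch of how I would expect to read the rate. It does not use the tutorial's warmup_scheduler. As an assumption, it swaps in a single torch.optim.lr_scheduler.LambdaLR that does a linear warmup followed by step decay, and it reads the value from optimizer.param_groups, which holds the rate the optimizer will actually use on its next step. The constants (WARMUP_EPOCHS, MILESTONES, GAMMA, BASE_LR) and the helper lr_factor are hypothetical names chosen for illustration.

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

# Hypothetical settings, chosen only for illustration.
WARMUP_EPOCHS = 5
MILESTONES = (10, 20)
GAMMA = 0.1
BASE_LR = 0.1


def lr_factor(epoch):
    """Multiplier applied to BASE_LR at a given epoch (hypothetical helper)."""
    if epoch < WARMUP_EPOCHS:
        # Linear warmup: ramps from 1/WARMUP_EPOCHS up to 1.0.
        return (epoch + 1) / WARMUP_EPOCHS
    # Step decay: multiply by GAMMA once for every milestone already reached.
    passed = sum(epoch >= m for m in MILESTONES)
    return GAMMA ** passed


model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=BASE_LR)
scheduler = LambdaLR(optimizer, lr_lambda=lr_factor)

history = []
for epoch in range(25):
    # The optimizer's param group holds the rate used for this epoch.
    history.append(optimizer.param_groups[0]["lr"])
    # ... forward pass, loss.backward() would go here ...
    optimizer.step()
    scheduler.step()

for epoch in (0, 4, 10, 20):
    print(epoch, history[epoch])
```

If the warmup wrapper from the tutorial writes its rate into the optimizer, reading optimizer.param_groups[0]["lr"] inside the training loop should show the warmed-up value even when the wrapped MultiStepLR still reports the initial one. I have not verified this against the tutorial's implementation.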